Java and Clojure examples for reading the new WARC Common Crawl files

Originally published January 26, 2014

I just added a Clojure example to my Common Crawl repo. It assumes that you have already copied a crawl segment file to your laptop. In the next week I will add another Clojure example that pulls segment files from S3.

The repo also contains two Java examples: one reads local segment files and the other reads segment files from S3.
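To give a rough idea of what reading a local segment file involves, here is a minimal sketch that uses only the JDK. It is not the code from the repo: the class name `WarcHeaderScan` and its methods are hypothetical, and it assumes the usual WARC record layout (a `WARC/...` version line, `Name: value` header lines, a blank line, then `Content-Length` bytes of payload). It collects each record's headers and skips over the payload. The `main` method also assumes the file is gzip-compressed and that `GZIPInputStream` reads through all of the concatenated gzip members.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.GZIPInputStream;

// Hypothetical helper for illustration only; not taken from the repo.
class WarcHeaderScan {

    // Reads bytes up to the next '\n' and drops a trailing '\r'.
    // Returns null once the stream is exhausted.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            buf.write(b);
        }
        if (b == -1 && buf.size() == 0) {
            return null;
        }
        String line = new String(buf.toByteArray(), StandardCharsets.UTF_8);
        return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
    }

    // Returns the header fields of every record in an uncompressed WARC stream.
    static List<Map<String, String>> readRecordHeaders(InputStream in) throws IOException {
        List<Map<String, String>> records = new ArrayList<>();
        String line;
        while ((line = readLine(in)) != null) {
            if (!line.startsWith("WARC/")) {
                continue; // blank separator lines between records
            }
            Map<String, String> headers = new LinkedHashMap<>();
            while ((line = readLine(in)) != null && !line.isEmpty()) {
                int colon = line.indexOf(':');
                if (colon > 0) {
                    headers.put(line.substring(0, colon).trim(),
                                line.substring(colon + 1).trim());
                }
            }
            // Skip the payload so the next read starts at the following record.
            long remaining = Long.parseLong(headers.getOrDefault("Content-Length", "0"));
            while (remaining > 0) {
                long skipped = in.skip(remaining);
                if (skipped <= 0) {
                    if (in.read() == -1) {
                        throw new EOFException("record payload is truncated");
                    }
                    skipped = 1;
                }
                remaining -= skipped;
            }
            records.add(headers);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WarcHeaderScan <path-to-local-warc.gz>");
            return;
        }
        // Assumption: the segment file is gzip-compressed.
        try (InputStream in = new BufferedInputStream(
                new GZIPInputStream(new FileInputStream(args[0])))) {
            for (Map<String, String> headers : readRecordHeaders(in)) {
                System.out.println(headers.get("WARC-Type") + " "
                        + headers.getOrDefault("WARC-Target-URI", ""));
            }
        }
    }
}
```

Run against a local segment file, this prints one line per record with its `WARC-Type` and, where present, its `WARC-Target-URI`. For real work a dedicated WARC reader library is a better choice than hand-rolled parsing like this.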
