Java and Clojure examples for reading the new WARC Common Crawl files

Originally published January 26, 2014

I just added a Clojure example to my Common Crawl repo. This Clojure example assumes that you have locally copied a crawl segment file to your laptop. In the next week I will add another Clojure example that pulls segment files from S3.

There are two Java examples in the repo for reading local segment files and from S3.

Comments

Popular posts from this blog

My Dad's work with Robert Oppenheimer and Edward Teller

Time and Attention Fragmentation in Our Digital Lives

I am moving back to the Google platform, less excited by what Apple is offering