Open data sources like Metaweb, Wikipedia, and the SEC EDGAR database

July 28, 2008

I just read a blog post from a few months ago by Toby Segaran (author of the very useful book Programming Collective Intelligence) about linking large corporations through the board members they share. Many years ago I did something similar by combining CIA Factbook and SEC Edgar data, and I still have a SQL dump file on my Open Source web page.
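To make the idea of shared board members concrete, here is a minimal sketch in Python. It is not Toby's code or my old SQL schema: the company names, director names, and the `shared_directors` helper are all made up for illustration. It inverts a company-to-directors mapping and then lists every pair of companies that has at least one director in common.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample data: company -> set of board members (made-up names).
boards = {
    "Acme Corp": {"A. Smith", "B. Jones", "C. Lee"},
    "Globex": {"B. Jones", "D. Patel"},
    "Initech": {"C. Lee", "B. Jones"},
}


def shared_directors(boards):
    """Return {(company_a, company_b): set of directors sitting on both boards}."""
    # Invert the mapping: director -> companies whose boards they sit on.
    seats = defaultdict(set)
    for company, directors in boards.items():
        for director in directors:
            seats[director].add(company)

    # Every pair of companies that shares a director gets a link.
    links = defaultdict(set)
    for director, companies in seats.items():
        for a, b in combinations(sorted(companies), 2):
            links[(a, b)].add(director)
    return dict(links)


links = shared_directors(boards)
for (a, b), people in sorted(links.items()):
    print(f"{a} <-> {b}: {', '.join(sorted(people))}")
```

Sorting the company names before pairing them keeps each link under a single key, so (Acme Corp, Globex) and (Globex, Acme Corp) are never counted separately.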

Since Toby works at Metaweb, he fetched the corporate director link data from Metaweb's Freebase. Freebase sets a high standard for ease of finding and extracting information. Other sources, like Wikipedia (via custom web scraping or fetching the entire database) or DBpedia (the RDF extraction of Wikipedia), are not as simple to use, but they are still useful.

I have a long history of organizing and cataloging information, starting in the 1980s at SAIC. Back in the pre-Gopher days, I maintained lists (as plain text files) of where to find useful tools and information on FTP sites on the Internet, and when someone asked me where to find something I would grep my own lists. Things have improved a lot since then :-)

I just finished the rough draft of an article on the Semantic Web this morning. Although standards like RDF/RDFS/OWL/SPARQL are very useful, I expect the Semantic Web to also have a strong ad hoc component. However, ad hoc information sources may have standard interfaces built for them (e.g., SPARQL endpoints).
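As a rough illustration of what a standard interface buys you, here is a sketch of how a client might prepare a query for a SPARQL endpoint using only the Python standard library. The endpoint URL (DBpedia's public endpoint), the example query, and the `build_sparql_request` helper are my assumptions for the example and are not tied to any particular data source mentioned above. The request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: DBpedia's public SPARQL endpoint, used purely as an example.
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: a handful of resources typed as companies.
QUERY = """
SELECT ?company WHERE {
  ?company a <http://dbpedia.org/ontology/Company> .
} LIMIT 5
"""


def build_sparql_request(endpoint, query):
    """Build a GET request that passes the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for results in the SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = build_sparql_request(ENDPOINT, QUERY)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it. The same function should work for any endpoint that accepts queries over HTTP GET, whatever ad hoc data sits behind it.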
