Showing posts from 2011

(re) learning Clojure

I have been using Clojure for about 25% of my consulting work in the last 2 years, read two books on Clojure, and I had some Clojure examples in a book I wrote last year. That said, I don't really feel "expert" at the language the way I do with Java, Ruby, and Common Lisp. I am trying to fill in some gaps by carefully reading through the Clojure code of one of my customers, and all of the Clojure libraries that I use, like Noir, Compojure, etc. I am trying to pick up more idioms. I enjoy seeing a new trick in someone else's code and then going back to my own code to improve it.

Using the New York Times Semantic Web APIs

I am working on a side project of my own in Clojure using the AllegroGraph 4 and Stardog RDF repositories (thanks to Franz and to Clark & Parsia for licenses to use their products!) and my own NLP code. I am using the excellent NYT data access APIs to get research/test data. I am going to show you some simple examples in Ruby for accessing the NYT Semantic Web APIs that are free to use up to 5000 API calls a day. I also use other NYT APIs. Each API has an access key that you need to sign up for. I set my access keys as environment variables that I access in my code; for example in Ruby: # New York Times API Keys: NYT_SEMANTIC_WEB = ENV['NYT_SEMANTIC_WEB'] NYT_SEARCH = ENV['NYT_SEARCH'] NYT_NEWSWIRE = ENV['NYT_NEWSWIRE'] NYT_PEOPLE = ENV['NYT_PEOPLE'] NYT_TAGS = ENV['NYT_TAGS'] In the following code snippets, I am only using the Semantic Web APIs. I want to first search for available concept types and concept names, based on keyword sea
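A slightly more defensive way to load keys from the environment can be sketched as follows. This is an illustration, not the code used in the project: the helper name `nyt_key` is hypothetical, and the only library call involved is Ruby's standard `ENV.fetch`, which runs its block when the variable is missing.

```ruby
# Hypothetical helper (not part of any NYT library): look up an API key
# in the environment and fail loudly if it has not been set, instead of
# silently carrying a nil key into later requests.
def nyt_key(name)
  ENV.fetch(name) do
    raise KeyError, "environment variable #{name} is not set"
  end
end

# Example usage, assuming the variable was exported in your shell:
#   NYT_SEMANTIC_WEB = nyt_key('NYT_SEMANTIC_WEB')
```

Failing at startup makes a missing key obvious right away, rather than showing up later as a rejected API call.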

Closer to the metal: Clojure, Noir, and plain old Javascript

I am wrapping up a long term engagement over the next five to six weeks that uses Java EE 6 on the backend, and SmartGWT (like GWT, but with very nice commercially supported components) clients. As I have time, I am starting up some new work that uses Clojure and Noir, and it is like a breath of fresh air: I keep a repl open on the lein project and also separately run the web app so any file changes (including the Javascript in the project) are immediately reflected in the app. Such a nice development environment that I don't even think about it while I am working, and maybe that is the point! As I have mentioned in previous blog posts, I really like the Clojure Noir web framework that builds on several other excellent projects. Developing in Noir is a lot like using the Ruby Sinatra framework: it handles routes and offers template support options, but it is largely a roll-your-own environment.

Ruby Sinatra web apps with background work threads

In Java-land, I have often used the pattern of writing a servlet with an init() method that starts up one or more background work threads. Then, while my web application is handling HTTP requests, the background threads can be doing work like fetching RSS feeds for display in the web app, performing periodic maintenance like flushing old data from a database, etc. This is a simple pattern that is robust and easy to implement with a few extra lines of Java code and an extra servlet definition in a web.xml file. In Ruby-land this pattern is even simpler to implement: require 'rubygems' require 'sinatra' $sum = 0 Thread.new do # trivial example work thread while true do sleep 0.12 $sum += 1 end end get '/' do "Testing background work thread: sum is #{$sum}" end While the main thread is waiting for HTTP requests the background thread can do any other work. This works fine with Ruby 1.8.7 or any 1.9.*, but I would run this in JRuby for a lon
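The background-thread pattern described above can be sketched without Sinatra, using only the Ruby standard library. The names `$sum_lock`, `$worker`, and `current_sum` are made up for this illustration, and the Mutex is an addition of mine rather than something the original snippet used; it simply makes the shared counter safe on runtimes with truly parallel threads.

```ruby
# Shared state updated by the background thread and read by the main thread.
$sum = 0
$sum_lock = Mutex.new

# Trivial example work thread: it loops forever and is torn down
# automatically when the main program exits.
$worker = Thread.new do
  loop do
    sleep 0.01
    $sum_lock.synchronize { $sum += 1 }
  end
end

# Stand-in for what a request handler would do: read the shared state.
def current_sum
  $sum_lock.synchronize { $sum }
end
```

In a web app, the route handler would call something like `current_sum` to report what the worker has done so far.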

Using the Stardog RDF datastore from JRuby

I was playing with the latest Stardog release during lunch - the way to quickly get going with the included Java examples is to create a project (I use IntelliJ, but use your favorite Java IDE) and include all JAR files in lib/ (including all nested directories) and the source under examples/src. 6/21/2012 note: I just tried these code snippets with the released version 1.0 of Stardog and the APIs have changed. I took the first Java example class ConnectionAPIExample and converted the RDF loading and query part to JRuby (strange formatting to get it to fit the page width): require 'java' Dir.glob("lib/**/*.jar").each do |fname| require fname end setupSingletonSecurityManager() com.clarkparsia.stardog.StardogDBMS.get(). createMemory("test") CONN = com.clarkparsia.stardog.api."test").connect() CONN.begin() CONN.add().io().format(
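A note on the glob pattern used to load the JARs: in Ruby's Dir.glob, `**` only recurses into subdirectories when it is written as `**/`; otherwise it behaves like a plain `*`. The sketch below shows the difference on a throwaway directory tree. The helper names `find_jars` and `demo_jar_listing` and the file names are invented for the illustration, and nothing here touches Stardog.

```ruby
require 'tmpdir'
require 'fileutils'

# Returns the JAR files directly under dir/lib and in any nested directory.
# The "**" path segment followed by a separator matches zero or more
# directory levels.
def find_jars(dir)
  Dir.glob(File.join(dir, 'lib', '**', '*.jar')).sort
end

# Builds a small temporary tree and compares the non-recursive pattern
# "lib/**.jar" with the recursive pattern used by find_jars.
def demo_jar_listing
  Dir.mktmpdir do |root|
    FileUtils.mkdir_p(File.join(root, 'lib', 'nested'))
    FileUtils.touch(File.join(root, 'lib', 'a.jar'))
    FileUtils.touch(File.join(root, 'lib', 'nested', 'b.jar'))
    flat = Dir.glob(File.join(root, 'lib', '**.jar')).map { |f| File.basename(f) }
    deep = find_jars(root).map { |f| File.basename(f) }
    [flat, deep]
  end
end
```

Under JRuby, each path returned by `find_jars` could then be passed to `require`, as in the loop in the post.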

Experimenting with Google Cloud SQL

I received a beta invite today and had some time to read the documentation and start experimenting with it tonight. First, the best thing about Google Cloud SQL: when you create an instance you can specify more than one AppEngine application instance that can use it. This should give developers a lot of flexibility for coordinating multiple deployed applications that are in an application family. I think that this is a big deal! Another interesting thing is that you are allowed some access to the database from outside the AppEngine infrastructure. You are limited to 5 external queries per second but that does offer some coordination with other applications hosted on other platforms or host providers. Their cloud SQL service is free during beta. It will be interesting to see what the cost will be for different SQL instance types. It was very simple getting the example Java app built and deployed. I created a separate SQL instance (these are separate from other deployed AppEng

The quality of new programming languages is apparent by looking at projects using the language

The community growing around the Clojure language is great. While the Clojure platform is still evolving (quickly!), browsing through available libraries, frameworks, and complete projects is amazing. My "latest" favorite Clojure project is Noir, which simply provides a composable mechanism for building web applications (using defpartial). I get to use Noir on two customer web app projects (and some work with HBase + Clojure) over the next month or two, and I am looking forward to that. The simpler of the two web apps is an admin console exposing some APIs on a private LAN and the Try Clojure web app is a great starting point, as well as an example of a nicely laid out Noir application. Since Clojure is such a concise language I find it easy to read through, understand, evaluate, and use projects. Since I am still learning Clojure (I have just used Clojure for about 6 months of paid work over the last couple of years) the time spent reading a lot of available code to find

Writing a simple SQL data source for the free LGPL version of SmartGWT

While travelling back from a vacation I cleaned up some old experimental code for writing a fairly generic SmartGWT data source with the required server side support code. The commercial versions of SmartGWT have support for connecting client side grid and other components to server side databases. For the free version of SmartGWT you have to roll your own and in this post I'll show you a simple way to do this that should get you started. Copy the sample web app that is included in the free LGPL version of SmartGWT and make the modifications listed below. I also set up a Github project that contains everything ready to run in IntelliJ. The goal is to support defining client side grids connected to a database using a simple SQL statement to fetch the required data using a custom class SqlDS . I had to strangely format the following code snippets to get them to fit the content width for my blog: ListGrid listGrid = new ListGrid(); listGrid.setDataSource( new SqlDS

Annoyed by anti-MongoDB post on HN

I am not going to link to this article - no point in giving it more attention. The anonymous post claimed data loss and basic disaster using MongoDB. I call bullshit on this anonymous rant. Why was it posted anonymously? I am sitting in an airport waiting to fly home right now: just finished extending a Java+MongoDB+GWT app and I am starting to do more work on a project using Clojure+Noir+MongoDB. I do have a short checklist for using MongoDB: For each write operation I decide if I can use the default write and forget option or slightly slow down the write operation by checking CommandResult cr = db.getLastError(); - every write operation can be fine tuned based on the cost of losing data. I usually give up a little performance for data robustness unless data can be lost with minimal business cost. I usually use the journalling option. Use replica pairs or a slave. I favor using MongoDB for rapid prototyping and research. I use the right tool for each job. PostgreSQL, variou

Notes on converting a GWT + AppEngine web app using Objectify to a plain GWT + MongoDB web app

There has been a lot of noise in blog-space criticizing Google for the re-pricing of AppEngine services. I don't really agree with a lot of the complaints because it seems fair for Google to charge enough to make AppEngine a long term viable business. That said, I have never done any customer work targeting the AppEngine platform because no one has requested it. (Although I have enthusiastically used AppEngine for some of my own projects and I have written several AppEngine and Wave specific articles.) I still host on AppEngine. I wrote a GWT + AppEngine app for my own use about a year ago, and since I always have at least one EC2 instance running for my own experiments and development work I decided to move my app. It turns out that converting my app is fairly easy using these steps: Copy my IntelliJ project, renaming it and removing AppEngine facets and libraries. Add the MongoDB Java required JARs I had all of my Objectify datastore operations in a si

Recent evaluations of web frameworks while on vacation

My wife Carol and I have been visiting family in Rhode Island this week and since our grandkids are in school on weekdays, I have had a lot of time to spend writing the fourth edition of my Java AI book and also catching up on reevaluating web frameworks. Although my main skill sets are in data/text mining, general artificial intelligence work and Java server side development, I do find myself spending a lot of time also writing web applications. In the last few years, I have done a lot of work with Rails (and some Sinatra), GWT, and most recently with SmartGWT because one of my customers really liked SmartGWT's widgets. (Note: if you are in the San Jose area and want to work on a SmartGWT project with me, please email me!) For my own use, because I have strong Java and Ruby skills, the combination of Rails, GWT, and SmartGWT works very well for me when I need to write a web app. That said, I have spent time this week playing with Google's Closure Javascript tools and l

Anyone know any SmartGWT and Java developers looking for a job?

A call out for some help: one of my favorite customers is looking for a SmartGWT and Java developer in the San Jose area - anyone know anyone good + available?

Common Lisp example code for my Semantic Web book is now LGPL licensed

A few days ago I re-released the Java, JRuby, Clojure, and Scala example code for my JVM languages edition of my Semantic Web book under the LGPL. I just did the same thing today for the Common Lisp edition of this book: Github repository

Changed license from AGPLv3 to LGPLv3 for example code in my book "Practical Semantic Web and Linked Data Applications, Java, Scala, Clojure, and JRuby Edition"

Here is the GitHub repository for the source code and all required libraries. See my open content web page, where you can download a free PDF of my book or follow the link to Lulu to buy a print version. Enjoy!

Semantic Web, Web 3.0, and composable systems

I really enjoyed Steve Yegge's long post last week about the shortcomings of Google's architecture. Google provides great services that I use every day but building systems as Amazon does of composable web services (AWS) to build more complex products and services seems like a better approach. I have been experimenting with Semantic Web (SW) technologies since reading Tim Berners-Lee, James Hendler, and Ora Lassila's 2001 Scientific American article. I have not often had customer interest in using Semantic Web technologies and I think that I am starting to understand why people miss the value-add: Just as AWS provides composable web services SW helps information providers to provide structured and semantically meaningful data to customers and users who decide what information to fetch, as they need it. These consumers of SW data sources must have a much higher skill set to build automated systems compared to a user of the web who manually navigates around the web to fi

A letter to my friends and family: the death of American democracy: not dying, but already dead

Hello family and friends, Democracy in our country is dead, but you would not know it from reading the highly censored corporate-owned and controlled "news"/propaganda media. If you look to foreign news or youtube or the general Internet or talk to friends in foreign countries that have a free press, you will understand that it is not rank and file cops, but their supervisors committing what I think can only be called illegal brutality against the "occupy" movement. The high-ranking police do this because they are ordered by their puppet-masters to do so. There is a huge disparity between what the general public wants and what the corporate lackeys in Congress and the corporate lackey Obama (following in the uber corporate lackey W. Bush's footsteps) do. As Warren Buffett said in a recent interview, the USA is now a plutocracy, and that is a shame. Good writeup on a writer's arrest: I enjoy Naomi Wolf's work - a writer with reasonable views. Our

Appreciating Steve Jobs and the people taking part in "Occupy Wall Street"

First: my condolences to Steve Jobs's family and friends. He was an awesome guy who lived on his own terms and made the world a better place by doing things that he loved and was proud of. I would also like to give a shout out of appreciation to the broad spectrum of Americans who are taking part in "Occupy Wall Street." They are facing state sponsored brutality: the elite class doesn't like the legal protests so they put pressure on the government and government influences police to do things that in their hearts they know are not right. I have been reading a lot of strong criticism of the police for their brutality in New York City against mostly peaceful American citizens exercising their first amendment rights - I personally try to not blame the police because I think it is more accurate to blame the people who control them. There are shocking videos on youtube of police brutality against US citizens in New York City during these protests and I thought about putt

Experimenting with Clojure 1.3 and Noir 1.2

Noir is a Clojure "mini framework" that is built on top of Compojure. Chris Granger released a new version today that is updated for Clojure 1.3. After working mostly in Clojure last year but not using Clojure very much this year (lots of work for a Java shop) I decided to check out both Clojure 1.3 and Noir 1.2 this afternoon - and I liked what I saw. The Noir example application uses a recent version of clj-stacktrace and stack traces are much better: Noir prints a well formatted stack trace on any generated web page if an error occurs. This stack trace is very good: it filters out information that you really don't want to see, identifies where the error occurred, and usually includes a useful error message. This eliminates the only major complaint I have ever had with Clojure. Very cool! The Noir web site had a link to an article written by Ignacio Thayer on running a Clojure Noir MongoDB app on Heroku, using a free MongoDB account. Worked great. I made a trivial c

JPA 2 is the only part of Java EE 6 that I like a lot - how it compares to ActiveRecord

First, in Ruby-land: I am a huge fan of both Datamapper and ActiveRecord. Here I am only going to talk about ActiveRecord because it is freshest for me because I have been reading through a few Rails specific books that Obie Fernandez's publisher Addison-Wesley sent me review copies of earlier this year: these books use ActiveRecord 3.*. Recently I created two small throw-away learning apps using Rails 3.1 to kick the tires on new features and I used ActiveRecord for each. One of my customers is a Java EE 6 shop (although we now do use SmartGWT for web apps) and I have been using JPA 2 (Hibernate provider) a lot. In Java-land, I can't imagine using anything else to access relational databases unless you want to use the Hibernate APIs directly, and I would not be inclined to walk that path. I used to approach object modeling and design differently in Ruby: I would usually start with a relational database and use ActiveRecord's (fairly) automatically generated wrapping AP

Finally saw movie "Crazy Heart" - thinking that an AI could write country western songs

Inspired by James Meehan’s Tale-Spin program and thesis, I have to say: writing an AI program to write country western music seems possible. Prof$t!

For work I have been using GWT/SmartGWT. For fun: Seaside and Pharo

I have been enjoying working on two customer web apps written in SmartGWT. SmartGWT is built on Google's GWT with the addition of Isomorphic's smart client library that implements very nice data grids and other UI components and also has good support for wiring rich clients to data sources. Still, there is a lot of ceremony involved in GWT and SmartGWT development so I would recommend these technologies for large projects. For me this ceremony and large learning curve is well worth it because I like coding both rich client and server side components in Java in one development environment (IntelliJ). For side projects that require a web UI I like using both Play! and Rails (and a lot of Rails development work in the last 3 or 4 years). Just recently as another side learning project I have been revisiting the Seaside continuation based web framework for Smalltalk. This week I bought the PDF version of "Dynamic Web Development with Seaside" and when I get bits of fr

Google+ Developer APIs

I received an email from Google today about the release of APIs to access public data. I looked at the Ruby example (a Sinatra App) and the Java JSP example during lunch. Looks like a good kick-start for using public Google+ data in our own web apps. There are also examples for other languages. If I have time this weekend I would like to try deploying an app of my own.

Getting set up to work on the 4th edition of my Java Artificial Intelligence book

For the 3rd edition, I used Eclipse for both development of the Java examples and to prepare the LaTeX manuscript (using the Eclipse LaTeX plugin TeXlipse). I really prefer using IntelliJ for Java development and TeXShop (Mac OS X only) for editing LaTeX files so I just converted my writing setup. I have a fairly good idea of what new topics I want to cover but I am still deciding what material from the 3rd edition I want to remove. Since I released the 3rd edition over three years ago, I have averaged about 300 downloads of the free PDF version a day with a few sales of the print edition each month. I like making a free version available for people to read, and generating traffic for my web site where I advertise my consulting services is great. Unfortunately, in the last couple of years when I see my book in search results it is very often on someone else's web site which violates the conditions of the Creative Commons non-commercial use license for the PDF version of my boo

Changing the way we use the Internet

Unless searching for online docs, looking up error codes and error messages, etc., I do relatively little web search and browsing anymore - compared to even a year ago. I usually rely on good links from Twitter and Google+ to find things worth reading, keep up with new tech, and sometimes even read the news. In the 1980s, I was a "find useful stuff at public FTP sites" resource at SAIC. I spent time maintaining lists of useful FTP sites and what they contained so I could help people quickly find stuff. Gopher was a step up. Good search engines were a huge improvement for finding stuff on the web. Now I find myself mostly depending on what interesting people recommend. Even though I am a techie and don't represent a typical Web user, I still think that the trend of using social media to find interesting (and even useful!) material is widespread. It will be interesting to see how the major web companies like Google, Amazon, Microsoft, Yahoo, etc. perform financially in

What I have been working on lately

It has been a while since I blogged. Carol and I are leaving soon on a driving trip with our grandkids, daughter and son-in-law - it was requested that we both leave our laptops at home, a request that we are planning to honor. So, since I will be without a computer for about 10 days, here is a quick catch-up on what I have been doing: I have been fairly busy lately working for two customers. At a friend's company we are writing rich client web applications using SmartGWT and Java EE 6 on the backend. I have also been working for Compass Labs on building a large graph database and also doing some data mining of Freebase data. Google bought Freebase last year and is working on new features and APIs. I am in a private beta for their new Freebase APIs - good stuff!

Second edition of "Semantic Web for the Working Ontologist"

First, thanks to Morgan Kaufmann Publishers for sending me a copy of the second edition. Dean Allemang and Jim Hendler did a great job of updating the examples and fixing a few small glitches from the first edition. I have been extremely busy with work so I have only been able to spend about 90 minutes so far with the second edition but I hope to give it a careful reading when I am on vacation in a few weeks. This book is an excellent guide for anyone who wants to invest a fair amount of time to learn how to write semantic web enabled applications: a very comprehensive book. I recently bought an eBook copy of Bob DuCharme's short book Learning SPARQL: Querying and Updating with SPARQL 1.1 that I would also like to recommend as a fairly easy introduction to using SPARQL repositories and generally using SPARQL in applications. If two good new semantic web books were not enough, I have also had fun recently experimenting with Clark & Parsia's new Stardog RDF datastore

Working on a new GWT application for a personal project

I have been using SmartGWT a lot for a customer's project so I have been generally digging into both the Google Web Toolkit (GWT) and the SmartGWT system that builds on top of GWT. My personal project is a reimplementation of my Rails application for ( very old placeholder page ) in GWT. I plan on writing a few long blog articles about GWT and/or SmartGWT in the near future but for now I have learned a few things that are worth sharing: Plan ahead on designing data models that essentially live on the client side, that support caching on the client side, and efficiently support asynchronous data fetches from the server. The view pages (written in Java, compiled to compact and efficient Javascript) must always make asynchronous calls to the local (in the browser) data model because it is unknown whether the local model has the data or must itself fetch asynchronously from the server. Stop thinking about session data in the same way that you do in an old style we

Apache Google Wave in a Box project is starting to look good

It has been a long time since I tried to run the wave protocol stuff. I just grabbed the latest source and followed these directions on my MacBook. I was quickly up and running with no problems. The web UI is a lot simpler than Wave but looks similar. Wave in a Box is looking good as a development platform - something to customize for your organization. I used two browsers, Chrome and Safari to create two test accounts, and as I expected, the real time messaging, etc. worked fine.

Google+ seems to be very well done

Thanks to Marc Chung for the Google+ invite. The web UI is very slick and I know I am going to have fun with g+. Anyone on Google+, let me know if you want to be in my Artificial Intelligence, Clojure, and/or Ruby groups.

I am using SmartGWT on two projects

My currently largest customer uses the commercial version of Isomorphic's Smart GWT and I have spent a few evenings working on one of my own projects (that I will probably open source when it is done) that uses the free LGPL version. 7/9/2011 edit: I ended up re-writing this in straight-up GWT. I should also mention that I have signed a consulting agreement with Isomorphic (svn commit rights :-) but I have not had time in my schedule to do any work for them yet. SmartGWT uses Google's Java to Javascript compiler but instead of using the standard GWT UI components it uses Isomorphic's SmartClient Javascript library (suitably wrapped for extending GWT). The commercial version's sweet spot is reasonably easy integration with server side data sources like relational databases, JSON web services, etc. The free LGPL version provides an example of a client side data source that you can hook up with custom code to web services that you write yourself. For my for-fun side pr

Prelude to learning Clojure and Scala: learn some Haskell

I worked through part of the "Real World Haskell" book a few years ago, but settled on mostly using Clojure as a functional language, with some Scala also. I bought "Learn You a Haskell for Great Good!" this week and I have been enjoying the gentle approach to learning the language. Miran Lipovača did a good job writing this book. ("Real World Haskell" is also excellent.) One thing that occurred to me is that since Clojure and Scala borrow so many good ideas from Haskell, learning some Haskell before diving into either Clojure or Scala might be a good idea.

Largest public SPARQL endpoint:

The Sindice project has transitioned from a university and consortium project of DERI to a commercial company. Check out the SPARQL endpoint web form - very impressive. During lunch I tried using the Sindice Java client library and it was easy to use but does not for some reason support direct SPARQL queries.

Programmer study time

I love both of my jobs (programming and writing) as long as I don't overdo it and take a lot of time off for other activities like hiking, kayaking, playing musical instruments, and cooking. I have another "down time" activity that is both fun and relaxing for me: studying things that help me with my jobs. For example, when I first adopted Ruby as my primary scripting language and also started developing using Rails, I spent a lot of time reading through the C implementation of Ruby, the Ruby libraries, and the Rails source code. I find this kind of study relaxing because there are no deliverables, and things learned studying the implementation of tools I use really pay off in increased productivity and in learning new programming idioms and techniques. I used to base a lot of my work on the Tomcat server and ten years ago I made a real effort to understand its implementation. When I was very young I worked as a systems programmer and kept source listings of interesting pa

Polyglot programmers: setting up multi-language access to data

For those of us who tend to use several programming languages, doing some up-front work to make sure that we have access to all required data stores from each language we use makes it possible to really pick the best language for each task. I have a side project that I have been coding on for over ten years, with lots of code in Common Lisp, Scheme, Java and Ruby. If I can ever get a long enough break from my consulting business I have a few good ideas how to monetize some of this work. I was working on this project during a recent cross-country flight and I started adding a bit of new functionality in Ruby, switched to Common Lisp, and the implementation was a bit easier. I needed access to a PostgreSQL database that I use (using PostgreSQL's built in text indexing and search) and once I had Internet access after the flight (for some quick reference), I worked out the few lines of code to interface with my annotated news data collection. Here is a small code snippet in case you need

Text search in SimpleDB: a Ruby example

You might want to use SimpleDB for storage and to support text indexing and search if you did not want to manually run and administer Solr yourself. Here is a little snippet that shows how to store searchable documents in SimpleDB: require 'rubygems' require 'aws_sdb' SERVICE = # assuming that this domain is already created DOMAIN = "some_test_domain_7854854" class Document def initialize name, text words = (name + ' ' + text).downcase.split.uniq attributes = {:words => words, :text => text} SERVICE.put_attributes(DOMAIN, name, attributes) end def query # The last inject takes the intersection and # ensures that all search terms are present: keys = query.downcase.split.collect {|x| SERVICE.query(DOMAIN, "['words' starts-with '#{x}']")[0] }.inject {|x, y| x & y } keys.collect {|key| SERVICE.get_
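The indexing and intersection idea in the snippet above can be tried without an AWS account. The sketch below replaces the SimpleDB domain with an in-memory hash, so it makes no calls to SimpleDB or the aws_sdb gem; the names `$index`, `index_document`, and `search` are hypothetical and exist only for this illustration.

```ruby
# In-memory stand-in for the SimpleDB domain: document name => unique words.
$index = {}

# Store the lowercased, de-duplicated words of the name and text,
# mirroring the 'words' attribute built in the snippet above.
def index_document(name, text)
  $index[name] = (name + ' ' + text).downcase.split.uniq
end

# For each search term, collect the names of documents that have a word
# starting with that term, then intersect the per-term results so that
# every term must be present in a matching document.
def search(query)
  terms = query.downcase.split
  return [] if terms.empty?
  terms.map { |term|
    $index.select { |_name, words| words.any? { |w| w.start_with?(term) } }.keys
  }.inject { |x, y| x & y }
end
```

For example, after indexing two documents that both contain "quick", a search for "quick fox" returns only the one that also contains a word starting with "fox".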

And the best JVM replacement language for Java is: Java?

Although I use Ruby (mostly Rails) and Common Lisp on many customer projects, I am heavily invested in the Java platform and I don't see that changing in the next ten years or so. Java is more than a little heavy on ceremony however, and I would like a really agile language for the JVM. I have used Clojure a lot in the last year for work on one customer's project but at least for now the lack of concise and useful runtime error backtraces kills some of the joy of using Clojure. Really nice language and community however, and I expect in a few years Clojure may be my primary JVM language. I love coding in Ruby and the JRuby developers do a great job moving the sub-platform forward. However, except for large Rails applications, I don't see myself writing very large applications in Ruby: for me Ruby is a scripting language for getting stuff done quickly and easily. I do like Scala but the learning curve is steep and that means that it is difficult to find pre-trained highly

Some new Platform as a Service providers: Cloud Foundry and Dotcloud

I am on vacation so I have not had much chance to try the beta invites I just received for Cloud Foundry and Dotcloud, but both look promising as works in progress. For now, Cloud Foundry is set up for Ruby Rack applications (like Rails and Sinatra) and Java Spring apps. They currently support MongoDB, MySQL and Redis. They will release the core software if you want to run a cloud on your own servers. Dotcloud supports a wide range of platforms and data stores. Their roadmap shows what is available right now and what is planned. Both beta programs are free for now. It will be interesting to see what the costs are.

(Roughly) comparing Play! version 1.2 with Rails

Both the Play! and Rails frameworks implement MVC and have very agile development environments. Play!, being written in Java (but also supporting Scala development), accomplishes this agility by using the Eclipse incremental Java compiler, so if you edit any Java code or HTML template files (with embedded Java/Groovy expressions) you immediately see the results after refreshing your web browser. While Play! is not nearly as complete a stack as Rails, it does include modules for MongoDB AppEngine Objectify GWT Search PDF generation of any view Scala use CoffeeScript OpenAuth working with Google, Yahoo, Twitter, etc. Simple CRUD scaffolding Facebook Connect and Graph API Lucene search of JPA models etc. I have several years of Rails experience and I am using Java EE 6 for a customer project. With this background, I put Play! in the sweet spot between Java EE 6 and Rails: easy to learn if you know Java and supports agile development. My favorite part of Java EE 6 is JPA,

Amazon Cloud Player: make sure you take advantage of their introductory offer

I just purchased an MP3 album "Johnny Winter And / Live" for $5 and got a $20 one year upgrade of 20 GB of cloud storage - a sweet deal, but considering that you always get 5 GB free this may not be much of an added value. Amazon has a nifty uploader application that looked at my iTunes MP3s and playlists and is cloning that on Amazon Cloud Player automatically. My entire iTunes library will only take up a few GBs after it is automatically uploaded. Sometimes Amazon kills Apple's iTunes store on price: I was about to buy a few tracks on iTunes last year and then realized I could buy the entire album as MP3 on Amazon for not much more. Amazon seems to be investing in introductory offers like the upgrade for Cloud Player and the first time AWS developer's package (basically free to develop and deploy for one year). Certainly expensive for them to provide as free services but Amazon is playing the long game. My ordered list of the most impressive technology compani

The Cloud, The Cloud

Reddit was down for over 5 hours last week because of problems with EBS volumes on AWS. Netflix was down a few hours today: another AWS user, but I don't know yet what the difficulties are. So Amazon has problems, and the web is full of people complaining about occasional problems with Google's AppEngine cloud hosting service. No one likes to not support users 24x7 and the users don't like interrupted service, but I think that these occasional outages are just growing pains as we move towards a new way to deploy applications that costs less money, requires fewer staff resources, and likely is more energy efficient. I think that I have only had one customer in 3 years who did not at least partially deploy on Amazon's AWS. This is the future and we need to learn how to work around problems and take advantage of resource savings when we can.


I read this morning about a new beta book from O'Reilly, Up and Running With Node. Before skimming through the beta book, I took a half hour to review the standard Node.js Manual & Documentation. The beta book is very nice (recommended!) and I especially enjoyed the discussion about using the Node REPL. I was lucky enough to have received a Heroku Node beta invitation last year but so far all I have done with Node is to build new versions every few months and play with the examples in the documentation. I think that this is worth the time, even though I have a very full consulting schedule, because Javascript is probably going to become popular for server side development as it already is for browser development. This will happen because of the efficiency of V8, the ability to share code between server and client side, and the inherent scalability of event-driven, rather than multi-threaded, systems.
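The point about event-driven scalability can be illustrated with a small sketch. It is written in Python with asyncio rather than in Node, purely as an assumption for illustration, and the names handle_request and serve_all are hypothetical. It shows many simulated requests waiting on I/O at the same time while all of them run on a single thread, which is the same idea behind Node's event loop.

```python
import asyncio
import threading


async def handle_request(request_id, delay):
    # Simulate a non-blocking I/O wait (e.g. a database or network call).
    # While this coroutine waits, the event loop runs other coroutines.
    await asyncio.sleep(delay)
    # Report which OS thread handled the request.
    return request_id, threading.current_thread().name


async def serve_all(count=100, delay=0.05):
    # Start all simulated requests concurrently; gather returns results
    # in the same order as the coroutines were passed in.
    tasks = (handle_request(i, delay) for i in range(count))
    return await asyncio.gather(*tasks)


results = asyncio.run(serve_all())
thread_names = {name for _, name in results}
print(len(results), "requests handled on", len(thread_names), "thread")
```

The waits overlap instead of running one after another, so no thread per request is needed. CPU-bound work would still block the loop, which is the usual caveat with this style of server.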

David Rumelhart passed away. RIP to a good guy.

Before David won a MacArthur Grant he was a professor at UCSD and co-wrote a few great books on artificial neural networks that really helped me a lot. My company also hired him as a consultant and he gave me advice when I implemented the 12 currently popular neural network learning and recall algorithms in the 1980s for our ANSim software product. It was a great experience sitting in his living room talking about NNs. He was a nice guy and I am sure that he helped many people with his work.

MongoDB 1.8 released - and there is joy throughout the land

I have been running 1.7.5 on my laptop and just upgraded to 1.8 stable. While the normal way to run MongoDB (at least in my work) is to use read-only slaves for analytics, etc., I am still glad to see the single server robustness changes, including optional journaling. I also noticed that in the admin shell, showing databases provides database size estimates. Another useful change is replica set authentication using identical key files that are placed on each server. You then let one server know about the others (as before). You can read about other improvements here.

Nourish and manage your career, not your job

I have been working close to full time since the beginning of this year for two customers. This is unusual for me since I have usually capped my work week at 32 hours maximum over the last 25 years. I have been enjoying the work and extra earnings but I believe that working too many hours carries a real cost and some risks: It is far more important to manage our own careers than any particular job. You don't own your job but you do own your career. Just as you maintain your home and your car, careers require fairly constant maintenance, including: Lifelong learning of new technical skills. Developing skills that enable people you deal with to also be successful: strive for win-win outcomes. Networking that supports finding work, getting second opinions on important decisions, and leads on new interesting and useful technologies. Time for self analysis: what has worked for you in your career (and life!), possible improvements, and understanding situations and attitudes t

Don't stray too far from well supported language, tool and platform combinations

I have been doing a lot of customer work lately using Java EE 6 and Glassfish. It is a fairly nice development environment even if Java EE 6 is heavy on ceremony (but better than J2EE). Just for fun today I took small play apps I had written in Rails and Clojure + Compojure and shoe-horned them to run on Glassfish. Interesting exercise, but unless there is an overwhelming need to create a custom deployment setup, it is so much better to go with the flow and stick with well crafted and mature setups like: Java and Java EE 6; Clojure + Compojure running with embedded Jetty behind nginx for serving static assets; a Rails app hosted on Heroku; web apps written in either Java or Python hosted on AppEngine using the supported frameworks; etc. I must admit that I enjoy hacking not only code but also deployment schemes - enjoy it too much sometimes. Sometimes it is worthwhile, most often not.

Curated data

It is difficult to predict what data will have long term value so it is often safest to archive everything. With data storage costs approaching zero I think that we can expect high value data to last forever, barring a nuclear war or the crash of society. Curated data has a higher value than saving "everything." I think that the search engine Blekko is interesting and useful because of what it does not have: human powered curation yields fewer results but very little SPAM. The Guardian's curated structured data stores have much higher value than the original raw data (from government sources, etc.). I can imagine The Guardian curated data becoming a permanent part of our history as, for example, are the ancient stone tablets we see in museums. I have long planned on providing curated news and technology data that has semantic markup either on my ancient domain or a new placeholder but I seldom have free time slots because of my consultin

Big Data

For the last few decades, it seems like I work on a project every few years that stretches my expectations of how much data can be effectively processed (starting in the 1980s: processing world-wide seismic data to detect underground nuclear explosions, all credit card phone calls to detect theft, etc.). I was in meetings three days last week with a new customer and as we talked about their current needs I made mental notes of what information they are probably not capturing that they should because it is likely to be valuable in the future in ways that are difficult to predict. To generalize a bit, every customer interaction with a company's web sites should be captured (including navigation trails through web sites to model what people are personally interested in), every interaction with support staff, every purchase and return, etc. Amazon has set a high standard in user modeling with its suggestions for products that you might want to buy. Collecting data on your customers sh

Two good books on AppEngine development

The publisher PACKT sent me a review copy of Google App Engine Java and GWT Application Development by my friend (via email correspondence) Daniel Guermeur and co-author Amy Unruh (thanks!) and I bought Code in the Cloud: Programming Google AppEngine by Mark C. Chu-Carroll. Both are very good books and complement each other. Mark's book gives an interesting insight into AppEngine from someone who works at Google. He covers both Python and Java development. I relied on the Python sections of the book when I wrote a Python based AppEngine application last December. I am not much of a Python programmer but his book got me going quickly and I had few problems. He uses Google's Django support for AppEngine for the Python examples and Google Web Toolkit (GWT) for the Java examples. Daniel's and Amy's book is a hands-on guide to using the Eclipse IDE and GWT to develop AppEngine applications in Java. They use JDO for the book examples which is probably best for the ge

Social networking: why fewer connections may be better

I have a very public web presence from this blog and my web site. I enjoy sharing information and communicating with people via email and occasionally (with a heads-up email first) talking on the telephone. I also spend an hour a week giving free advice to students on their projects, employment hints, and to a more limited degree give feedback on technical ideas. I can enjoy doing this because email is asynchronous: I can handle these interactions when they don't interfere with my work or research. In the past I have accepted connections on LinkedIn and Facebook from people who I don't know, just to be friendly. However, there is a cost to this. LinkedIn frequently sends out email statuses of what colleagues (current and past) are doing. I like this for people I know well either personally or through years of email interactions. However, status updates from people I am not closely associated with take time even to ignore. The situation is worse on Facebook. I used to a

Java EE 6 is actually pretty good

I do most of my development in very agile programming environments like Ruby and Rails (with Datamapper or ActiveRecord), Clojure with MongoDB, etc. I like languages that have an interactive repl. Recently, I have taken on work for a new customer (a lifelong friend's company), helping with some conversion to Java EE 6, and I must say that Java EE 6 is very well done. I wrote a J2EE book many years ago and used to be into Java server side development but drifted to other platforms partly because that is what customers hired me to work on and partly because of my own technical interests. In the last 5 years I have probably spent about 30% of my time working with Java (customers want Lisp and Ruby development). Java EE 6 is so much better than J2EE. Writing POJOs (with EJB annotations), unit testing them as simple POJOs, and then integration testing them in an EJB container makes for a reasonable programming environment. I still think that you get much more bang for your programming buck w

Recommended: Niall Ferguson's "The Ascent of Money"

Tonight I finished watching the DVD of the PBS series, but the book covers the same material. Harvard professor and historian Niall Ferguson puts in its place the (in my opinion also misguided) view that governments can, long term, spend their way out of problems. The PBS series is especially fun to watch (in addition to being very educational) because as Ferguson traces "bubbles" throughout history, there is local video that helped me picture what happened in ancient, medieval, and modern times. Worth watching! (There are many excerpts on YouTube if you don't have time to watch the 4 hour series.) I rented the DVD from Netflix, and their summary is good: British historian and author Niall Ferguson explains how big money works today as well as the causes of and solutions to economic catastrophes in this extended version The Ascent of Money documentary. Through interviews with top experts, such as former Federal Reserve Chairman Paul Volcker and American cur

Happy New Year

I start each day by enumerating the good things in my life (computer scientist-speak for "counting my blessings") followed by meditation and relaxation techniques. A nice way to start each day. I would like to do the same today, the first day of 2011 (note that 2011 is the sum of consecutive primes: 157+163+167+173+179+181+191+193+197+199+211). I am grateful for my family and friends, living in one of the most beautiful places in the world (Sedona, Arizona), having interesting work with great customers, resources to self-fund my own research as a computer scientist, and time to enjoy my hobbies (cooking, hiking and reading). Happy New Year!
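The note above about 2011 being a sum of consecutive primes is easy to check. Here is a minimal sketch in Python, using simple trial division and two hypothetical helper names (is_prime and consecutive_primes), that collects the eleven primes starting at 157 and adds them up.

```python
def is_prime(n):
    # Trial division is plenty for numbers this small.
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True


def consecutive_primes(start, count):
    # Collect `count` primes in increasing order, beginning at `start`,
    # without skipping any prime in between.
    found = []
    candidate = start
    while len(found) < count:
        if is_prime(candidate):
            found.append(candidate)
        candidate += 1
    return found


primes = consecutive_primes(157, 11)
total = sum(primes)
print(" + ".join(str(p) for p in primes), "=", total)
```

Because the helper never skips a prime, matching the list in the post confirms both that the terms are consecutive and that they sum to 2011.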