Showing posts from October, 2005

Great Java open source project: Nutch search engine

Nutch is an Apache-licensed open source search engine project that I have been keeping an eye on for a while. One thing that makes this project especially compelling is that Doug Cutting, the author of the (fabulous) Lucene search library, is also a principal designer and implementer of Nutch. You can grab the source code using Subversion (svn co). Nutch now contains two new modules: the Nutch Distributed File System (patterned after the Google File System) and a Java version of MapReduce (patterned after Google's MapReduce). So far, I have only been looking at the source code (no builds and playing with it yet!) but this stuff looks really good. Anyone want to start a search engine company? :-)
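
For readers who have not met MapReduce before, here is a minimal single-process sketch of the idea, using word counting as the example. It does not use Nutch's MapReduce classes: the class name WordCount and its methods are made up for illustration, and a real implementation would spread the map and reduce calls across many machines.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not part of Nutch.
final class WordCount {
    // Map phase: emit a (word, 1) pair for every word in one document.
    static List<Map.Entry<String, Integer>> map(String document) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : document.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: combine all values emitted for one key.
    static int reduce(String word, List<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        return sum;
    }

    // Driver: run map on each document, group the pairs by key (the
    // "shuffle" step), then run reduce once per key.
    static Map<String, Integer> run(List<String> documents) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String document : documents) {
            for (Map.Entry<String, Integer> pair : map(document)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            counts.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return counts;
    }
}
```

Because each map call sees only one document and each reduce call sees only one key, both phases can be run in parallel, which is the property that makes the model attractive for search engine workloads.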

Good update: RadRails 0.4 released today

RadRails version 0.4 was released today. It is a good upgrade: RHTML files are now color syntax highlighted, although with a default black background color that can be easily changed - what is up with that? :-) The code assist support is also useful.

Efficiency (or lack of) in Java reflection: glad we have it anyway

It is good that the java.lang.reflect package exists. This interesting read (a PDF file) shows a great example of using a decorator/proxy/delegation pattern for implementing a logging property across multiple classes. Neat stuff, and I like this better than, for example, using Aspect Oriented Programming (AOP) to get the same effect: adding logging without touching the original POJO classes. However, Java reflection is not very efficient, and I have seen at least one very large Java application that has poor performance because it performs lots of reflection. Other languages like Ruby have better support. The Common Lisp Object System (CLOS), with support for things like before/after methods, class slot introspection, etc., is also very effective for applications requiring more flexibility than Java.
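
The proxy approach can be sketched with java.lang.reflect.Proxy. This is only a minimal illustration, not the code from the article mentioned above: the Greeter interface, SimpleGreeter, and LoggingHandler are hypothetical names, and the "log" is just an in-memory list.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical service interface, used only for illustration.
interface Greeter {
    String greet(String name);
}

// Plain implementation that knows nothing about logging.
class SimpleGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Records every call, then delegates to the wrapped object.
class LoggingHandler implements InvocationHandler {
    private final Object target;
    final List<String> log = new ArrayList<>();

    LoggingHandler(Object target) {
        this.target = target;
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        log.add("calling " + method.getName());
        try {
            // Reflective dispatch: this is where the runtime overhead comes from.
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // surface the target's own exception
        }
    }

    // Builds a proxy that implements iface and routes every call through handler.
    static <T> T wrap(Class<T> iface, LoggingHandler handler) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] {iface}, handler);
        return iface.cast(proxy);
    }
}
```

A caller would write something like `Greeter g = LoggingHandler.wrap(Greeter.class, new LoggingHandler(new SimpleGreeter()));` and then use `g` exactly like the original object. Every call pays for a reflective `Method.invoke`, which is the cost discussed above.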

I add spell checking/correcting to online word processor

I had a little time this weekend to add spell checking/correcting to my online word processor. There are some general directions on the "About" page. This system is still in beta (I have just written it in the last week and a half in my spare time) so please email me with any bug reports. Thanks!

WebSphere Community Edition: not available until the end of the year

I was enthusiastic about the announcement from IBM: bundling WebSphere Community Edition with Eclipse development plugins etc. sounded great but after a fair amount of searching I saw that it will not be available until the end of the year. I don't intend to be critical of IBM: I appreciate their Java SDK for the PowerPC architecture, their support of Linux, etc., etc. However, it seems like this early announcement is just to generate some buzz for IBM's purchased Gluecode and perhaps to rain on JBoss's parade. While I am very happy with my development environment for simple JSP, JavaBeans, custom tag libraries, etc. I would like a super productive environment for use when I want to use more of the J2EE stack. The JBoss Eclipse IDE looks like it may be a possibility and IBM's WebSphere Community Edition looks like another possibility - I think that it is worth waiting for this to sort itself out. Meanwhile, I find it ironic that my favorite Ruby Rails development syst

Classic computer science text and papers on the web

I have seen the old Frank Sinatra movie that my wife is watching so I decided to read tonight. I still appreciate my rather large personal library because physical books, journals, and reprints are more fun to read than balancing a laptop. That said, I thought that it would be interesting to share some of my favorite links to classic computer science texts and papers:

- Structure and Interpretation of Computer Programs is arguably the best single textbook
- Original 'Lambda Papers' by Guy Steele and Gerald Sussman
- General reading on Kolmogorov Complexity; for example: 1 2

This is just a sampling taken from my bookmarks. The ACM has a program to make classic texts available. Other must-study 'classics' like "Introduction to Algorithms" (Cormen, et al.) are still too new (first edition 1990) to be freely available. I find it interesting to re-read material that I used many years ago - you get a different take on papers and texts after years of experience, making

updated: added document export and search (or: sometimes it is good to be lazy)

I updated my new web portal to support search and exporting all of a user's documents compressed into a ZIP file. I only had an hour or so free today to work on this, so I got lazy: I had intended to export the rich text documents that a user keeps on the web portal as OpenDocument formatted files for, AbiWord, etc. I often read (programmatically, that is) OpenDocument files, which is easy: each one is a ZIP file, so you just grab the content, styles, or whatever else you need as ZIP file entries. The problem was that it looked like a multi-hour task to take the internal rich text format used on the web portal (which is really just HTML snippets) and generate equivalent looking OpenDocument formatted document files. So, I got lazy and just bundled all of a user's documents in one ZIP file for users to download. It turns out that imports these ill-formed HTML files just fine - so, I am happy that I did not spend the time right now generating OpenDocument files.
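
Both halves of this (bundling documents into one ZIP, and reading entries back out of a ZIP such as an OpenDocument file) can be sketched with java.util.zip. This is a minimal in-memory illustration, not the portal's actual code: the class name DocumentZip, its method names, and the file names in the usage note are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, for illustration only.
final class DocumentZip {
    // Bundles documents (file name -> HTML snippet) into one ZIP archive.
    static byte[] bundle(Map<String, String> docs) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, String> doc : docs.entrySet()) {
                zip.putNextEntry(new ZipEntry(doc.getKey()));
                zip.write(doc.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads every entry back as UTF-8 text. The same loop can pull
    // "content.xml" or "styles.xml" out of an OpenDocument file.
    static Map<String, String> readEntries(byte[] archive) {
        Map<String, String> entries = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            byte[] buffer = new byte[4096];
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                entries.put(entry.getName(),
                        new String(out.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return entries;
    }
}
```

In a web application the bytes returned by `bundle` would be written to the HTTP response with a ZIP content type; the HTML snippets are stored untouched, so ill-formed markup goes into the archive exactly as it was saved.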

New web app for editing rich text documents (or: Rails vs. Java saga continues)

I have been experimenting with parallel development of a web app in both Ruby on Rails and Java/JSPs. I just hosted the Java version at if you want to try it. I did diverge a lot in the two parallel implementations: I ended up making the Java version for single users while the Ruby on Rails version has groups and access control lists. For fun, I also used different JavaScript rich text editing libraries for the two projects. I used a very light weight library for the Java version at because I intend to leave that web application running and let people use it for online document editing. One word of warning: I don't have the export to OpenDocument format implemented yet - until I do, there will be no way to transfer your online writing to your local word processor. Anyway, I started this project for my own use, so the Java single user (vs. groupware) version meets my needs and will be the version that I continue to work on and maintain. Anyone else is also w

Rife with automatic CRUD scaffolding worth looking at

I have been looking at Rife (a component based Java server side framework) for a while, but have not invested the time to do anything other than read through tutorial examples. has released an automatic CRUD scaffolding framework that at least on the surface is a little like Ruby Rails scaffolding. You can read about it here. Rife CRUD scaffolding customization does not look nearly as flexible as Ruby Rails scaffolding, but that is fair: Rife does its scaffolding at runtime so there is no code to tweak - different than Ruby Rails where you can edit the generated .rhtml, Ruby controller, etc. files. I like the superb runtime performance of simple JSPs (with custom tag libraries) but there is no reason to believe that the runtime overhead of Rife and Rife CRUD scaffolding would be too bad -- servers are cheaper than programmers (usually, except for very large scale deployments) so the agile nature of Rife looks good. I am hoping to have time to download Rife CRUD later today or

Great interview with Brent Scowcroft

I have always admired Brent Scowcroft. He was the first President Bush's national security advisor, and is well known as a realist - I think that he is the kind of guy who looks first to what is good for our country, which is why he is widely admired by both Democrats and Republicans. This is a great article. Everyone should enjoy this and glean more insight into Washington politics and power. Great read!

Getting back to simplicity

There is a trend in computer usage that I like: getting back to simplicity. Unless you want them to for entertainment, computers should simplify your life, not make things more complex. I think that this goes for developing software also. As a Java developer, I used to want to know and understand every nook and cranny of the language, standard libraries, the JVM, and the entire J2EE stack. Now, I find myself concentrating just on those technologies that are most useful for getting work done: on the server side JavaBeans, JSPs, custom tag libraries and a small number of persistence strategies. On the fortunately rare occasions that I need to write JFC based clients, I simply use the NetBeans UI designer to bang out the framework with event handlers stubbed out. Simplicity of tools and approach, and concentrate on problem solving... I have also been really getting into the simplicity and consistency of the Ruby on Rails framework. Except for very large scale web application deployments,

Side by side comparison of Ruby on Rails vs. JSP + JavaBeans

As both a research project and to build something for my own use, I have been (as time permits) developing a web app that lets me use JavaScript on the client side to do styled/rich text editing and to save edited files on one of my servers. What is different here is that I do a little development on one platform, then on the other - but being sure to use both RoR and JSPs+JavaBeans for new coding (vs. simply porting what I have already done). I am finding Ruby on Rails to be a faster prototype/development environment, mostly because it takes over 10 seconds to re-test under Tomcat (remote access via IntelliJ) and re-testing under RoR is almost instantaneous. Ruby also is a little faster to code in. No way can I put an accurate number on how much faster Ruby on Rails development is, but I would estimate that I save about 20% of development time - not a huge deal, but nice. This comparison is actually comforting from a Java perspective: for customers who prefer Java on the server, the e

The more things stay the same: distributed tuple space toolkits like Jini

Long ago at SAIC, my old friend Tim Kraft (now doing great things at Overture/Yahoo) and I did a commercial product that was loosely based on David Gelernter's Linda data model. About 5 years ago, at Intelligenesis, we looked seriously at Jini-type models for distributed processing. Anyway, I noticed that the new Jini starter kit is now Apache licensed, which should allow what is a great technology (for some distributed data problems :-) to be used easily by commercial application developers. Worth looking at if you have never experimented with Linda, JavaSpaces, or Jini before.
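
For anyone who has not seen the Linda model, here is a minimal single-JVM sketch of a tuple space. It is not the JavaSpaces or Jini API: the class TupleSpace and its methods are hypothetical, tuples are plain Object arrays, and a null field in a template acts as a wildcard.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical in-memory tuple space, for illustration only.
final class TupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    // Linda "out": add a tuple and wake up any thread waiting in take().
    synchronized void write(Object... tuple) {
        tuples.add(tuple.clone());
        notifyAll();
    }

    // Linda "in": block until some tuple matches the template, then
    // remove it from the space and return it.
    synchronized Object[] take(Object... template) {
        while (true) {
            Iterator<Object[]> it = tuples.iterator();
            while (it.hasNext()) {
                Object[] tuple = it.next();
                if (matches(template, tuple)) {
                    it.remove();
                    return tuple;
                }
            }
            try {
                wait(); // releases the lock until another thread writes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for a tuple", e);
            }
        }
    }

    synchronized int size() {
        return tuples.size();
    }

    // A template matches when lengths agree and every non-null field is equal.
    private static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            if (template[i] != null && !template[i].equals(tuple[i])) {
                return false;
            }
        }
        return true;
    }
}
```

Worker threads would loop on `take("job", null)` and publish answers with `write("result", ...)`. Producers and consumers never reference each other directly, only the shared space, which is the appeal of the model for some distributed data problems.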

Code generation wizards: Ruby on Rails vs. Microsoft (and dynamic programming languages)

Please excuse me in advance if this is not a fair comparison: the last time that I used Microsoft VisualStudio code generating wizards was about 8 years ago at Angel Studio where I wrote a 'network play' library for multi-user PC racing games. The generated code was so obtuse that no human being could really understand it (well, I tried). I ended up writing my own equivalent library using UDP (using the same API so we could build against either library). Anyway, that gave me a feeling of bad aji (Japanese word for taste - used in playing Go and refers to good or bad shape of stones played - good aji or bad aji). Needless to say, VisualStudio's code generator generated code with bad aji! Ruby on Rails is really three things: a code generating wizard, utility class libraries, and a runtime component. The RoR code generator produces easy-to-read Ruby code and .rhtml templates (look like JSPs except for embedded Ruby code rather than Java code). The generated code is so reada

My notes on using RadRails for Eclipse + Ruby on Rails development

I have not seen (yet) very much online documentation for RadRails so it is worth noting a few things that I do:

1. Start by installing (as needed) Eclipse, RDT, and RadRails.
2. Use RadRails to set up a new blank Ruby on Rails project.
3. In Eclipse, make sure that you switch to the "Rails perspective" (instead of the default "Java perspective").
4. Design the database schema for your application and be sure to follow the RoR table naming conventions (e.g., a table named "users" for an automatically generated model for class "User", table "doc_acls" for model "DocAcl", etc.). Rails will want 3 databases with names ending in _development, _test, and _production.
5. Create your development database with your application tables. Then, in Eclipse/RadRails, edit in the database information.
6. Outside of Eclipse, generate models and scaffolds; for example: "ruby script/generate scaffold User Admin", "ruby script/generate model Group"
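
The naming conventions in the steps above can be sketched in a few lines. This is a deliberately naive illustration written in Java to match the other examples here; it is not Rails code. The class RailsNames is hypothetical, and the real Rails inflector also handles irregular plurals (for example, model "Person" maps to table "people"), which this sketch does not.

```java
// Hypothetical helper showing the convention only; not how Rails implements it.
final class RailsNames {
    // Model class name -> table name, e.g. "DocAcl" -> "doc_acls".
    // Naive rule: underscore before interior capitals, lower-case, append "s".
    static String tableNameFor(String className) {
        StringBuilder name = new StringBuilder();
        for (int i = 0; i < className.length(); i++) {
            char c = className.charAt(i);
            if (Character.isUpperCase(c) && i > 0) {
                name.append('_');
            }
            name.append(Character.toLowerCase(c));
        }
        return name.append('s').toString();
    }

    // The three database names Rails expects for an application.
    static String[] databaseNamesFor(String application) {
        return new String[] {
            application + "_development",
            application + "_test",
            application + "_production"
        };
    }
}
```

With this sketch, `RailsNames.tableNameFor("User")` gives "users" and `RailsNames.tableNameFor("DocAcl")` gives "doc_acls", matching the examples in step 4.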

Our Java portal for recipes is almost feature complete

I am still working on the AI agent for custom recipe creation at our Java portal for recipes but otherwise the site is just about feature complete. Our site has a unique feature: users create free login accounts and then they can maintain a private database of ingredients on hand. They can then search for recipes and optionally specify that only recipes for which they have all of the ingredients will be shown. The AI agent functionality is (or will be :-) also cool: the user can specify which ingredient that they have on hand that they would most like to use up. They also specify a cooking style (e.g., American comfort food, Japanese, etc.) and how spicy they want it. The AI agent will then search for recipes that use the specified ingredient and for which the user has many of the ingredients - then the recipe is customized for cooking style and for only using ingredients in the user's "on hand database".
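
The two search modes described above (strict "only what I can make" and the agent's looser "uses this ingredient and I have most of the rest") can be sketched as set operations. This is only a guess at the shape of the logic, not the site's code: the class RecipeSearch, its methods, and the minCoverage threshold are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; recipes are modeled as name -> set of ingredient names.
final class RecipeSearch {
    // Strict search: true only when every ingredient is on hand.
    static boolean canMake(Set<String> ingredients, Set<String> onHand) {
        return onHand.containsAll(ingredients);
    }

    // Fraction of a recipe's ingredients that the user already has.
    static double coverage(Set<String> ingredients, Set<String> onHand) {
        if (ingredients.isEmpty()) {
            return 1.0;
        }
        int have = 0;
        for (String ingredient : ingredients) {
            if (onHand.contains(ingredient)) {
                have++;
            }
        }
        return (double) have / ingredients.size();
    }

    // Agent-style search: recipes that use the ingredient the user wants to
    // use up and for which at least minCoverage of the ingredients are on hand.
    static List<String> candidates(Map<String, Set<String>> recipes, Set<String> onHand,
                                   String mustUse, double minCoverage) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Set<String>> recipe : recipes.entrySet()) {
            Set<String> ingredients = recipe.getValue();
            if (ingredients.contains(mustUse)
                    && coverage(ingredients, onHand) >= minCoverage) {
                names.add(recipe.getKey());
            }
        }
        return names;
    }
}
```

The later customization step (adjusting for cooking style and substituting missing ingredients) is where the actual AI work would live and is not attempted here.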

Sharing and the gift economy

Carol and I decided to license the recipes on our new healthy foods and recipes web portal under a Creative Commons license that allows people to re-publish our stuff with attribution and a reference to our web site. For us, this makes sense: we want people to use our web portal and sharing recipes is one way to promote it. I do the same thing with my free web books and open source software projects: by sharing I get good feedback and improvements from many people and because hundreds of other web sites reference my main web site, when anyone searches for a "Java consultant", there I am :-) I frequently push back a little with some of my customers who want to keep everything they do closed and proprietary. Unless you are helping a competitor put you out of business, sharing has many rewards.

Nice business model: Intergalactic Medicine Show

I saw this on Slashdot this morning and signed up for the first issue. Well worth $2.50 for 2 reasons: the SciFi short stories (so far) are good and I like to spend small amounts of money to support business models that I like (usually decentralized, small businesses that make good use of the web). It looks like Orson Scott Card is providing a good venue for SciFi writers to get published. I especially like SciFi short stories because you can take a short break and enjoy an entire story at once.

Really bad decision over NASA's funding

I think that it was the current science-ignoring administration and Congress that made the really bad call to reduce funding for unmanned robotic space exploration because of an emphasis on entertainment over science. Let's be clear on one thing: we get far more scientific value per dollar by sending huge numbers of tiny devices throughout our solar system (and beyond) than by putting humans just outside the Earth's gravitational well. I believe that unmanned missions have historically cost just a few percent of what manned missions cost. If it were my call, I would fund both manned and unmanned exploration, but definitely not cut funding for the high value, relatively low cost unmanned missions. Absolute stupidity.

Easy update of Ubuntu Linux to new release

Why can't Microsoft make upgrades this easy? A few caveats: Ubuntu is not officially releasing "Breezy" until tomorrow, so I did this on my laptop (which is not my main Linux development system). In the Synaptic package manager, under Settings -> Repository, I manually edited my repositories, changing all occurrences of "hoary" to "breezy", and I removed the install CDROM as a repository source. I then clicked the "Mark All Upgrades" taskbar icon and then clicked "Apply" - when asked, I chose the "Smart Mode" upgrade that apparently is meant for upgrading to new releases. One particularly great thing: under "hoary", I had to build and install my own driver for the RT2500 wifi device in my laptop and manually start it. After the upgrade, wireless is on with no manual operations. Note that with the RT2500, when booting Windows XP, I have to manually start wireless. I am having zero problems with my laptop since up

Ruby tools improving

Anyone who reads my blog knows that I love to write Java code using IntelliJ. As much as I prefer the Ruby language to Java, better IDEs and more available infrastructure software keep me (for now) firmly in the Java development camp. However, Ruby tools are definitely improving. The Ruby Development Tools for Eclipse keep getting better and the new RAD IDE for developing Ruby on Rails applications (RadRails) is looking very promising. Besides congratulating the RDT and RadRails developers for some cool work, I actually have a point to make: In the new IT world, developers concentrate on providing services to users that are based on centralized data stores like existing relational databases, generated data from web services (via XML-RPC, REST, SOAP, etc.), semantic data (RDF, OWL, etc.) repositories, document stores (e.g., Microsoft's SharePoint, WebDAV repositories of documents), etc. Choice of programming language and runtime platform seems to be less important

SOA and

While reading an interview with OOo developer Florian Reuter I kept thinking of two very different possible future IT worlds:

1. Microsoft continues to dominate the market and business processes continue to be hobbled by closed (or at least opaque) file formats.
2. The Open Document standard becomes a (close to) universal standard with many interoperating software systems that work nicely with each other to facilitate "knowledge flow". People and organizations get the maximum benefit from their data and information assets for the lowest cost.

I really wish that Microsoft would get on board with the Open Document specification. As I have written recently, I believe that Microsoft's best strategy is to switch to a subscription based licensing for Windows and Office so that they get a yearly fee from users - this frees them from having to add features that few people need and concentrate on quality. If they play fair in supporting the Open Document standard, they will be in a goo

User's perspective on web services like and GMail: running stateless

Sometimes I am questioned on my preference for using web applications over local application programs. For me, it is largely about "running stateless": any slight loss of efficiency over using local applications and data is more than compensated by having all of my stuff available no matter where I am or which computer I am using. Of course, I am not really running stateless, but someone else is maintaining state for me. An alternative is keeping all work on a networked file server but then there is the problem of having the right application handy. Using XWindows is also a good alternative: I remember in the mid 1980s having to do a lot of work on servers located in Norway while my office was in La Jolla, California. Back then my internet connection was a few hops between satellites and ground stations so latency was really bad: ground/sea based fiber is so much better, the speed of light being limiting after all :-) For my own use, I am experimenting with a web application that I

Writing my own message board for the second time in 2 years

No one likes to reuse code more than I do. The first thing that I look for is good quality GPL (when applicable) or BSD/Apache type licensed code to reuse rather than "reinventing the wheel". My wife asked me to add a message board to our recipe web portal so that people can comment on recipes and suggest improvements for the system. There are several good open source Java JSP/servlet based message board projects but I decided to write my own from scratch. Why? Good question! In our case, the message board needs to be tightly integrated with existing user accounts, knowledge about recipes (so that comments can optionally be linked to specific recipes), etc. I did spend time looking at the source code to existing systems, but decided I wanted a custom solution that is tightly integrated with both the relational database for the web portal and the software that I have already written for it. The same thing happened a few years ago. I was writing a "SharePoi

Oracle buying InnoDB owner

I have always preferred PostgreSQL over MySQL - but, I usually end up using MySQL because it is installed and configured on servers and virtual servers that I rent on a monthly basis for my customers' and my web portals. MySQL Corporation's contract with InnoDB's owners is up next year, but Oracle plans to renew some agreement - it will be interesting to see how the use of MySQL in non-commercial licensed environments holds up. I am thinking, but have not made a firm decision, of switching over to PostgreSQL for all deployments. This will add to the (now small) overhead for renting a server and deploying a web application. I might also start to favor hosting companies who provide pre-installed PostgreSQL.

I am back from vacation

Carol and I went on a Mexican Riviera cruise with my parents, brother, and sister-in-law. Carol and I went white water rafting east of Acapulco, rode zip lines 100-150 feet above the ground in a rain forest canopy east of Puerto Vallarta, and went kayaking and hiking on Deer Island (off shore from Mazatlan). Otherwise, I just enjoyed some Mexican beer, tried to avoid over-eating (not easy to do on a cruise ship), enjoyed walking around Mexican port cities, and kicked back. I have been working really long hours for about 8 months straight, so the time off was great.