Posts

Showing posts from January, 2005

Patent threats to US IT industry and my ability to earn a living

I am getting ready to write a letter to my two Senators and Congressman (from Arizona), so this blog entry is a "dry run": As a technologist living in the US, I welcome global competition. Although it is a little rough competing with people who live in low cost of living countries like China and India, I am up for that challenge. There is another threat to my ability to earn a living (long term) that does concern me: Ultimately, open source software is the future and in the US the writing is on the wall: corporations that see open source as a threat will try to pay off Congress to pass legislation that removes people's freedom of choice between commercial and open source software. They will do this with attempts to pass discriminatory legislation, threats of patent based lawsuits, and eventual litigation. I recognize and support the rights of individuals and corporations with regard to proprietary property, but the lax issuance of software patents has been unfortunate

Using WingIDE for a GPLed project; Java vs. Python IDEs

A few weeks ago I was complaining how mediocre Python IDEs were compared with Java IDEs like IntelliJ, Eclipse, and NetBeans. I take that back: I am working on a GPLed project in Python in my "spare time" so I bought a personal Wing IDE license for use just on this project. The more I use WingIDE, the more I like it. I was wrong in my previous post.

Ubuntu Linux: job well done

After reading about Ubuntu Linux on Slashdot yesterday, I decided to burn a Live CD and try it on my dual G4 Mac desktop - very nice! For many years, KDE has been my Linux desktop/window manager - but, after seeing Gnome on Ubuntu (very well configured!), I may change my mind. Don't get me wrong: I think that Mac OS X is the most productive operating system for my work (software design and implementation, writing). That said, I find it exciting that Free Software alternatives are getting so good. Except for being able to edit video movies, Linux comes close to OS X for just about everything that I do. Since the final target/deployment platform for just about everything that I do is a Linux server, I might decide to do more development under Linux in the future (back to the future - I did almost all of my development on Linux up until a few years ago when someone gave me a Mac OS X box). Anyway, the Ubuntu Linux team did a fantastic job getting a good looking (great fonts!) c

good read: "Free Software Magazine"

I was looking through the Free Software Foundation web site and saw a reference to a new magazine: Free Software Magazine. The first issue, available as a PDF download, looks great. My favorite article is "Motivation and Value of free resources", but other articles are also interesting. The editor wrote another interesting article on the technologies used to produce the magazine (RTF -> OpenOffice.org XML -> their XML format -> output using XSL). Anyway, lots of good stuff - check it out.

Love BitTorrent for Linux distros; why not for 'Indy' TV productions?

BitTorrent is the greatest technology. Although I have SuSE Linux on my development Linux server, I wanted to install Debian on another (virtual (*)) Linux server. The BitTorrent links on the www.debian.org site make it simple to download ISO CDROM images and the BitTorrent technology does not eat up Debian's bandwidth. Way nicer than using FTP or HTTP. I wrote about this a while ago: I don't see why TV networks don't distribute popular TV shows using BitTorrent technology, perhaps in a low-res 320x200 pixel MPEG format - a little lower resolution than TV, but very watchable. Leave the commercials in to pay for production costs. I think that it would be great to see a business become a low cost 'Internet only' TV network. It could start a whole new thing: "Indy" TV shows in addition to "Indy" films. Anyway, I think that it is a cool idea. (*) using VirtualPC on my Mac OS X system

I wish I had written this :-)

Here is something really well written and accurate: a realistic assessment of the weaknesses and strengths of the US economy. Of course, no one can predict the future. Still, it is possible to understand some of the factors that are a threat to our economy and what we have going for us. The figures on personal debt (and the linked graph) are surprisingly bad. Almost every smart person who I talk to about the economy agrees that eliminating your personal debt right now is most important; if you have large personal debt, simply stop spending money for a while on non-essentials. As individuals, we have no control over the trade imbalance, when the world will switch off of a gold standard, the effect of research/development/creation of new industries/etc. offshore instead of in the US, investment in India and China being more attractive than investment inside the US, etc. But, we do have control (more or less) over our own lives :-) What is needed, I believe, is a longer term view. Fa

Disappointed with Sun's terms on 'patent sharing' with open source

Sure, a good looking press report, but Sun is only doing the 'patent sharing' with projects licensed under their own CDDL license. CDDL is the Mozilla license with one important thing removed: a clause permitting re-license under accepted open source licenses. As many people pointed out on Slashdot this morning: these patents can not be used in Linux. More interesting news: I read this morning that IBM is likely to release a Linux license compatible Java runtime sooner rather than later. This is really big because any Linux distro can then support the Java platform 'out of the box'. Cool! I own both Sun and IBM stock, but I must say that I find IBM's management decisions in the last few years to be more to my liking than Sun's!

Easy GNU gcj Java native compiler install on OS X

Tired of painful gcc/gcj builds on Mac OS X in order to get native Java compilation? Here is a Sourceforge project for high performance computing tools for OS X. Gaurav Khanna (Physics Department, University of Massachusetts at Dartmouth) has pre-built gcj. Here is a link to a pre-built kit. Assuming that your web browser unGZIPed the file to gcj-bin.tar, just type:

    sudo tar -xvf gcj-bin.tar -C /

to install in /usr/local. You can compile and run like this:

    gcj --main=Test -o Test Test.java
    ./Test

For most purposes, stick with Sun's JDK and runtime kit - however, there are great reasons to also be able to generate native Java executables; for example: faster startup for small programs, testing development targeted at Linux for GPLed projects where you do not want to use Sun's non Free Software kit, etc.
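The gcj command above expects a Test.java with a main method. Here is a minimal sketch of such a file. The class body is a hypothetical example (it is not part of the pre-built kit), and it deliberately sticks to pre-Java 5 language features, on the assumption that gcj builds of this period do not handle generics or the new for loop.

```java
// Test.java: hypothetical example file for the gcj command shown above.
// Uses only pre-Java 5 language features (no generics, no for-each loop).
public class Test {

    // Sums the integers 1..n with a plain counting loop.
    public static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Hello from a natively compiled Java program");
        System.out.println("Sum of 1..10 = " + sumTo(10));
    }
}
```

The same file should also compile unchanged with javac, which makes it easy to compare startup time between the native executable and the JVM.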

New paradigms for web app development

I have been talking about J2EE, Python and CherryPy, and about Ruby on Rails a lot lately. Although I still prefer Java for large (well funded :-) web applications, the advantages of dynamic languages for database table wrapper generation, continuation based navigation (e.g., Smalltalk Seaside, SISC web continuation examples, etc.) are compelling. I suspect that in a few years, when the infrastructure for Python 'catches up' with that for Java, I will end up doing 60% of my development in Python and 20% in Java - the opposite from my current language choice. Still, I think that the reliability and scalability of the J2EE platform will make J2EE the "COBOL of the 21st century": there will be more agile tools available, but J2EE will be a platform of choice for large web applications that provide web services, talk to web browsers, PDAs, cell phones, etc. Scripting languages like Groovy might help make Java development more facile in the future, but I am taking a wa

Open publishing of scientific research papers

SPARC is a group of universities and publishers trying to provide free access to published journals. I noticed this morning that The Journal of Machine Learning Research supports SPARC and provides PDF copies of journal articles on their web site. Much appreciated!

Solid J2EE applications

I have been writing lately about J2EE alternatives (*) for building web applications much faster. The great thing about J2EE web applications is that in my experience, once a project freezes requirements and bugs are fixed, J2EE web apps often run forever without any intervention except for doing backups (**). Combined with cheap Linux dedicated servers (I sometimes use $50/month servers - sweet price!), the cost of running web applications does not have to be prohibitive. Another great way to keep costs down is up front: try to develop as little new code as possible while leveraging open source or commercial libraries and frameworks. (*) CherryPy is a pythonic, object-oriented web development framework that seems just right to me for publishing behavior of Python objects as web services and web forms. Ruby on Rails is a very cool framework for quickly building database centric web sites. RoR's wrapping of database tables as objects is just like the way I use a generator in

OS X + Linux: perfect combo! Also: request for advice on SuSE Linux dedicated server providers

First: if anyone has any info on good dedicated server hosting providers that use SuSE Linux, please email me at: markw at markwatson . com -Thanks!! I am finding that the combination of my Mac OS X desktop (and iBook :-) running an X11 display with SuSE Linux for servers on my own network is just about perfect: I prefer SuSE's Yast administration tools, and keeping Yast open as required for each server with the display set to my Mac works very well. For customer installations, I usually use rented servers at 1and1.com and serverbeach.com but they only provide Red Hat Linux based servers. Sometimes, I like to do development under Linux in addition to OS X (for a few different reasons) and here again, OS X's X11 support makes it pain-free to develop on Linux: for example: great looking Mac fonts, no need to use the Linux server's display (actually, I usually run Linux boxes headless around here :-), great for offloading long builds, etc. from my desktop OS X box

Tomcat for Java, CherryPy for Python web apps?

I have been checking out CherryPy as a roughly "Tomcat equivalent" framework for building dynamic web apps with Python. Most of my consulting business is based on building web applications written in Java that use Tomcat for a servlet/JSP container, containing background work threads, etc. In the last year I have been using Python more and more (especially since I spent a day at Google last March and almost every engineer I spoke with raved about Python). I still think that Java+Tomcat or Java+full J2EE stack is best for most heavy weight web apps. However, I am finding Python to be a compelling language for small and medium size projects - higher programmer productivity than Java (perhaps even a more productive language than Common Lisp). I am hoping that CherryPy will provide a platform for building small and medium scale web apps very quickly.

I have finally switched to Java 5 language features for new development

Not for consulting work (yet), but for my own development for new code, I am starting to use the new JDK 1.5 (or 5) language features. Since I run under Mac OS X, you might think that this is a problem, but I have a Linux box on my home office network and I simply use X Windows on my Macs. One of the things that I have always loved about Common Lisp and Python is how concise code can be; the new Java language features help a lot. For a contrived example:

    import static java.lang.System.out;
    import java.util.*;

    public class test {
        public static void main(String[] args) {
            out.println("JDK 5 new language features\n");
            List<String> strs = new ArrayList<String>();
            strs.add("cat");
            strs.add("dog");
            for (String s : strs) out.println(" " + s);
        }
    }

Cool, but this is still pretty weak compared to Python's list comprehensions, etc.
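To make the comparison with list comprehensions concrete, here is a small sketch of what a one-line Python comprehension turns into with the JDK 5 features. The class and method names are made up for illustration, and the Python line in the comment is only there for comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class: contrasts a JDK 5 loop with a Python list comprehension.
public class ComprehensionDemo {

    // JDK 5 equivalent of the Python expression
    //   [s.upper() for s in strs if len(s) == 3]
    public static List<String> upperThreeLetterWords(List<String> strs) {
        List<String> result = new ArrayList<String>();
        for (String s : strs) {           // new for-each loop
            if (s.length() == 3) {        // the comprehension's "if" filter
                result.add(s.toUpperCase());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> strs = Arrays.asList("cat", "dog", "horse");
        System.out.println(upperThreeLetterWords(strs));
    }
}
```

Generics and the for-each loop remove the casts and the explicit iterator, but the filter and the result list still have to be spelled out by hand.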

Why don't more customers want to save money with Open Source development?

The desire to totally own something can be expensive. I offer customers a 33% discount when they hire me for either Free Software (GPL) or Open Source (MIT, BSD, Apache, etc.) licensed projects. In addition to that savings, development time on GPL projects is almost always shorter because existing GPL software can be used. Also, with open source projects there is always at least some chance that other users will contribute useful code back to you (*). I have talked to several of my customers at length on this issue. Most simply want to totally own the rights to what they pay for - no argument from me if that is what people want. Still, I think that many customers overrate the value of keeping software that they fund proprietary: it is a "big world" out there and any small company is likely to have a huge number of competitors in their product and/or service space. For some projects, it makes sense to me to cut infrastructure software costs "to the bone", go open

Has Novell screwed up SuSE Linux?

OK, I am irritated: I installed SuSE 9.1 Personal a while back - looked good at the time. Today I needed to test Python stuff for Zope/Plone under Linux: I discovered that SuSE 9.1 Personal does not include the GCC C Compiler! I am hunting around the web for SuSE 9.1 RPMs for GCC -- what a hassle! I can not believe that any Linux distribution would not include a C compiler. I think that I am going to just wipe 9.1 and install from my old 8.1 personal CDROM. ... or, maybe invest an hour and do a Debian install. PS. I ended up just burning a CDR containing RPMs from a SuSE mirror FTP site for gcc, gcc libs, and emacs - and stored this CDR with my SuSE personal install CD. I am still a fan of SuSE, but I still find it annoying to not include a C compiler with even a trimmed down Linux distribution.

The best Java CMS system

I wrote about the Daisy CMS system last year and I have spent more time with it recently. Daisy is released under the Apache 2 license and was written by Outerthought (under funding from Schaubroeck). Daisy is a bit of a rough install (the first time I installed it, it took about 30 minutes to set up the MySQL tables and get all three server processes running; each runs in its own JVM). One of these processes is OpenJMS. I want to re-work Daisy so that it runs under JBoss with minimum installation hassles (e.g., use the built in JBoss Jorma JMS service and perhaps even use HSQL by default with automatic table generation). I would like to extend Daisy (still under the Apache 2 license(*)) to be a snap to install and run with JBoss. Daisy is cool! (*) in the future I would like to also add my KBtextmaster technology and make a separate commercial product that is a low cost turn key complete document management system.

Updated my white paper "Jumpstarting the Semantic Web"

I added a new section to the end and fixed a few small other details. If you are interested, you can download it using this link . This is a paper that could be described as "what I would like customers to pay me for working on" :-)

Interesting stuff on Semantic Web; my idea that I am going to hash out

Danny Ayers links to Peter Norvig's comments on the Semantic Web. Both Danny's and Peter's comments are well worth reading. Later today(*) I plan on updating my "Jump-starting the Semantic Web" white paper, but here are the basic ideas I want to explore: I have been thinking about an idea mentioned in Antoniou's and van Harmelen's "Semantic Web Primer" (a nice book, BTW): use ontologies like (in an object oriented sense) an interface to information. Good Java style calls for coding to interfaces so underlying implementation classes can be modified, swapped for other classes implementing the same interface, etc. Consider a similar strategy for dealing with mass amounts of information that uses different vocabularies, formats, etc.:

1. create an ontology for information that is high value to your business
2. identify sources of raw data for the topics covered in this ontology
3. write custom connectors to the raw data sources that convert vocabula
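The "ontology as interface" analogy can be sketched in plain Java. Everything in this sketch is hypothetical: the CompanyInfo vocabulary, the two connectors, and the raw field names are invented for illustration and do not come from the white paper or the book. The point is only that callers depend on the shared vocabulary while each connector hides one source's own vocabulary.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: all class names and field keys are made up for illustration.
public class OntologyInterfaceSketch {

    // Shared vocabulary (standing in for the "ontology") that callers code to.
    public static final class CompanyInfo {
        public final String name;
        public final String headquarters;

        public CompanyInfo(String name, String headquarters) {
            this.name = name;
            this.headquarters = headquarters;
        }
    }

    // Each raw data source gets a connector that maps its own vocabulary
    // onto the shared one, so sources can be swapped without touching callers.
    public interface Connector {
        CompanyInfo convert(Map<String, String> rawRecord);
    }

    // Imagined news feed that uses the keys "company" and "hq_city".
    public static class NewsFeedConnector implements Connector {
        public CompanyInfo convert(Map<String, String> rawRecord) {
            return new CompanyInfo(rawRecord.get("company"), rawRecord.get("hq_city"));
        }
    }

    // Imagined business registry that uses the keys "org_name" and "location".
    public static class RegistryConnector implements Connector {
        public CompanyInfo convert(Map<String, String> rawRecord) {
            return new CompanyInfo(rawRecord.get("org_name"), rawRecord.get("location"));
        }
    }

    public static void main(String[] args) {
        Map<String, String> feedRecord = new HashMap<String, String>();
        feedRecord.put("company", "Example Corp");
        feedRecord.put("hq_city", "Sedona");

        Map<String, String> registryRecord = new HashMap<String, String>();
        registryRecord.put("org_name", "Example Corp");
        registryRecord.put("location", "Sedona");

        // Both sources end up in the same vocabulary.
        CompanyInfo a = new NewsFeedConnector().convert(feedRecord);
        CompanyInfo b = new RegistryConnector().convert(registryRecord);
        System.out.println(a.name + " / " + a.headquarters);
        System.out.println(b.name + " / " + b.headquarters);
    }
}
```

A real system would presumably express the shared vocabulary in RDF or OWL rather than a Java class; the Java version is just the object oriented half of the analogy.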

US food imports at an all time high? WTF!

From the Washington Post (AP): "The U.S. trade deficit hit an all-time high of $60.3 billion in November as American appetites for foreign oil and even imported food reached record levels." I thought that we were a nation with lots of agricultural production capacity. Don't like what is going on? Contact your representatives in Congress! These are the people getting mega-bucks from special interests! Push back a little and ask them if they would like to be re-elected. I will let you in on something that is not quite a secret, but is still little known: Clinton, Bush, and a well paid (by special interests) Congress passed laws making it really easy for corporations to register off shore and avoid a lot of their US tax burden. Now the deficits are going through the roof, the rich and corporations easily move assets out of the US to avoid the looming economic crash , and regular folks are going to get screwed. I just thought that you might like to know wh

Updated version of "The Software Development Book for Java Developers" available on my web site

You can grab a PDF file of the new version on the Free Web Books page of my www.markwatson.com web site . I have made major changes to my Free Web Book "The Software Development Book for Java Developers" today. This book is still incomplete, but I did work on it for 3 hours today. Now the chapter layout is "cast in stone" and the book itself is about 60% complete. Enjoy! I have good intentions of finishing this book in the next month or two, so you might want to wait.

Source code for demos; future commercial and Free Software projects

If anyone is interested, here is a ZIP file with User Guide and source code for demos for my KBtextmaster product. The Java code base is frozen, but I will not start selling the product for another 3 or 4 days (extra testing!). Anyway, this has been a major amount of work for me and I am looking forward to finishing up and starting development on my next planned product. My plans for the next year are:

1. Port Apache Daisy content management to run under JBoss (use built-in JMS instead of OpenJMS, combine deployment of DaisyWiki and DaisyRepositoryServer in easy to install EAR file). I will add lots of the functionality of my KBtextmaster product for clustering, adding semantic information to the Lucene based Daisy search engine, etc. The idea is to have one very low cost (I am thinking about $50) commercial product that makes it easy to set up and administer a document management and storage portal.

2. GPL Free Software project PyTextMaster: I want to take a good part of

IBM, patents, open source

Nice discussion of this over on Slashdot today. This is a super-smart move on IBM's part. Their license promises some pain to companies who try to sue authors of any open source projects for patent infringement. Good move. I believe in the utility of commercial, Open Source (Apache, BSD, etc), and Free Software (GPL) - all useful licensing options for different purposes. Of course the best thing would be to not allow software patents, but I think that is a lost cause. I think that if market forces are allowed to naturally work (i.e., no corruption involved in passing laws that are good for the few and bad for the many), then in 20, 30, 100 (whatever) years we will see a progression to most software being either open source or small custom projects layered on open source.

API Javadocs for my KBtextmaster product

I finished the port to Java from Common Lisp of version 2 of my KBtextmaster product and just published the public API here . Tomorrow I will also publish the user's guide and a ZIP file containing the source code to the demo programs (e.g., a service wrapper for using KBtextmaster to index and cluster documents for a web application, a sample search and clustered document client, and several tiny examples for using the basic natural language processing APIs). KBtextmaster also provides pure Java solutions for reading Word, Powerpoint, PDF, OpenOffice.org, AbiWord, etc. files. I hope to finish final testing and start selling version 2 in 4 or 5 days. Version 1 was written several years ago in Common Lisp. I have been working on version 2 since early 2004, so this will be a major upgrade. I have a PDF file for a color product brochure here .

Legal uses for BitTorrent; suggestions for TV networks

There are already compelling legal uses for file sharing using BitTorrent; for example: distributing Linux ISO images. There is another use that makes all the sense in the world to me: I would watch more network TV if I could watch shows when I wanted. Why don't TV networks submit lower resolution copies WITH COMMERCIALS of their shows as torrents? Clive Thompson makes this great point in his article in Wired (January 2005): any media with embedded commercials can be distributed cheaply using torrents. Thompson argues, probably correctly, that distributing via torrents does not work for the movie industry. However, how about ramping up product placements and perhaps tossing in a few commercials? TV and movie distribution would need something like the Nielsen Ratings for TV in order to get proper credit from advertisers. iFilm.com seems to do alright streaming video with commercials before the video clips. This may not be a popular opinion, but I wish people "il

Web app development work: plain Tomcat vs. JBoss?

I just about totally base my Java consulting business on Tomcat (almost all of my work is in developing web applications). When I need a J2EE component (e.g., JMS) I use an external high quality open source package (e.g., JOnAS JMS). Instead of EJBs, I simply write and debug "middle-ware" code for accessing data sources, business logic, etc. After it is working, then I do the presentation layer with JSPs and Struts (old fashioned stuff :-) Even though it is not as light weight as plain Tomcat, I am considering switching over to using JBoss 4.01 as my standard development and deployment platform and always have the full J2EE stack available. I have not personally used JBoss clustering yet, but from what I hear it is stable. I have never been a huge fan of EJBs, but annotation tools like XDoclet make using EJBs fairly simple. Anyway, development and deployment computers are so much faster than they were back when I started doing server side Java development that I might as

Love PayPal - perfect tool for consultants and 'small time' software vendors

Every so often I see people on Slashdot bashing PayPal but in my opinion they are simply not using the service correctly. To receive payments I recommend getting a separate savings account at your bank and only using this account for PayPal. Also, I tend to not let more than a few hundred dollars sit in PayPal before I take a minute and have the funds transferred to my savings account. PayPal is also great for contributing a few dollars to both people and organizations that you might want to encourage to keep working on projects that you find worthwhile. (I thought about this today when the Wikipedia servers were running slowly - it felt good to take one minute and send them a little money. Why not pay for my use of their servers?)

Great source for free online AI research papers

In the last year, I have been enjoying access to papers on the ACM library portal - costs some money but well worth it. I just saw a link on comp.ai to Constraint Programming Online . This site has a link to the "Theory and Practice of Logic Programming" site that has many online papers as PDF and Postscript.

Admitting bad habits

I have a fascination for just about everything. I realize this and try to only hit the link to Wikipedia's random page view 5 or 10 times a day :-) I also have a practical interest in Wikipedia: currently I use the huge manually annotated Reuters news story corpus for machine learning runs to build categorizers. I want to try using the Wikipedia entries also.

Something really nice

Today has been a grind working on documentation and demo programs for my Java KBtextmaster product. A friend just sent me a link to something really nice that was a good break from work. This is a short web movie that reminds me that most people really are good at heart and that all people are connected by shared experiences. I really recommend spending 2 minutes and watching this (it is a great antidote to hearing Ann Coulter wish that Tim McVeigh had visited the New York Times offices and hint that perhaps liberals in the USA should be killed(*)). Some people would like us to forget what all people have in common; you know that I am talking about the "either you are with us or against us" crowd. I believe that we can have a strong defense and protect our country and our way of life without embracing hate and promoting a feeling of separation from people who are different from us. It is time to turn away from darkness and dark thoughts and celebrate the ties th

Java trick for reading OpenOffice.org files

I have read a few times about people having problems reading the ZIPed XML files in OpenOffice.org documents. The problem is SAX parsers not being able to locate a local copy of the office.dtd file. I have been using a kluge to get around this problem for a long time and have not had any problems with it: When reading the input stream from the ZIP file entry labeled "content.xml", skip past the second ">" character:

    // zf is the open ZipFile, zipEntry is its "content.xml" entry
    InputSource is = new InputSource(zf.getInputStream(zipEntry));
    InputStream r = is.getByteStream();
    // skip the XML declaration and the DOCTYPE (each ends with '>')
    for (int i=0, count = 0; i<500; i++) {
        if ((char)r.read() == '>') count++;
        if (count > 1) break;
    }
    SAXParser p = saxFactory.newSAXParser();
    p.parse(r, new OpenOffice.OpenOfficeSaxHandler());

Hopefully in the future people having this problem will find this post when doing a web search and save themselves a little time. Another good alternative is to make office.dtd available on your system and put it on your class
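For comparison, here is a sketch of a different way around the same missing DTD problem. It is not the kluge above: it relies on the standard SAX EntityResolver hook (DefaultHandler.resolveEntity) and hands the parser an empty document whenever it asks for an external entity such as office.dtd. The class name, the element counting handler, and the sample XML string are all made up for illustration; a real reader would pass the "content.xml" stream from the ZIP file instead.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: parses XML that names an unavailable DTD.
public class SkipDtdDemo {

    // Counts start tags and answers every external entity request
    // (for example office.dtd) with empty content.
    static class CountingHandler extends DefaultHandler {
        int elements = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            elements++;
        }

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            // An empty source stops the parser from looking for the DTD file.
            return new InputSource(new StringReader(""));
        }
    }

    // Returns the number of elements in the given XML text.
    public static int countElements(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            CountingHandler handler = new CountingHandler();
            parser.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")), handler);
            return handler.elements;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document that references a DTD which is not present.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<!DOCTYPE doc PUBLIC \"-//Example//DTD Doc//EN\" \"office.dtd\">"
            + "<doc><p>hello</p><p>world</p></doc>";
        System.out.println("elements: " + countElements(xml));
    }
}
```

The trade-off is that any entities or default attribute values defined in the real DTD are lost, which is also true of skipping the DOCTYPE by hand.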

Perhaps it is time to sell the last of my Microsoft stock

I saw an interview with Bill Gates linked from Slashdot. Yes, Gates really does compare Free Software with Communism. I suppose that some non-thinkers might swallow this bullshit. As someone pointed out on Slashdot, Free Software and Open Source software are more akin to a community bake sale at a church: people cooperating for local benefit. Gates' argument (as the same poster said on Slashdot) is like: protect the property rights of restaurants by banning church bake sales -- a great way to argue the point and refute Gates' lame argument. I differ slightly with people like Richard Stallman at the FSF because I think that commercial, Open Source (BSD, Apache, etc. licenses), and Free Software (GPL licensed) all have a place in the IT ecosphere. Still, Stallman makes more sense to me than Gates. I need sales from my commercial software products to augment my consulting income but I am always trying to find ways to also justify Free Software projects when they make sense. I a

Having fun in San Diego

I am in San Diego seeing family and friends. Sunday was great: I took my two young grandchildren to Legoland for the day. I was having breakfast with an old friend (now a Yahoo search guru) in Del Mar this morning and I also ran into one of my favorite bosses from my years at SAIC (I worked for this guy 3 different times - good to recycle old bosses!) For late Christmas presents, I bought my parents a Mac iSight video camera and they bought me one - now, if I comb my hair I can talk with my parents and let them see me in my home office in Sedona. Should be fun. I don't meet many of my customers (since I telecommute) so it might be cool to have occasional short video-enabled teleconferences with customers also. I will be back in my home office in Sedona on Thursday.