Showing posts from April, 2006

Information: organization vs. overload

I will get to the topic in the title, but first: This month's issue of the Communications of the ACM has a great series of articles on exploratory search: lots of good ideas on organizing sets of search results rather than single documents, clustering vs. categorization, etc. Another good read: the May issue of Wired magazine has good coverage of vblogs and online video - the sort of grass-roots publishing that I like :-) There is so much information for any type of knowledge worker to absorb and use that it takes time and effort just to stay up to date with what we need to do our jobs. Much of my work involves writing custom software (usually layered on open source) for information management in specific industries/applications (large scale search, document categorization and repository maintenance, AI style data mining, agent technology to assist by bringing important things to the user's attention, etc.) but I find it ironic that I cannot seem to set aside the time to write much custom

Suggestion for Bush and Congress over what to do about gas prices

Apparently the "best government that corporate lobbyists can buy" is considering giving U.S. taxpayers a $100 credit to help out with the higher cost of gasoline. A little reminder: these are the same "public servants" who just gave $5 billion in tax credits to the oil industry and a few years ago instituted huge rebates for people who bought very low gas mileage vehicles (here in Arizona the local news had clips of people lined up to buy Hummers before the $10,000+ tax rebates went away). OK, here is my advice:

- Give tax credits for high gas mileage vehicles (Duh!!!)
- Show a little honesty and decency and levy short term excess profits taxes on the energy industry - just take back a little of the Congress-to-corporations give-aways
- Institute a luxury tax on low gas mileage vehicles (Duh!!)
- Outlaw no-passenger use of inefficient vehicles for a few time spans per day (for example: 8am-9am and 3pm-4pm) - make it hurt to be an energy abuser

Really, just use some common

Long term AI job: what language to use?

I have a pleasant decision to make: I am soon going to start a long term consulting job and I have been given some flexibility in which programming languages to use. I am fairly sure that I am going to use Ruby for all utilities for file handling, data conversion, etc. and use Common Lisp for the main AI component. I made a strong pitch for doing the web interface using Ruby on Rails, but my bit is the back end AI work, so I don't care too much what other people on the project use. I am still considering doing the AI part in Ruby also, but I am a little concerned about performance issues. I have also heard strong rumors that PowerLoom, an awesome knowledge representation system that is available in Java, Lisp, and C++, might be open sourced. PowerLoom is a great framework - in the past I have only experimented with it because of its non-commercial license; if it is open sourced, I think that will change the landscape quite a bit for choosing frameworks for AI projects.

Protecting the IT infrastructure and competitiveness of the U.S.

Our comfortable middle class lifestyle in the U.S. depends on our being competitive with countries with much lower costs of living. If Congress and the current Bush administration do not bend over backwards too far catering to the whims of their constituents (Big Business), I believe that we can be competitive in the technology sector. A few things that must be done (or not done):

- U.S. companies and individuals must have access to a "full service" internet - no turning over control to a few large telcos
- No encumbering of open source software - any legislation that impedes the use of open source software in the U.S. will simply put us behind countries without restrictive laws
- By all means protect intellectual property, but do so in a fair and balanced way - in other words, don't bend over too far for big business

Our elected officials must remember that while open competitive markets are a good thing, for capitalism to work over the long term governments must prevent mu

Oracle making their own Linux distro?

Perhaps this is just the next logical step for the commoditization of operating systems! Application and tool developers are offering a complete software stack so they can control everything, reducing customer support costs and generating more revenue by selling larger bundles. BTW, about 4 or 5 years ago I was seriously considering making my own Linux distribution that would have contained all of my favorite programming languages (C/C++, Lisp, Prolog, etc.) and AI frameworks (WordNet, all free Lisp tools, etc., etc.) all configured out of the box. The problem was that it would have been easy enough to do one time, but making up a new distribution every time the base Linux distro that I was using was updated would have been a long term time sink. Right now I use Ubuntu and it is not such a hassle for me to add in the stuff I like every 6 months when Ubuntu gets updated. When I finish my current book project, I might look into making a few AI oriented packages for Ubuntu to save other peopl

Web application integration

I am talking about tight integration and not web mashups created by writing a web application wrapper for other people's web services. GMail and Google Calendar integration looks promising at this point, but we will have to wait and see how far it goes. Even though I wrote my own web based word processing app, I am still a little skeptical about how widely used web based document processing will be. Anything that challenges Microsoft Exchange/Outlook integration but is implemented as a web application looks really interesting to me. Because of privacy and IP protection concerns for corporations, I think that an appliance style product for combined search, email, and calendaring would be a great business idea (hey, anyone want to start a company :-) I expect the costs (development, acquisition, downloading, deployment, maintenance, etc.) of infrastructure software to continue to be squeezed lower. Priced Exchange licenses lately? :-)

Owning your own business

My brother was visiting us this week and we had time to hang out and talk. We both have our own businesses. Ron owns a couple of optometry offices in San Diego and I have my consulting business. Different businesses to be sure, but we both are in business not only to make some profit to support our families, but also because we care about our customers. This probably sounds "corny", but nevertheless it is true. My brother has had many customers for over 20 years, and because of these long-term customers it bothers him to think of selling his practices to completely retire. I tend to do many small jobs, usually to get someone through a crunch, to help jump-start a new project or to perform some maintenance on an old project; I care about doing good work and about always giving flat-out honest opinions and advice. I manifest care for my customers by turning down work that is not in one of my fields of expertise. Even though I made much more money working for large companies as an employe

Working backwards on the Semantic Web

I just had a bit of an insight: I think that many people, including myself, may be taking the wrong approach to working on the Semantic Web. I think that dealing with XML serializations of RDF, RDFS, and OWL is just plain wrong. Tools like Protege offer a frame-like UI that makes these formats a lot easier to work with (and free description logic reasoners like FaCT++ help by checking for consistency). However... I have had a little free time today to work on a pet project that Obie inspired: write Ruby wrapper code for making it easier to deal with RDF/RDFS/OWL by loading files and automatically mirroring classes, etc. I would work in Protege, then write Ruby code to consume the RDF/RDFS/OWL files so that I could work in a decent language. OK, fine. However, this all still seems more than a little wrong to me. Since the Semantic Web is largely about ontologies and knowledge representation, why turn our backs on decades of AI research? Why not work with knowledge representation systems written in

Disney-ABC start to get it

Broadcast TV is dead - let's remember it fondly :-) Seriously, Disney-ABC's pilot program to allow free download and viewing of TV shows a day after they are first run makes sense. They will, I believe, use some form of DRM to ensure that commercials cannot be skipped, but that seems fair enough to me. I bet that only Windows based viewing will be possible, but there is always some hope that at least Mac OS X will also be supported. There is a lot of competition for viewers' "eyeballs" for all types of entertainment and media, so re-purposing material for free (with commercials) anytime viewing makes sense. I have been enjoying old 1950s Alfred Hitchcock TV shows, both purchased on the iTunes Music Store and rented via Netflix - another good way to enjoy anytime viewing. I also have hopes of more budget/indie productions making it to free or low cost anytime viewing.

What about your $27,000 debt?

How concerned should we be that the average American's share of the national debt is $27,000? Make that $108,000 for a family of four and we are talking real money. So what? Isn't the debt as a fraction of GDP OK? The GDP is a horrible measure of economic health. A forest fire burns 1000 homes and the GDP goes up. Hurricane Katrina destroys a city and the GDP goes up. Our government spends irresponsibly and this excess is paid for by selling off more of our country to foreign investors - this increases the GDP! Everything adds to the GDP and nothing reduces it: "The nation's brightest economists maintain our national accounting system with a calculator that has a plus key but no minus key" (Lincoln Anderson on US GDP in The Fortune Encyclopedia of Economics). The real short term risk of this debt is that interest rates will have to climb to keep attracting the foreign investment that keeps our society functioning, and we are likely to get hit with increased inflation at the

Interesting: Bill Gates's work flow; knowledge management

I enjoyed this article by Gates on CNN on his personal work flow. The best part was on SharePoint. A few years ago one of my customers hired me to write a SharePoint clone that was tailored for their work flow - I used Java (JSPs, custom tag libraries, and Prevayler for persistence) in a fairly agile way, but I would like to redo that project, with lessons learned, using Rails, and use AI text mining technologies to help automatically organize information (or suggest organizing hints). Knowledge management is something that I am keenly interested in because it ties together lots of technologies that I care about: ontologies, knowledge representation, data constraints, server side technologies, natural language processing with text mining, etc.