Sunday, April 30, 2006

Information: organization vs. overload

I will get to the topic in the title, but first: This month's issue of the Communications of the ACM has a great series of articles on exploratory search: lots of good ideas on organizing sets of search results rather than single documents, clustering vs. categorization, etc. Another good read: the May issue of Wired magazine has good coverage of vblogs and online video - the sort of grass-roots publishing that I like :-)

There is so much information for any type of knowledge worker to absorb and use that it takes time and effort to stay up to date with what we need to do our jobs. Much of my work involves writing custom software (usually layered on open source) for information management in specific industries/applications (large-scale search, document categorization and repository maintenance, AI-style data mining, agent technology that assists by bringing important things to users' attention, etc.) but I find it ironic that I cannot seem to set aside the time to write much custom code for my own information needs (take care of customers first!). And, so far, it always seems to take custom code to solve specific information management problems. From what I have seen, there is not yet any silver bullet.

I have some ideas for exactly what tools I want for my own work flow and how I might "productize" them, but for now I have an ad hoc system using Subversion repositories, local directories organized by topic and augmented with local search, and using to organize bookmarks for material on the web. If I can set aside the time, I would like to integrate more of what I use in my own work flow.

What about search? Well, search is not information management. If/when semantic web technologies become more widely used, then software agents will be able to treat the web as an information source and be able to do research either without human intervention, or at least be valued assistants. CEOs of companies have well trained staff to filter and organize information - what will the effects be on society and the economy in the future when most people will have free or inexpensive software agents that can compete with well trained human staff? A nice thought but there will always be selective advantages to better information management systems.

Friday, April 28, 2006

Suggestion for Bush and Congress over what to do about gas prices

Apparently the "best government that corporate lobbyists can buy" is considering giving U.S. taxpayers a $100 credit to help out with the higher cost of gasoline. A little reminder: these are the same "public servants" who just gave $5 billion in tax credits to the oil industry and a few years ago instituted huge rebates for people who bought very low gas mileage vehicles (here in Arizona the local news had clips of people lined up to buy Hummers before the $10,000+ tax rebates went away).

OK, here is my advice:
  • Give tax credits for high gas mileage vehicles (Duh!!!)
  • Show a little honesty and decency and levy short term excess profits taxes on the energy industry - just take back a little of the congress-to-corporations give-aways
  • Institute a luxury tax on low gas mileage vehicles (Duh!!)
  • Outlaw no-passenger use of inefficient vehicles for a few time spans per day (for example: 8am-9am and 3pm-4pm) - make it hurt to be an energy abuser
Really, just use some common sense, and dig deep for a little bit of moral backbone (and, Congress members: why don't you stop taking those "exotic" vacations paid for by corporate lobbyists?)

Idea for average people: stop voting for any incumbents of either political party until our "public servants" clean up their acts and start to show some integrity. There are a few dozen responsible people in Congress - go ahead and keep voting for them.

Wednesday, April 26, 2006

Long term AI job: what language to use?

I have a pleasant decision to make: I will soon start a long-term consulting job and I have been given some flexibility in the programming languages I use. I am fairly sure that I am going to use Ruby for all utilities for file handling, data conversion, etc. and use Common Lisp for the main AI component. I made a strong pitch for doing the web interface using Ruby on Rails, but my bit is the back end AI work, so I don't care too much what other people on the project use. I am still considering doing the AI part in Ruby also, but I am a little concerned about performance issues. I have also heard strong rumors that PowerLoom, an awesome knowledge representation system that is available in Java, Lisp, and C++, might be open sourced. PowerLoom is a great framework - in the past I have only experimented with it because of its non-commercial license; if it is open sourced, I think that will change the landscape quite a bit for choosing frameworks for AI projects.

Sunday, April 23, 2006

Protecting the IT infrastructure and competitiveness of the U.S.

Our comfortable middle class lifestyle in the U.S. depends on our being competitive with countries with much lower costs of living. If Congress and the current Bush administration do not bend over backwards too far catering to the whims of their constituents (Big Business), I believe that we can be competitive in the technology sector. A few things that must be done (or not done):
  • U.S. companies and individuals must have access to a "full service" internet - no turning over control to a few large telcos
  • No encumbering of open source software - any legislation that impedes the use of open source software in the U.S. will simply put us behind countries without restrictive laws
  • By all means protect intellectual property, but do so in a fair and balanced way - in other words, don't bend over too far for big business
  • Our elected officials must remember that while open competitive markets are a good thing, for capitalism to work over the long term governments must prevent multi-national corporations from breaking laws and must not pass laws that work against the public good in favor of multi-national corporations
Keep track of what your elected representatives do and let them know when they screw up, but also let them know when you think that they are doing the right thing. No matter how high the level of corruption is with foreign governments (you know the ones) and multi-national corporations making huge "contributions" to our elected representatives in Congress, these people do want to be re-elected.

Monday, April 17, 2006

Oracle making their own Linux distro?

Perhaps this is just the next logical step for the commoditization of operating systems! Application and tool developers offering a complete software stack so they can control everything, reducing customer support costs and generating more revenue by selling larger bundles.

BTW, about 4 or 5 years ago I was seriously considering making my own Linux distribution that would have contained all of my favorite programming languages (C/C++, Lisp, Prolog, etc.) and AI frameworks (WordNet, all free Lisp tools, etc., etc.) all configured out of the box. The problem was that it would have been easy enough to do one time, but making up a new distribution every time the base Linux distro that I was using was updated would have been a long term time sink. Right now I use Ubuntu and it is not such a hassle for me to add in the stuff I like every 6 months when Ubuntu gets updated. When I finish my current book project, I might look into making a few AI oriented packages for Ubuntu to save other people a little time.

Sunday, April 16, 2006

Web application integration

I am talking about tight integration and not web mashups created by writing a web application wrapper for other people's web services. GMail and Google Calendar integration looks promising at this point, but we will have to wait and see how far it goes. Even though I wrote my own web based word processing web app, I am still a little skeptical about how widely used web based document processing will be.

Anything that challenges Microsoft Exchange/Outlook integration but is implemented as a web application looks really interesting to me. Because of privacy and IP protection concerns for corporations, I think that an appliance style product for combined search, email, and calendaring would be a great business idea (hey, anyone want to start a company :-)

I expect the costs (development, acquisition, downloading, deployment, maintenance, etc.) of infrastructure software to continue to be squeezed lower. Priced Exchange licenses lately? :-)

Saturday, April 15, 2006

Owning your own business

My brother was visiting us this week and we had time to hang out and talk. We both have our own businesses. Ron owns a couple of optometry offices in San Diego and I have my consulting business. Different businesses to be sure, but we are both in business not only to make some profit to support our families, but also because we care about our customers. This probably sounds "corny", but nevertheless it is true. My brother has had many customers for over 20 years, and because of these long-term customers it bothers him to think of selling his practices to completely retire. I tend to do many small jobs, usually to get someone through a crunch, to help jump-start a new project or to perform some maintenance on an old project; I care about doing good work and about always giving flat-out honest opinions and advice. I also show care for my customers by turning down work that is not in one of my fields of expertise.

Even though I made much more money working for large companies as an employee (partly through stock), it would be very difficult for me to now give up owning my own business. Certainly, being self-employed is not for everyone because of differing financial requirements and personalities. I am a reasonably public person because of the books that I have written and a fairly popular web site (try searching for 'Java consultant' or 'Ruby consultant') so I get emails, then telephone calls from people who want to quit working for a company and become self-employed; I don't like the idea of strongly affecting someone's career path, but I do share my experiences and typically ask people first how important these things are to them: flexibility in work schedules (+ your own business), steady income and reduced financial risk (+ employee), paid for benefits (+ employee), predictable work hours (+ employee), and being in control of your own life (+ your own business). I have a different kind of business since I live in the mountains in a small town (advantage: no commuting time; disadvantage: work limited to what can be done remotely) so I may not be of much help when talking with people who want to do on-site consulting. I eased into being self-employed over a long period of time, taking on off-hours consulting jobs while still working for a company. I basically transitioned from solving a few large problems for one employer to solving many small and medium problems for many customers.

I have been talking to you about practical issues of owning your own business. I really admire people who have higher ideals in starting companies to meet some social needs such as producing and distributing organic food, supporting environmentally low-impact lifestyles and products, etc. While I think that it is occasionally possible for large corporations to be socially aware, I think that smaller companies have a better opportunity to integrate into communities and service local needs.

While the risks involved in owning your own business are obvious, there is also some degree of stability once you can get started. You might not earn much money during times of economic downcycles, but then you will not lose your job! I think that there is a built in efficiency when you can decide why you want to be in business and what customers (or types of customers) you will service - this allows you to stay more focused.

Monday, April 10, 2006

Working backwards on the Semantic Web

I just had a bit of an insight: I think that many people, including myself, may be taking the wrong approach to working on the Semantic Web. I think that dealing directly with XML serializations of RDF, RDFS, and OWL is just plain wrong. Tools like Protege offer a frame-like UI that makes these formats a lot easier to work with (and free description logic reasoners like FaCT++ help by checking for consistency). However...

I have had a little free time today to work on a pet project that Obie inspired: write Ruby wrapper code for making it easier to deal with RDF/RDFS/OWL by loading files and automatically mirroring classes, etc. I would work in Protege, then write Ruby code to consume the RDF/RDFS/OWL files so that I could work in a decent language. OK, fine.
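As a rough illustration of the "load a file and automatically mirror classes" idea, here is a minimal sketch. It is written in Python rather than Ruby, it is not the actual wrapper code, and it makes several assumptions: the input is RDF/XML that declares classes with `rdfs:Class` and at most one `rdfs:subClassOf` each, the hierarchy has no cycles, and the `http://example.org/kb#` namespace, the `SAMPLE` data, and the `mirror_classes` helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical two-class ontology, standing in for a file exported from an editor.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:about="http://example.org/kb#Animal"/>
  <rdfs:Class rdf:about="http://example.org/kb#Dog">
    <rdfs:subClassOf rdf:resource="http://example.org/kb#Animal"/>
  </rdfs:Class>
</rdf:RDF>
"""


def local_name(uri):
    """Return the fragment after '#', e.g. 'Dog' for '...kb#Dog'."""
    return uri.rsplit("#", 1)[-1]


def mirror_classes(rdf_xml):
    """Build one Python class per rdfs:Class, preserving rdfs:subClassOf."""
    root = ET.fromstring(rdf_xml)

    # First pass: record each class name and its (optional) parent name.
    parents = {}
    for node in root.findall(f"{{{RDFS}}}Class"):
        name = local_name(node.get(f"{{{RDF}}}about"))
        sub = node.find(f"{{{RDFS}}}subClassOf")
        parents[name] = (
            local_name(sub.get(f"{{{RDF}}}resource")) if sub is not None else None
        )

    # Second pass: create classes, building each parent before its children.
    classes = {}

    def build(name):
        if name not in classes:
            parent = parents.get(name)
            base = build(parent) if parent else object
            classes[name] = type(name, (base,), {})
        return classes[name]

    for name in parents:
        build(name)
    return classes
```

Calling `mirror_classes(SAMPLE)` returns a dictionary of generated classes in which `Dog` is a real subclass of `Animal`, so ordinary language features (inheritance, `isinstance`) follow the ontology. A real wrapper would also need to handle properties, multiple inheritance, and OWL constructs, none of which are attempted here.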

However, this all still seems more than a little wrong to me. Since the Semantic Web is largely about ontologies and knowledge representation, why turn our backs on decades of AI research? Why not work with knowledge representation systems written in Lisp (or Prolog, Ruby, etc.) and have a back end that serializes to XML/RDF/RDFS/OWL as required? Really, use the best notation possible for all of the human-intensive work.

While Protege is a terrific tool, I still think that using older technologies like KEE, Loom, PowerLoom, etc. with optimal programming environments makes a lot of sense. Any language with good introspection (like Ruby or Common Lisp) would work for supporting XML serialization when required.
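To make the introspection point concrete, here is a small sketch of going the other direction: working with plain objects and serializing to RDF/XML only when required. It uses Python in place of Ruby or Common Lisp, and everything specific in it is an assumption for illustration: the `Person` class, the `http://example.org/kb#` namespace, and the `to_rdf_xml` helper. Each field is written as a plain literal, which is a simplification.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
KB = "http://example.org/kb#"  # hypothetical namespace for this example


@dataclass
class Person:
    # Hypothetical domain class; any dataclass would work the same way.
    name: str
    employer: str


def to_rdf_xml(obj, subject_uri):
    """Serialize a dataclass instance to RDF/XML by inspecting its fields."""
    ET.register_namespace("rdf", RDF)
    ET.register_namespace("kb", KB)

    root = ET.Element(f"{{{RDF}}}RDF")
    # A typed node: the element name is the object's class name.
    node = ET.SubElement(
        root,
        f"{{{KB}}}{type(obj).__name__}",
        {f"{{{RDF}}}about": subject_uri},
    )
    # Introspection: one property element per declared field.
    for field in fields(obj):
        prop = ET.SubElement(node, f"{{{KB}}}{field.name}")
        prop.text = str(getattr(obj, field.name))
    return ET.tostring(root, encoding="unicode")
```

For example, `to_rdf_xml(Person("Alice", "Example Corp"), KB + "alice")` produces an `rdf:RDF` document containing one `kb:Person` node with `kb:name` and `kb:employer` children. The serializer never mentions those field names itself, which is the appeal of doing this in a language with good introspection.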

Disney-ABC start to get it

Broadcast TV is dead - let's remember it fondly :-) Seriously, Disney-ABC's pilot program to allow free download and viewing of TV shows a day after they are first run makes sense. They will, I believe, use some form of DRM to ensure that commercials cannot be skipped, but that seems fair enough to me. I bet that only Windows based viewing will be possible, but there is always some hope that at least Mac OS X will also be supported. There is a lot of competition for viewers' "eye balls" for all types of entertainment and media, so re-purposing material for free (with commercials) anytime viewing makes sense. I have been enjoying old 1950s Alfred Hitchcock TV shows, both purchased on the iTunes music store and rented via Netflix - another good way to enjoy anytime viewing. I also have hopes of more budget/indy productions making it to free or low cost anytime viewing.

Saturday, April 08, 2006

What about your $27,000 debt?

How concerned should we be that the average American's share of the national debt is $27,000? Make that $108,000 for a family of four and we are talking real money.

So what? Isn't the debt as a fraction of GDP OK? The GDP is a horrible measure of economic health. A forest fire burns 1000 homes and the GDP goes up. Hurricane Katrina destroys a city and the GDP goes up. Our government spends irresponsibly and this excess is paid for by selling off more of our country to foreign investors - this increases the GDP!

Everything adds to the GDP and nothing reduces it: "The nation's brightest economists maintain our national accounting system with a calculator that has a plus key but no minus key" (Lincoln Anderson on US GDP in The Fortune Encyclopedia of Economics)

The real short term risk of this debt is that interest rates will have to climb to keep attracting foreign investment to keep our society functioning and we are likely to get hit with increased inflation at the same time. Ouch and ouch.

Speaking of inflation, how many people understand that Clinton and Bush II removed things like the cost of food and housing from the cost of living indices (and thus the bogus government released "inflation" statistics)? I bet you don't hear about that much in the news - a news industry that does not serve public interests, but rather the interests of the mega-corporations who have bought up what used to be independent news companies, and who have a short term financial interest in convincing people to buy, buy, buy on consumer credit.

Sure, no one can predict the future, but there is one thing that I strongly believe: people who are now financing their lifestyle on increasing consumer debt are condemning themselves to a lifetime of economic servitude (greetings, indentured servants) while people who are handling their finances wisely will weather the coming financial storm with a minimum of pain.

Wednesday, April 05, 2006

Interesting: Bill Gates's work flow; knowledge management

I enjoyed this article by Gates on CNN on his personal work flow. The best part was on SharePoint. A few years ago one of my customers hired me to write a SharePoint clone that was tailored for their work flow - I used Java (JSPs, custom tag libraries, and Prevayler for persistence) in a fairly agile way but I would like to redo that project, with lessons learned, using Rails, and use AI text mining technologies to help automatically organize information (or suggest organizing hints).

Knowledge management is something that I am keenly interested in because it ties together lots of technologies that I am interested in: ontologies, knowledge representation, data constraints, server side technologies, and natural language processing with text mining, etc.