Posts

Showing posts from September, 2014

I was accepted into the Microsoft BizSpark program

Since I winding down my consulting business this year (that means that I am limiting myself to a maximum of about 10 hours a week working for consulting customers) I have spent a lot of time getting better at developing in Haskell, reviewing what I hopefully already know about machine learning, and taking classes. In other words, I want to work on my own stuff :-) I have had an idea for starting a small business and a while ago I applied to the Microsoft BizSpark program . I was just accepted into the program a few days ago. Using my own business idea as my yardstick, Microsoft is taking long term bets with BisSpark. It costs them money and resources to support the development of new business ideas, but the long tail is many years of selling infrastructure services. Even though there is not much lock-in using Microsoft Azure I am absolutely personally committed to using Azure long term if my idea works: Microsoft is providing up to $150/month of free Azure services for up to three y...

I pushed a NLP demo to IBM's PaaS service BlueMix

The demo processes news stories to summarize them and map entities found in the text to DBPedia URIs. The Ruby code is similar in functionality to the open source Haskell NLP code on my github account. Some background: I have been helping a customer integrate the IBM Watson AI platform into his system. I noticed on Hacker News this morning that IBM's PaaS service BlueMix will very soon offer a sandbox for IBM Watson services. I signed up for BlueMix to have an opportunity to get more experience using IBM Watson. I just spent an hour putting together a quick NLP demo that uses my own entity detection code and the Ruby classification gem which supports pretty good summarization. Give it a try :-) 2014/09/29 update: I stopped this quick demo I put together - is is simple and was just to experiment with BlueMix. A better demo is my KBSPortal.com site. BlueMix is built using Cloud Foundry so if you are already familiar with the Cloud Foundry command line tools then you will f...

Setting up "Heroku like" git push deploys on a VPS is so easy

I was reading about Docker closing a $40M series C round this morning. While containerization is extremely useful at large scale, I think that the vast majority of individual developers and small teams write many web applications that don't need to scale beyond a beefed up VPS or single physical server. For a good developer experience it is difficult to beat a slightly expensive but convenient PaaS like Heroku. However, if you have many small web app projects and experiments then hosting on a PaaS and paying $30-$50/month per application can add up, year after year. If you need failover and scalability, then paying for a PaaS or implementing a more failsafe system on AWS makes sense. For experimental projects that don't need close to 100% uptime, I set up a .git/hooks/post-commit git hook like this: ./rsync.sh ssh mark@myappname.com 'bash -s' I have my DNS setup for myappname.com (this is not a real domain, I am using it as an example) and all other domains for my ex...

Changed license on my Haskell NLP code and comments on IBM Watson AI system

When I added licensing information on the github repository for my Haskell NLP experiments I specified the AGPL v3 license. I just changed it to GPL v3 so now it can be used as a web service without affecting the rest of any system that you use it for. I also did some code cleanup this morning. In addition to the natural language processing code, this repository also contains some example SPARQL client code and my Open Calais client library that you might find useful. Some news about IBM Watson: their developer web site now has more documentation and example code available without needing to register to become an IBM Watson Partner. I am helping a long term customer use IBM Watson as a web service over the next several months so I registered as a partner and have been enjoying reading all of the documentation on training an instance for a specific application, the REST APIs, etc. Good stuff, and I think IBM may grow a huge business around Watson.

I am open sourcing my Haskell NLP experiments

I just switched the github repository for my NLP experiments to be a public repository. Git pull requests will be appreciated! The code and data is released under the AGPL version 3 license - if you improve the code I want you to share the improvements with me and others :-) This is just experimental code but hopefully some people may find it useful. My latest changes involve trying to use DBPedia URIs as identifiers for entities detected in text. Simple stuff, but it is a start.