Showing posts from October, 2009

Using nailgun for faster JRuby startup

I finally got around to trying nailgun tonight. On OS X with JRuby 1.4.0RC2, I built nailgun using: cd JRUBY_HOME/tool/nailgun ./configure make # I ignored the warning "no debug symbols in executable (-arch x86_64)" In one terminal window just leave a nailgun server running: $ jruby --ng-server NGServer started on all interfaces, port 2113. When you want to run JRuby as a railgun client, try something like: jruby --ng text-resource.rb On my MacBook, this cuts about 5 seconds of JRuby startup time off of running this test program. Sweet. For small programs, using ruby is still faster than jruby but this makes developing with JRuby faster.

I just tried Amazon's new Relational Database Service (RDS)

Amazon just released a beta of their Relational Database Service (RDS). You pay by the EC2 instance hour, about the same cost as a plain EC2, but about $0.01/hour more for a small instance, plus some storage costs, and bandwidth costs if you access the database outside of an Amazon availability zone. RDS MyQL compatible (version 5.1) and is automatically monitored, restarted, and backed up. Currently, there is no master slave replication, but this is being worked on (RDS beta just started today). Here are my notes on my first use of RDS: Install the RDS command line tools rds-create-db-instance --db-instance-identifier marktesting123 --allocated-storage 5 --db-instance-class db.m1.small --engine MySQL5.1 --master-username marktesting123 --master-user-password markpasstesting123 Wait a few minutes and see if the RDS instance is ready: rds-describe-db-instances Open up ports for external access, if required (note, here I am opening up for world wide access just for this test): rds-au

Securing your Mac laptop

Laptops get lost and stolen a lot. I am extra careful with my laptop because I keep so much of my and my customer's private data on it. I take a few steps to protect this information that I want to share with you (Mac OS X specific): I keep a small encrypted disk image that contains all my passwords and other sensitive information. It also contains my .ec2, .s3cfg, .profile, .ssh, .gnupg, and .heroku files. Then in my home directory I make soft links ln -s ... to these files. I do not keep the password for this disk image in my OS X keychain! It is a very small hassle: each time I boot up, I mount this image so my .ssh, etc. files are available. This adds 10 seconds of "overhead" to each time I boot my laptop. Whenever I start working for a new customer, I ask them if they would like me to also keep their working materials encrypted (some overhead involed, so I like to ask them if I should spend the time doing this). Update: a reader pointed out that this is

More getting stuff done by doing what I most want to do experiments

I read an interesting article a few weeks ago (sorry, no attribution - can't find the article again) about trying to always do what you want to be doing. I used to do "round robin" style scheduling of my time: keeping a single to-do list and cycling through it (and sometimes just finishing small tasks outright). I have always thought that I needed to apply some meta-level discipline to get tasks that I don't enjoy as much done in a timely way. Scheduling work is not so difficult because I usually have just 3 or 4 active customers, and I enjoy most of my work. Other things like yard work (I prefer new projects over maintenance) got the round-robin treatment, and even recreation (I like to hike, cook/eat, read, and watch movies) activities used to be scheduled round-robin style to a (very) small degree. Lately, I have been experimenting with not doing any meta-level scheduling. Now when I finish an activity I start the new activity that is what I most want to do. The

RDF datastores are noSQL also - always keep an RDF data store service running

We tend not to use things that are not "ready at hand." RDF datastores are noSQL also :-) I always keep Sesame running as a service just as I run PostgreSQL and MySQL services. Some things are better stored, queried, and maintained in a graph database. If you always have something like Sesame (or the free edition of AllegroGraph ) running as a service, and if you have client libraries installed for your favorite programming languages then it is easier to quickly choose the best data store for any given task. BTW, I also always keep a CouchDB service running.

Cloud computing options and portability

I listened to Paul Miller's podcast with Rackspace's president of their Cloud Division Lew Moorman this morning. I mostly agree with his comments on easy portability between Rackspace cloud services and Amazon's EC2. I have not yet used Rackspace's cloud offerings, so my comments here are based on their documentation and a conversation I had with one of their support engineers (for one of my steady customers: I declined some work tasks to move to Rackspace because I don't like to spread myself too thin: I spend a lot of effort staying up to speed on Amazon and AppEngine, so I prefer to specialize on those two deployment platforms). The advantage of Rackspace is the binding of a persistent disk volume with their virtualized server instances (really, they offer a standard sort of VPS hosting service) where with Amazon it takes a little extra work to manage EBS volumes separately. For me, I like the benefit of Amazon's SQS, S3, and Elastic MapReduce - that said,

Switching an AppEngine project from JRuby+Sinatra to Java+JSP

It is a bit of a pain to take several hours to convert a working codebase in one language/platform to another. I kept having small problems with JRuby and Sinatra that were just AppEngine specific (Ruby (or JRuby) and Sinatra are awesome). I am only about 20% into development, and I decided that I wanted really solid tools/platform. Also, converting working code in one language to another is simple. What convinced me to make the switch is that Java + Eclipse plugins support is just so good for AppEngine development, that for now the change seems like a good decision. For my next AppEngine project, I'll probably go back to JRuby + Sinatra since the support is getting better.

I built the open source IDEA 9.0 git snapshot - works fine

Something to do while watching TV :-) With the Apache 2.0 license, it will be interesting to see how it is used. It is a large git clone, but built easily using ant. I get a free commercial license for IntelliJ IDEA (as I used to get free Enterprise JBuilder licenses from Borland) but I still plan on following the open source IDEA project - hopefully interesting things will happen! I use Eclipse a lot just because the Java AppEngine support is so very good, but for plain old Java coding, I like IDEA. The open source edition of IDEA is great for plain old Java coding, BTW, but is missing JSP + Tomcat development support (but NetBeans does a good job for J2EE-- development, and who does J2EE development anymore :-) It takes a while to do a git clone and build the IDE (builds versions for OS X, Windows, and Linux as the default ant build target) and since the build process is so easy, it was not much fun, so you might as well just download a built version for your OS platform if you

Nice tool for writing and maintaining documentation: YMUL web service and yumlcmd Ruby gem

Although some customers request using a Word Processor for producing documentation, if it is my choice I like Latex and OmniGraffle for producing diagrams. Latex is the fastest tool (that I use) for producing great looking print or PDF documents. I am experimenting with something else this morning: the YUML web app for creating UML diagrams and the yumlcmd Ruby gem (add to your gem source and then gem install yumlcmd ). Thanks to Under the Hat for pointing these tools out - check their blog for directions. Although YUML hardly replaces OmniGraffle, it is cool to have documentation text based (Latex files and YUML files): faster, and less work.

Some frustration with JRuby + Rails on Google AppEngine

A few engineers at Google and other developers are doing some good work towards getting Rails running on AppEngine both robustly and in a way that provides a good local development environment. One problem is simply that if your web app is not active, initializing JRuby + Rails + and all required gems can time out (30 second window for handling requests). The Java and Python support for AppEngine is fantastic, but for two projects I want to do (my own projects, but may be revenue generating :-) I want a more agile programming language that Java and while my Python skills are sort-of OK, my knowledge of Django is very light. I should probably just bite the bullet and spin up on Django, but I would strongly prefer working in Ruby. I have been experimenting with the JRuby + Sinatra + ERB + datamapper combination and at least an inactive web application spins up well within the 30 second request timeout window. I very much like datamapper (object identity issues) and it should not be too

Designing for scalability and platform portability

Once an application is designed and at least partially implemented, options for scalability and portability are reduced. If a system's usage profile can not be predicted, then deploying to physical servers is a real problem because you have to pay for support for peak usage periods - however, relying on cloud infrastructure can very much limit platform portability. It helps to consider scalability up front! Relying on scalable data store infrastructure like Googles AppEngine datastore or Amazon's SimpleDB can make life easier. For server side Java, coding to JPA makes it possible with some work to be portable between AppEngine datastore, SimpleDB, or using a traditional database on your own server. Some care needs to be taken to code to a subset of JPA (e.g., no cross domain queries in SimpleDB) if portability is important. In the Rails world, using Datamapper provides similar flexibility for portability between AppEngine datastore, SimpleDB, or a conventional database. And, t

Storing Lucene indices in Cassandra; cloud versus running your own server farm

The Lucandra project looks very interesting, but is incomplete at this time (see the "to be dones" at the bottom of the linked page). Cassandra is a great project. I almost incorporated it into the design of a customer project recently, but we decided to host on Amazon so using their EC2, S3, SQS, and Electric Map Reduce services won out over rolling a custom stack. I think that this must be a start up dilemma: long term, it is probably least expensive running one's own small server farm, but when you are just getting started a "pay as you go" cloud approach using very solid infrastructure tools like EC2, S3, SQS, SimpleDB, etc. makes sense. I can't say this from personal experience, but my gut feeling is that if you can live within the constraints of Google's AppEngine, then it is probably less expensive using AppEngine than running your own server farm - even long term. BTW, if you have not read my DevX article on implementing search on the Java

Interesting new book: "Networks, Crowds, and Markets: Reasoning about a Highly Connected World"

This book will be published in 2010 but a complete pre-publication draft is available here . There is a PDF download link for the entire book near the top of the page. If you enjoyed reading Albert-László Barabási's classic book "Linked: The New Science of Networks" then this new book looks like a great followup. I have not got too far into David Easley's and Jon Kleinberg's new book yet, but the range of topics in this 800 page book looks like it will make a good read.

Using Facebook Connect just got a lot easier

A new set of tools makes it much easier to integrate Facebook Connect in your web sites. Good job Facebook. Their old APIs and support were somewhat difficult to work with. I would have added another example to my last book (about Web 3.0 stuff) if this had been available a few months ago.