Great talk on Spark
I just listened to an ACM-sponsored talk, "Making Big Data Processing Simple With Spark," by Matei Zaharia. You may need to be an ACM member to watch the webinar. I first joined ACM in the mid-1970s, and I recommend membership. For handling huge data sets, Spark is evolutionary or revolutionary, depending on your point of view.

A bit of personal history before I talk specifically about Spark: In the late 1980s I was an architect and developer on a multinational project to use seismic data from 38 data collection stations to detect atomic bomb tests. All of our data handling software was custom; if we had had Spark, or even Hadoop, we would have saved a ton of effort. Similarly, in the 1990s I was tech lead on a fraud detection system that used massive real-time telephone record data sets. Modern infrastructure would have saved a lot of time and money.

My first serious use of MapReduce was processing large Twitter data sets at Compass Labs. We used Hadoop on Amazon Elastic MapReduce. Later wh...