Thursday, April 28, 2011

Pigs, Bees, and Elephants: A Comparison of Eight MapReduce Languages « Dataspora

Pigs, Bees, and Elephants: A Comparison of Eight MapReduce Languages « Dataspora: "I will present for each language or library the implementation of a word count program, lifted from its documentation, since this has become sort of the “Hello World” for map reduce. I don’t think such a simple program is the ultimate test of the quality of a language, so this is just to give a taste of the language. What I am most interested in is:

Can I write reasonably concise, abstract programs in this language or library?
Can I write the “inside” of map reduce, that is the code for the mapper and the reducer, as well as the “outside”, the logic that decides which map reduce jobs to run?
Is it general? Can I write any map-reduce program, including programs that require multiple map-reduce jobs, including the case of a data dependent number and type of jobs?"
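
To give a taste of what the quoted questions are asking, here is a minimal word count sketch in plain Python. It is not taken from any of the eight languages the article compares and it does not use Hadoop. The names `mapper`, `reducer`, `run_job`, and `word_count` are hypothetical, chosen only for illustration. The first two functions stand in for the "inside" of map reduce, and `word_count` stands in for the "outside", the logic that decides which job to run.

```python
from collections import defaultdict


def mapper(line):
    """Inside: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word.lower(), 1) for word in line.split()]


def reducer(word, counts):
    """Inside: collapse all the counts seen for one word into a total."""
    return word, sum(counts)


def run_job(records, map_fn, reduce_fn):
    """A toy, single-process stand-in for a map reduce engine (an assumption,
    not a real framework): map every record, group values by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    return dict(reduce_fn(key, values) for key, values in groups.items())


def word_count(lines):
    """Outside: decide which job to run and with which mapper and reducer."""
    return run_job(lines, mapper, reducer)


if __name__ == "__main__":
    print(word_count(["the pig and the bee", "The elephant"]))
```

A program needing several jobs, or a number of jobs that depends on the data, would put a loop or a conditional around `run_job` in the "outside" function, which is the kind of generality the third question asks about.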