Wednesday, November 28, 2012

Writing An Hadoop MapReduce Program In Python @ Michael G. Noll

Writing An Hadoop MapReduce Program In Python @ Michael G. Noll: "Precisely, we compute the sum of a word’s occurrences, e.g. (“foo”, 4), only if by chance the same word (“foo”) appears multiple times in succession. In the majority of cases, however, we let the Hadoop group the (key, value) pairs between the Map and the Reduce step because Hadoop is more efficient in this regard than our simple Python scripts."
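To make the quoted point concrete, here is a minimal sketch of that behaviour; it is not the code from Michael Noll's article. It assumes tab-separated "word<TAB>count" lines, and the names parse_pairs and sum_consecutive are hypothetical. Because itertools.groupby only merges adjacent equal keys, a word is summed only when it happens to appear several times in a row; the rest of the grouping is left to Hadoop's sort between the Map and Reduce steps.

```python
from itertools import groupby
from operator import itemgetter


def parse_pairs(lines, sep="\t"):
    """Yield (word, count) from 'word<sep>count' lines, skipping malformed ones."""
    for line in lines:
        word, _, count = line.rstrip("\n").partition(sep)
        count = count.strip()
        if word and count.isdigit():
            yield word, int(count)


def sum_consecutive(lines, sep="\t"):
    """Sum counts only for runs of the same word that appear in succession.

    groupby merges adjacent equal keys only, so a word that reappears
    later in unsorted input starts a new group instead of being added
    to the earlier one.
    """
    for word, group in groupby(parse_pairs(lines, sep), key=itemgetter(0)):
        yield word, sum(count for _, count in group)


# Hypothetical unsorted input: "foo" appears twice in a row, then again later.
sample = ["foo\t1\n", "foo\t3\n", "bar\t1\n", "foo\t2\n"]
result = list(sum_consecutive(sample))
print(result)  # [('foo', 4), ('bar', 1), ('foo', 2)]
```

On unsorted input the later ("foo", 2) stays separate from ("foo", 4), which is why such scripts give complete totals only once the framework has grouped the keys. In a streaming job the same function could be fed sys.stdin instead of the sample list.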
