Friday, October 05, 2012

Tutorial — Disco v0.4.3 documentation

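From the Disco v0.4.3 tutorial:

    from disco.core import Job, result_iterator

    # Map step: called for each line of input; emits a (word, 1) pair per word.
    def map(line, params):
        for word in line.split():
            yield word, 1

    # Reduce step: sorts the (word, 1) pairs so equal words are adjacent,
    # groups them by word, and sums the counts for each word.
    def reduce(iter, params):
        from disco.util import kvgroup
        for word, counts in kvgroup(sorted(iter)):
            yield word, sum(counts)

    if __name__ == '__main__':
        job = Job().run(input=["http://discoproject.org/media/text/chekhov.txt"],
                        map=map,
                        reduce=reduce)
        # Wait for the job to finish, then print each word with its total.
        for word, count in result_iterator(job.wait(show=True)):
            print word, count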
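
To see what the map and reduce steps compute without a Disco cluster, here is a minimal local sketch in plain Python. It assumes that kvgroup groups the values of consecutive equal keys, much like itertools.groupby; the names map_line, local_kvgroup, reduce_pairs and word_count are hypothetical and not part of Disco.

```python
from itertools import groupby
from operator import itemgetter


def map_line(line):
    # Same logic as the tutorial's map(): one (word, 1) pair per word.
    for word in line.split():
        yield word, 1


def local_kvgroup(pairs):
    # Hypothetical stand-in for disco.util.kvgroup (assumed behaviour):
    # yields (key, values) for each run of consecutive equal keys.
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield key, (value for _, value in group)


def reduce_pairs(pairs):
    # Same logic as the tutorial's reduce(): sort, group by word, sum counts.
    for word, counts in local_kvgroup(sorted(pairs)):
        yield word, sum(counts)


def word_count(lines):
    # Run the map step over every line, then feed all pairs to the reduce step.
    pairs = (pair for line in lines for pair in map_line(line))
    return list(reduce_pairs(pairs))
```

For example, word_count(["to be or not", "to be"]) returns [('be', 2), ('not', 1), ('or', 1), ('to', 2)]. The sort matters: without it, equal words that are not adjacent would be reported as separate groups.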
