Tuesday, November 20, 2012

Hama - a Bulk Synchronous Parallel computing framework on top of Hadoop

"Why Hama and BSP?
Today, many practical data processing applications require a more flexible programming abstraction that can run on highly scalable, massive data systems (e.g., HDFS, HBase). A message-passing paradigm beyond the MapReduce framework would make communication between tasks more flexible. The Bulk Synchronous Parallel (BSP) model fits the bill. Some of its significant advantages over MapReduce and MPI are:

Supports a message-passing style of application development
Provides a small, flexible, simple, and easy-to-use API
Can perform better than MPI for communication-intensive applications
Guarantees that deadlocks and collisions cannot occur in the communication mechanisms"
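
To make the BSP idea in the quote a bit more concrete, here is a minimal sketch of one send-then-synchronize cycle, simulated with plain Python threads. This is not Hama's API: `run_bsp`, `send`, and the inbox lists are hypothetical names invented for this illustration, and `threading.Barrier` stands in for the global synchronization step. Each peer computes locally, sends a message, and then waits at the barrier. Messages are read only after every peer has passed the barrier, so no peer ever blocks waiting on one specific partner, which is the intuition behind the deadlock claim.

```python
import threading


def run_bsp(values):
    """Simulate two BSP supersteps: every peer sends its value to peer 0,
    all peers synchronize at a barrier, then peer 0 sums what it received.

    Illustrative only: this mimics the BSP pattern, not the Hama API.
    """
    num_peers = len(values)
    inboxes = [[] for _ in range(num_peers)]            # one message queue per peer
    inbox_locks = [threading.Lock() for _ in range(num_peers)]
    barrier = threading.Barrier(num_peers)              # global synchronization point
    result = {}

    def send(dest, message):
        # Sending never blocks on the receiver; it only appends to a queue.
        with inbox_locks[dest]:
            inboxes[dest].append(message)

    def peer(rank):
        # Superstep 0: local computation, then communication.
        local = values[rank] * 2
        send(0, local)

        # Barrier: nobody proceeds until every peer has finished sending.
        barrier.wait()

        # Superstep 1: messages from the previous superstep are now complete.
        if rank == 0:
            result["total"] = sum(inboxes[0])

    threads = [threading.Thread(target=peer, args=(r,)) for r in range(num_peers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["total"]


if __name__ == "__main__":
    print(run_bsp([1, 2, 3, 4]))
```

In a real framework the peers would be separate processes on different machines and the barrier would be a distributed service, but the shape of the program (compute, send, sync, repeat) is the same.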
