Don't use Hadoop - your data isn't that big: "Hadoop << SQL, Python Scripts
In terms of expressing your computations, Hadoop is strictly inferior to SQL. There is no computation you can write in Hadoop which you cannot write more easily in either SQL, or with a simple Python script that scans your files."
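
To make the quoted claim concrete, here is a minimal sketch of the kind of "simple Python script that scans your files" it refers to: a per-key sum, the sort of group-by aggregation often written as a MapReduce job. The column names (`user_id`, `amount`) and the `data/*.csv` layout are assumptions made up for illustration; they are not from the original article.

```python
import csv
import glob
from collections import defaultdict


def total_by_key(paths, key_field="user_id", value_field="amount"):
    """Scan each CSV file once and sum value_field per key_field.

    The field names are hypothetical defaults, chosen only for this example.
    """
    totals = defaultdict(float)
    for path in paths:
        with open(path, newline="") as f:
            # DictReader uses the first row of each file as the header.
            for row in csv.DictReader(f):
                totals[row[key_field]] += float(row[value_field])
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical layout: a directory of CSV exports with a header row.
    for key, total in sorted(total_by_key(glob.glob("data/*.csv")).items()):
        print(key, total)
```

In SQL the same computation would be roughly `SELECT user_id, SUM(amount) FROM sales GROUP BY user_id`, with `sales` again a made-up table name. The script keeps only the running totals in memory, so it fits the case the post is about: data that one machine can scan.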