For researchers new to Hadoop, a quick timeline of how the project has progressed from 2004 to date is given below.
2004: Initial versions of what is now HDFS (Hadoop Distributed File System) and MapReduce are implemented by Doug Cutting and Mike Cafarella.
January 2006: Doug Cutting joins the Yahoo! grid team.
February 2006: Hadoop becomes an independent Apache project to support the standalone development of MapReduce and HDFS; the Yahoo! grid team adopts Hadoop.
April 2006: Sort benchmark (10 GB/node) run on 188 nodes in 47.9 hours.
May 2006: Sort benchmark run on 500 nodes in 42 hours, on better hardware than the April benchmark.
October 2006: Research cluster reaches 600 nodes.
December 2006: Sort benchmark run on 20 nodes in 1.8 hours and on 100 nodes in 3.3 hours.
January 2007: Research cluster reaches 900 nodes.
April 2007: Research clusters grow to two clusters of 1,000 nodes each.
April 2008: Hadoop wins the terabyte sort benchmark, sorting 1 TB in 209 seconds on 900 nodes.
October 2008: Research clusters load 10 terabytes of data per day.
March 2009: 17 clusters running with a total of 24,000 nodes.
April 2009: Hadoop wins the minute sort benchmark by sorting 500 GB in 59 seconds on 1,400 nodes.