One of Yahoo’s Hadoop clusters sorted 1 terabyte of data in 209 seconds, which beat the previous record of 297 seconds in the annual general purpose (daytona) terabyte sort benchmark. The sort benchmark, which was created in 1998 by Jim Gray, specifies the input data (10 billion 100 byte records), which must be completely sorted and written to disk. This is the first time that either a Java or an open source program has won. Yahoo is both the largest user of Hadoop with 13,000+ nodes running hundreds of thousands of jobs a month and the largest contributor, although non-Yahoo usageand contributions are increasing rapidly.
Hadoop is a software platform that lets one easily write and run applications that process vast amounts of data.
Hadoop implements MapReduce, using the Hadoop Distributed File System (HDFS).