
Microsoft Research Team Breaks World Sorting Record

The sortbenchmark.org site hosts annual contests for sorting large data sets. One of them is MinuteSort: within one minute, read as many records as possible from disk, sort them, and write the result to a file. The competition is held in two categories: Indy, where the sort code may be tailored to the benchmark's fixed record format, and Daytona, where the sort code must be general-purpose.

The Microsoft Research team far exceeded the record that Yahoo set in 2009 in the Daytona category. Their cluster of 1033 disks on 250 machines sorted 1401 gigabytes of data. That is almost three times Yahoo's result (500 gigabytes), even though the Yahoo cluster was almost six times larger (5624 disks on 1406 machines). The Microsoft cluster also broke last year's record in the Indy category (1353 gigabytes).
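
To put these figures side by side, here is a small back-of-the-envelope calculation. It uses only the numbers quoted above; the variable and function names are made up for illustration, and the per-disk and per-machine ratios are simple divisions, not an official benchmark metric.

```python
# Daytona MinuteSort figures as quoted in the article.
msr = {"gigabytes": 1401, "disks": 1033, "machines": 250}
yahoo = {"gigabytes": 500, "disks": 5624, "machines": 1406}


def per_unit(result, unit):
    """Gigabytes sorted within the minute per disk or per machine."""
    return result["gigabytes"] / result[unit]


# How much more data Microsoft sorted, and how much larger Yahoo's cluster was.
data_ratio = msr["gigabytes"] / yahoo["gigabytes"]
disk_ratio = yahoo["disks"] / msr["disks"]
machine_ratio = yahoo["machines"] / msr["machines"]

# Efficiency gain once cluster size is taken into account.
per_disk_gain = per_unit(msr, "disks") / per_unit(yahoo, "disks")
per_machine_gain = per_unit(msr, "machines") / per_unit(yahoo, "machines")

print(f"data sorted:       {data_ratio:.1f}x more")
print(f"Yahoo disks:       {disk_ratio:.1f}x as many")
print(f"Yahoo machines:    {machine_ratio:.1f}x as many")
print(f"per-disk gain:     {per_disk_gain:.1f}x")
print(f"per-machine gain:  {per_machine_gain:.1f}x")
```

Measured per disk or per machine, the gap is much larger than the headline "almost three times".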

Such impressive results were achieved thanks to Flat Datacenter Storage (FDS) technology. Microsoft did not use a typical MapReduce-based solution. For some tasks, and sorting is one of them, parts of the data cannot be processed independently of each other on different nodes, which is what MapReduce solutions rely on. There is no escaping the need to move huge amounts of data.
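
Why sorting forces so much data across the network can be seen in a toy simulation. The sketch below is not FDS or Hadoop code; it assumes a simple range-partitioned sort in which each node owns one key range, and every name in it (`shuffle_and_sort`, `NODE_COUNT`, the splitter values) is hypothetical.

```python
import bisect
import random


def shuffle_and_sort(nodes, splitters):
    """Route each record to the node that owns its key range, then sort locally.

    Returns (partitions, moved), where moved counts the records that had to
    leave the node they started on.
    """
    buckets = [[] for _ in nodes]
    moved = 0
    for src, records in enumerate(nodes):
        for key in records:
            # Index of the key range (and therefore the node) this key belongs to.
            dst = bisect.bisect_right(splitters, key)
            buckets[dst].append(key)
            if dst != src:
                moved += 1
    return [sorted(bucket) for bucket in buckets], moved


random.seed(0)
NODE_COUNT = 4
RECORDS_PER_NODE = 1000

# Unsorted input: every node starts with keys drawn from the whole key space.
nodes = [
    [random.randrange(1_000_000) for _ in range(RECORDS_PER_NODE)]
    for _ in range(NODE_COUNT)
]
splitters = [250_000, 500_000, 750_000]  # boundaries between the four key ranges

partitions, moved = shuffle_and_sort(nodes, splitters)
flat = [key for part in partitions for key in part]
total = NODE_COUNT * RECORDS_PER_NODE

print(f"{moved} of {total} records crossed the network")
```

With uniformly distributed keys, a record starts on the node that owns its range only about one time in `NODE_COUNT`, so most of the input has to travel, and the share grows with the size of the cluster.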

FDS takes advantage of the fact that networks have become much faster and cheaper since the MapReduce architecture was created. This made it possible to build a cluster in which every computer can communicate with any other simultaneously at the full speed of its network interface (a so-called full bisection bandwidth network). So instead of the Hadoop infrastructure that Yahoo used in 2009, the Microsoft Research team used a network file system that gives access to any data on any node as if it were on a local disk.
Microsoft plans to use the FDS architecture in data centers serving the Bing search engine.

Source: https://habr.com/ru/post/144295/

