
A 10-petaflop supercomputer for MSU


The "Lomonosov" supercomputer

T-Platforms has signed a contract with Moscow State University to design a computing cluster with a peak performance of 10 Pflops (10¹⁵ floating-point operations per second). This system will be one of the most powerful in the world. At present, the Japanese K computer leads the Top500 list with a sustained performance of 10.51 Pflops (peak: 11.28 Pflops), followed in second place by the Chinese Tianhe-1A (2.57 / 4.7 Pflops).
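To put the Top500 figures in perspective, here is a small sketch comparing the numbers quoted above; the split of the K computer's figures into sustained (Rmax) and peak (Rpeak) follows standard Top500 terminology:

```python
# Performance figures quoted in the article, in petaflops.
PFLOPS = 10**15  # one petaflops = 1e15 floating-point operations per second

msu_peak_pf = 10.0        # planned MSU system, peak
k_sustained_pf = 10.51    # K computer, sustained (Rmax)
k_peak_pf = 11.28         # K computer, peak (Rpeak)

# How efficiently the K computer sustains its peak:
efficiency = k_sustained_pf / k_peak_pf
print(f"K computer efficiency: {efficiency:.1%}")  # ~93.2%

# Absolute operation rate of the planned MSU machine:
print(f"MSU peak: {msu_peak_pf * PFLOPS:.0e} flop/s")
```

The roughly 93% sustained-to-peak ratio of the K computer is unusually high; hybrid CPU + GPU systems such as Tianhe-1A (2.57 / 4.7 Pflops) typically sustain a much smaller fraction of their peak.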

Under the terms of the agreement with MSU, T-Platforms will develop both the supercomputer itself and the engineering infrastructure that ensures its uninterrupted operation. The system is said to be built on a hybrid CPU + GPU architecture using a new platform; other characteristics have not yet been announced.

For reference, the "Lomonosov" supercomputer has no equal in the world in computational density: the entire computing system fits in an area of only 252 m². At the same time, it consumes no more than 2.8 MW of electricity (specifications in PDF).
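The power density implied by these two figures can be checked directly (this is a back-of-the-envelope calculation from the article's numbers, not a figure from the specifications):

```python
# Power density of "Lomonosov" implied by the article's figures.
power_w = 2.8e6   # maximum power draw, watts
area_m2 = 252     # machine-room footprint, square meters

density_kw_m2 = power_w / area_m2 / 1000
print(f"{density_kw_m2:.1f} kW/m^2")  # ~11.1 kW/m^2
```

Around 11 kW per square meter is several times the load of a typical data-center floor, which is what makes the claim about computational density notable.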
The supercomputer uses four types of compute nodes with processors of different architectures: in total, 5,100 x86 compute nodes and 777 compute nodes based on NVIDIA GPUs. It was the first hybrid supercomputer of this scale in Russia and Eastern Europe.

The TB2 platform for Lomonosov was designed by T-Platforms engineers from scratch: all boards and mechanical components are the company's own developments. At the heart of the TB2 solution is a compute module built around an original 14-layer motherboard carrying four Intel Xeon 55xx- or 56xx-series processors, four custom three-channel DDR3 memory modules, and integrated QDR InfiniBand network controllers.


Motherboard

The system network switches are based on the Mellanox InfiniScale IV reference design.


System Network Switches

Two switches integrated into the rear section of the chassis provide 1.6 Tbit/s of system network bandwidth. They have 32 internal ports (two per compute node) to connect all compute nodes, plus 40 external ports, 6 of which are used to connect storage systems via InfiniBand or to build heterogeneous computing systems, for example with PowerXCell 8i or NVIDIA GPGPU nodes.
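The stated 1.6 Tbit/s figure is consistent with the external port count, assuming each port runs at the QDR InfiniBand signaling rate of 40 Gbit/s (the article does not spell out which ports the figure aggregates, so this mapping is an assumption):

```python
# Sanity check: aggregate bandwidth of the external QDR ports.
# Assumption: the 1.6 Tbit/s figure refers to the 40 external ports.
QDR_GBPS = 40        # QDR InfiniBand: 4 lanes x 10 Gbit/s signaling
external_ports = 40

aggregate_tbps = external_ports * QDR_GBPS / 1000
print(f"{aggregate_tbps} Tbit/s")  # 1.6 Tbit/s
```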

Each memory module integrates the functionality of three DIMM modules and is inserted horizontally into the motherboard.


Memory module

The control module includes four functional units that monitor and control the system: they integrate the 10GbE/Ethernet control and auxiliary networks, the specialized barrier-synchronization and global-interrupt networks, and the external clock-synchronization network of the compute nodes. The specialized networks reduce the delays that arise when synchronizing parallel operations on large installations; they are driven by a specially programmed FPGA.


Control module

The 24-layer backplane ties together all the chassis subsystems: compute modules, power and cooling systems, network interfaces, and the control system.


Backplane

Each motherboard dissipates about 570 watts of heat and requires efficient cooling. The optimal heatsink design was found through simulation on a 10-Tflops supercomputer.


Cooling system plan and radiator

A composite aluminum heatsink with copper inserts completely covers the motherboard and provides air cooling for the blade system. The use of lightweight aluminum kept the chassis weight down to 153 kg.

This is what the chassis itself looks like; it is designed for installation in standard 19″ cabinets.


Chassis

The new 10-petaflop supercomputer will provide ample performance headroom for the resource-intensive computations that MSU scientists carry out in aerospace, nuclear, biomedical, oil and gas, and other scientific fields.

Video tour of the "Lomonosov" machine room (before the upgrade)


On this topic:
Supercomputers: Third World Race

Source: https://habr.com/ru/post/135384/
