
From philosophy to science: how Stephen Hawking and the COSMOS center will use HPE Superdome Flex

“Without supercomputers, we are just philosophers,” Professor Stephen Hawking noted in this video, and for good reason. Twenty years ago, Hawking founded the COSMOS research group at the Faculty of Mathematics at the University of Cambridge. The group has used high-performance in-memory computing to investigate cosmology, astrophysics, and particle physics. Access to new data sets has allowed cosmology to evolve from speculative theory into a science grounded in calculation.



Recent discoveries, above all the detection of gravitational waves announced in 2016, have given cosmology new momentum. Here is what Stephen Hawking said about it: “The COSMOS group is working on understanding how space and time function, from the first trillionth of a second after the Big Bang to today. The recent discovery of gravitational waves offers us amazing knowledge about black holes and the entire universe. With the emergence of such interesting new data, we need flexible and powerful computer systems that can handle them.” The faculty received just such a system, based on the HPE Superdome Flex, in November.

The new high-performance COSMOS system, the tenth to date, was created in partnership with HPE and will make it possible to process data sets of previously unattainable scale in real time. As a result, the theoretical foundations of cosmology, which already draw on sources such as the cosmic microwave background (relic radiation) and the distribution of stars and galaxies, can be supplemented with new gravitational-wave data.
The new supercomputer is based on our new product, HPE Superdome Flex, supplemented by an HPE Apollo supercomputer and systems with Intel Xeon Phi. HPE Superdome Flex is well suited to the needs of the Faculty of Mathematics at the University of Cambridge: it is the most scalable in-memory computing platform, with shared RAM from 768 GB to 48 TB and up to 32 sockets within a single system. This capacity will allow large amounts of data to be processed in parallel, yielding increasingly accurate results and faster testing of new concepts and algorithms. In addition to cosmology, the new system will support faculty research in other areas, from ocean modeling to experimental biophysics. Now let us tell you more about the platform.

HPE Superdome Flex


The HPE Superdome Flex platform is the first joint HPE and SGI product, combining the best of the Superdome X and MC990X (formerly SGI UV300) lines. From the Superdome X it inherited proven resiliency and high availability, and from SGI the technology of virtually unlimited vertical scaling. HPE Superdome Flex joined our portfolio of business-critical high-end systems at the end of 2017. At the same time, its cost efficiency approaches that of standard x86 systems.

Architecture and Scaling


A distinctive feature of the system is its modular design, based on standard 4-processor units that are installed in a rack and joined by a high-speed switching fabric. This makes the minimum configuration of a high-end system more affordable, because it removes excess capacity at the initial stage. As the load grows, resources can be increased gradually in steps of 4 sockets, up to 32 sockets and 48 TB of RAM. Computing power can be added without replacing existing equipment, and the system can be scaled both vertically and horizontally.
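The scaling steps described above can be written out as simple arithmetic. The sketch below is only an illustration: the constants come from the figures quoted in this article, while the function names `scaling_steps` and `units_needed` are hypothetical and not part of any HPE tool.

```python
SOCKETS_PER_UNIT = 4   # from the article: standard 4-processor units
MAX_SOCKETS = 32       # from the article: scaling limit of a single system


def scaling_steps():
    """List every supported socket count, growing in steps of one unit."""
    return list(range(SOCKETS_PER_UNIT, MAX_SOCKETS + 1, SOCKETS_PER_UNIT))


def units_needed(sockets):
    """Smallest number of 4-socket units that provides at least `sockets`."""
    if not 1 <= sockets <= MAX_SOCKETS:
        raise ValueError("socket count outside the range quoted in the article")
    # Ceiling division: a partially used unit still has to be installed.
    return -(-sockets // SOCKETS_PER_UNIT)


if __name__ == "__main__":
    print(scaling_steps())
    print(units_needed(10))
```

A workload that needs 10 sockets, for example, lands on the 12-socket step, that is, three units.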


The figure below shows the architecture of the HPE Superdome Flex Base Chassis, the modular building block of the server. Each HPE Superdome Flex system consists of at least one base chassis plus up to seven expansion chassis, which makes it possible to scale up to 32 sockets or to divide the system into hardware-independent partitions (HPE nPars) in order to isolate workloads and/or consolidate multiple workloads on a single managed complex. Specialized HPE Superdome Flex ASICs connect the base chassis and the expansion chassis to one another by cable, forming the ultra-fast Superdome Flex Grid switching fabric.


HPE Superdome Flex Base Chassis Architecture

A detailed overview of the platform architecture is contained in the HPE Superdome Flex server architecture and RAS document, which we will soon publish in Russian.

RAS


The resiliency and availability of the HPE Superdome Flex platform deserve a separate mention: they are at the level of 99.999%, which is unique in the world of x86 systems. This figure is achieved through advanced RAS (reliability, availability, serviceability) features and a comprehensive fault management strategy that covers predictive detection, logging, and analysis of failures and, where possible, self-healing without operator involvement. The Firmware First fault-handling mechanism deals with errors at the firmware level.
Another RAS feature of Superdome Flex is the ability to create electrically isolated, hardware-independent partitions, which increases the security and availability of individual independent applications within a single hardware platform.
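To put the 99.999% figure in perspective, it can be converted into an allowed downtime per year. This is a minimal sketch of that arithmetic only; the function name `downtime_minutes_per_year` and the 365-day year are our assumptions, and the calculation says nothing about how HPE measures availability.

```python
def downtime_minutes_per_year(availability_percent, days_per_year=365):
    """Convert an availability percentage into allowed downtime in minutes."""
    minutes_per_year = days_per_year * 24 * 60
    unavailable_fraction = 1 - availability_percent / 100
    return minutes_per_year * unavailable_fraction


if __name__ == "__main__":
    # "Five nines" works out to a little over five minutes per year.
    print(round(downtime_minutes_per_year(99.999), 2))
```

By the same formula, 99.9% availability would allow roughly 100 times more downtime, which shows how demanding the five-nines level is.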

Usage scenarios


Its scaling capabilities, the ultra-fast Superdome Flex Grid switching fabric with low-latency access to a shared global memory pool, and a powerful I/O subsystem make HPE Superdome Flex a leading platform for high-performance in-memory computing that combines transactional workloads with real-time analysis of large amounts of current data. The main usage scenarios for HPE Superdome Flex:


Platform for memory-driven computing


HPE Superdome Flex was designed for memory-driven computing, that is, computing centered on memory rather than on the processor (see our article). The units in a Superdome Flex are connected by a high-performance switching fabric, through which the operating system accesses the memory in all units. Typically, the system runs under a single OS instance, as a single computer.
It is also possible to create several partitions in the system and run a separate copy of the OS in each of them. The partitions can still access memory across the switching fabric, including memory outside their own OS. Each partition can contribute part of its memory to a common pool, and this shared resource can be used by applications running under different operating systems.
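The idea of several consumers working on one pool of memory can be illustrated, very loosely, on an ordinary machine. The sketch below uses the Python standard library module `multiprocessing.shared_memory` as an analogy only: it is not the Superdome Flex mechanism and does not use any HPE interface, and the function name `share_through_named_block` is hypothetical.

```python
from multiprocessing import shared_memory


def share_through_named_block(payload: bytes) -> bytes:
    """Write through one handle to a named block, read through another."""
    # The "owner" stands in for a partition contributing memory to the pool.
    owner = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        # A second handle, attached by name, stands in for another consumer.
        reader = shared_memory.SharedMemory(name=owner.name)
        try:
            owner.buf[:len(payload)] = payload
            # No copy was sent anywhere: the reader sees the same memory.
            return bytes(reader.buf[:len(payload)])
        finally:
            reader.close()
    finally:
        owner.close()
        owner.unlink()  # release the block once nobody needs it


if __name__ == "__main__":
    print(share_through_named_block(b"cosmos"))
```

The point of the analogy is that the data is written once and read in place, instead of being copied between the two sides.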



Thanks to the Superdome Flex Grid switching fabric, minor changes in the operating system are enough to make an application behave as if it were already running on The Machine (within the installed limit of 32 sockets and 48 TB of RAM), which makes it possible to realize the benefits of Memory-Driven Computing today.

We will tell you more about the platform at two events:

"HPE Tech-Talk: let's talk about technology," online broadcast January 30
Webinar: HPE Superdome Flex - New platform for highly critical tasks and in-memory calculations, February 14th.

Source: https://habr.com/ru/post/347506/

