
A possible new leader among the major cloud providers: Penguin

For smaller players, competing with the major cloud HPC providers such as AWS and Azure is a serious challenge: the industry giants have already shaped public opinion in their favor. But HPC specialist Penguin Computing recently repeated the benchmarking study of its larger counterparts and claims the results place its public cloud, Penguin on Demand (POD), among the leaders in both cost and performance. Naturally, the fact that the service provider ran the tests itself raises eyebrows. The company counters that its tests faithfully reproduce those covered in the original study, and that it is ready to let third parties repeat them independently, free of charge.

A marketing ploy? Of course. The Penguin study was a response to the paper “A Comparative Study of Cloud Computing Vendors with High-Performance Linpack” by Exabyte.io, posted on arXiv.org a couple of months ago. But Penguin is not merely trying to attract users' attention and wedge itself in between the giants Microsoft, Google, AWS, IBM/SoftLayer, and Rackspace, whose offerings were examined in that paper. Penguin also illustrates the growing competitive zeal among the many providers offering HPC cloud services.


While the original study found Microsoft Azure the best performer among the cloud HPC providers, the Penguin study claims that Azure has lost its leading position and that POD now leads on both price and performance. Penguin's platform came out on top in gigaflops per core and per node in the Linpack tests, as well as in speedup when scaling.
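
“Speedup when scaling” here is the usual strong-scaling ratio of single-node run time to N-node run time. A minimal sketch of how such figures are derived from Linpack timings (the timings below are illustrative placeholders, not data from either study):

```python
# Strong-scaling speedup and parallel efficiency from Linpack run times.
# These timings are illustrative placeholders, not numbers from the study.
timings = {1: 1000.0, 2: 520.0, 4: 270.0, 8: 145.0}  # nodes -> seconds

t1 = timings[1]  # single-node baseline
for nodes, t in sorted(timings.items()):
    speedup = t1 / t              # how much faster than one node
    efficiency = speedup / nodes  # 1.0 would be perfect scaling
    print(f"{nodes} nodes: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```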



To some extent Penguin might be expected to come out ahead: POD is better characterized as an on-demand HPC cluster than as a typical public cloud, since it is a tightly coupled environment built on a 100 Gb/s Omni-Path interconnect. Nuances aside, POD's main claimed advantage is, to be precise, that it is cheaper than the offerings of the major cloud players.


HPCwire has covered the original study (see "Azure Edges AWS in Linpack Benchmark Study"), in which Azure showed the best performance, even compared with NERSC. The study's central question was whether HPC workloads can run in a cloud environment cost-effectively. Using the High Performance Linpack benchmark to measure performance, and comparing costs against traditional infrastructure, showed that they definitely can.


“We benchmarked the performance of the best available computing platforms from public cloud providers using the High Performance Linpack test. We optimized the benchmark for each computing environment and evaluated the relative performance for distributed-memory calculations... Our conclusion: the concept of high-performance cloud computing is ready for widespread adoption and can provide a viable and cost-effective alternative to capital-intensive on-premises hardware deployments,” write the authors, Mohammad Mohammadi and Timur Bazhirov of Exabyte.io.
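
“Optimizing the test for each environment” in HPL mostly comes down to choosing a problem size N that fills available memory. A minimal sketch of the common rule of thumb (the memory figures are assumptions for illustration, not the authors' exact procedure):

```python
import math

# Rule-of-thumb HPL problem size: the N x N double-precision matrix should
# occupy roughly `fill` of aggregate memory (8 bytes per element), with N
# rounded down to a multiple of the block size NB.
def hpl_problem_size(mem_gib_per_node: float, nodes: int,
                     fill: float = 0.80, nb: int = 192) -> int:
    total_bytes = mem_gib_per_node * nodes * 2**30
    n = int(math.sqrt(fill * total_bytes / 8))
    return n - n % nb

# Example: four nodes with 256 GiB each (figures chosen for illustration).
print(hpl_problem_size(mem_gib_per_node=256, nodes=4))  # -> 331584
```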


Penguin, of course, is well known for its HPC expertise and its Tundra servers; as of November, the company had seven systems in the Top500. Not surprisingly, the POD offering is focused on HPC. “We recognized the need for high-performance nodes with modern processors, platforms, and non-blocking switching fabrics: the infrastructure you would have designed yourself if you were building a highly scalable HPC cluster in-house,” says Victor Gregorio, senior vice president of cloud computing for POD.


POD offers two cloud computing locations, MT1 and MT2, accessible through a single portal. Each location has its own local storage, with a high-speed link between the sites to make migrating data from one location to the other easy.


Login nodes and storage volumes are local to each location, but account credentials are shared across all POD locations, as are portal usage reports.

Testing was conducted in the MT2 data center on B30 nodes with Intel Broadwell E5-2680 v4 processors (2.4 GHz, 14 cores, 16 double-precision FLOPs per cycle), giving a peak of 1.07 teraflops per node. The MT2 location uses an Intel Omni-Path interconnect. The other clouds in the study currently operate without comparably fast networking, although the Azure nodes (AZ-A and AZ-H) and NERSC's Edison do use high-speed interconnects (40 Gb/s InfiniBand and Cray Aries, respectively); the detailed characteristics of the various node architectures and processors are best covered in the Exabyte.io paper.
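
The quoted peak follows directly from the processor specs. A minimal sanity check (assuming a dual-socket node, which the 1.07 TF figure implies):

```python
# Theoretical peak = sockets x cores x clock (GHz) x DP FLOPs per cycle.
sockets = 2             # assumption: the 1.07 TF/node figure implies two sockets
cores_per_socket = 14
clock_ghz = 2.4
flops_per_cycle = 16    # Broadwell AVX2: two 4-wide double-precision FMAs per cycle

peak_gflops = sockets * cores_per_socket * clock_ghz * flops_per_cycle
print(f"{peak_gflops:.1f} GFLOPS ~ {peak_gflops / 1000:.2f} TFLOPS per node")
# -> 1075.2 GFLOPS ~ 1.08 TFLOPS per node, in line with the quoted 1.07 TF
```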


Judging by the table and graphs in the study, Penguin confirms that POD performs on a par with the best of the major providers, and that its prices are lower, at least as the study presents them.


In this comparison POD comes out decisively on top, showing good scalability as the number of nodes and cores grows. The cost comparison is also striking, even though POD's list price per node is noticeably higher: $2.80 per node-hour versus $1.90 per node-hour for the most expensive Azure instance, IB-A. Bear in mind that Penguin compares competitors' publicly listed rates and does not account for potential discounts.
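
A higher hourly rate does not by itself mean higher cost; what matters for Linpack-style workloads is the price per delivered gigaflop. A minimal sketch of that arithmetic (the sustained-performance figures are placeholders, not numbers from either study):

```python
# Effective price per sustained TFLOP-hour = hourly rate / sustained TFLOPS.
# The sustained-performance values are placeholders, not data from the study.
nodes = {
    "POD B30":    {"usd_per_node_hour": 2.80, "sustained_tflops": 0.90},
    "Azure IB-A": {"usd_per_node_hour": 1.90, "sustained_tflops": 0.50},
}

for name, spec in nodes.items():
    price = spec["usd_per_node_hour"] / spec["sustained_tflops"]
    print(f"{name}: ${price:.2f} per sustained TFLOP-hour")
# A pricier node can still be cheaper per unit of delivered work.
```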




According to the company, the main sources of the cost advantage are higher performance, which shortens billable run time; metering of that time to within three seconds, versus the rounding up to the full hour practiced by other players; and the absence of many common surcharges, such as fees for data transfer, bandwidth, and setup.


“You will not be billed for idle time, only for the time spent computing. Other cloud providers, by contrast, round the running time up to the full hour,” the study says. Under such rounding, a job that finishes in 61 or 62 minutes costs as much as one that runs for 90: on the invoice, the time is rounded up. And since job run times correlate weakly with whole hours, Penguin meters equipment usage to within three seconds. “Looking at the real state of affairs, we are clearly more cost-effective than the other cloud providers for HPC,” says Gregorio.
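
The difference is easy to quantify. A minimal sketch comparing hourly round-up billing with fine-grained metering (the $2.80 figure is POD's listed rate; per-minute metering here stands in for POD's three-second granularity):

```python
import math

RATE = 2.80  # USD per node-hour (POD's listed rate)

def cost_hourly_roundup(minutes: float) -> float:
    """Bill in whole hours, rounding the run time up (common cloud practice)."""
    return math.ceil(minutes / 60) * RATE

def cost_metered(minutes: float) -> float:
    """Bill the actual run time (approximating POD's 3-second granularity)."""
    return minutes / 60 * RATE

for minutes in (61, 62, 90):
    print(f"{minutes:>2} min: rounded ${cost_hourly_roundup(minutes):.2f}, "
          f"metered ${cost_metered(minutes):.2f}")
# Rounded up, the 61-, 62- and 90-minute jobs all cost the same ($5.60).
```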


POD currently offers five different queues, each pointing to a cluster with different resources and capabilities; the less computing power, the lower the cost. Gregorio maintains, however, that all of them can handle serious HPC workloads.


Penguin does not disclose the size of POD's customer base. Sid Mair, SVP of Federal Systems, emphasizes that POD is the fastest-growing part of the company and that it is rare for POD customers to move back to on-premises equipment and services. One POD customer runs jobs spanning several thousand cores, he says, while a university has 300 students solving “tiny” problems every day. Weather forecasting, automotive, the traditional engineering disciplines... Gregorio considers a job running on 4,000 cores a serious problem.


“Almost every HPC application on the market runs on POD, and many of them are already staged in the environment we use. We often find that customers bring their corporate licenses straight to POD instead of worrying about managing them,” says Gregorio, adding that the application list on the website lags slightly behind reality, which includes even more applications and tools. Most of the familiar names are already there, though: ANSYS, Dassault Systèmes, and MathWorks.


Perhaps surprisingly, the GPU-node offering (NVIDIA K40) is modest compared with the K80- and P100-based instances the major cloud players are aggressively rolling out. Gregorio says Penguin monitors demand and can scale as needed.


“As soon as we see market demand for something like deep learning, we introduce it. We are currently working with a large number of customers who use deep learning in on-premises environments, and we are drawing on that experience to optimize the cloud environment for these workloads. We are not yet ready to disclose our plans publicly,” says Gregorio.



Source: https://habr.com/ru/post/328384/

