
FUJITSU against everyone, or the Japanese killer of RISC servers

Greetings, colleagues!

I was prompted to write this article by a conversation with a colleague (a good engineer and a specialist in his field), who remarked that the x86 server market has "run out of steam": all manufacturers copy each other, all servers are the same, and no one offers anything new. Meanwhile the RISC/UNIX server market, being much smaller, has also changed in ways the mass server buyer barely noticed:
• Oracle, which inherited the SPARC systems from Sun, is betting on the Exa family built on the x86 platform.
• Intel stopped actively developing the Itanium line long ago.
• HP took Oracle to court over support for its Itanium-based Superdome servers...
• IBM is proactive and consistently develops the IBM POWER processor family. Still, IBM does not do so as intensively as Intel, doubling its performance every three years. As a result, a 4-socket x86 machine with Intel Xeon E7 v2 processors is 25-30% ahead of a comparable Power 750 with four POWER7+ processors in Oracle 10g OLTP... So why is no one throwing IBM Power in the trash and running headlong to x86 servers?

The answer is simple: fault tolerance.
Traditionally, as a business grew, it sought to reduce unplanned downtime and to minimize data loss and recovery time. RISC/UNIX servers met this need, but moving away from the x86 platform required substantial investment. The servers themselves were more expensive, carried a mandatory annual payment for a service subscription (so-called maintenance), and had a closed proprietary architecture (each manufacturer had its own). On top of that came a different set of applications (for example, MS SQL does not run on RISC machines, so you have to migrate to DB2 or Oracle) and the need for specially trained people able to manage and maintain such servers... All this created difficulties and showed up in the final budget, and, realizing this, many customers stayed on x86.


Fujitsu took a different path, drawing on its many years of experience in designing fault-tolerant servers: the company released its first server in 1958 and delivered the AMDAHL 470V/6 mainframe to the US National Aeronautics and Space Administration (NASA) in 1978.



In 1995, the company began producing SPARC64 processors and systems built on them.
In 2002, the fastest mainframe, the GS21-600, was released...
Later came various models based on SPARC processors, and the PRIMEQUEST family began to emerge: machines that use the x86 architecture and meet all the requirements for fault-tolerant (mission-critical) servers.
The first server built on mainframe design patterns and using Intel Xeon E7 processors was called PRIMEQUEST 1000 and was released in 2012.
At the beginning of this year, alongside the announcement of the new Intel Xeon E7 v2 processors, PRIMEQUEST 2000 was released.



How does it differ from the usual 4- to 8-socket servers from other manufacturers?


The internal architecture is entirely different; in short, this is a mainframe with x86 processors. That is, we get x86 processor performance together with new features such as hardware partitions and the highest resiliency.
Traditional servers provide fault tolerance through the following measures:
• ECC in the memory subsystem
• Two hot-swappable power supplies and fans
• Active PCI slots
For comparison, the capabilities of PRIMEQUEST and the most common RISC/UNIX machines are summarized in a table. The difference from traditional x86 servers is that the system boards are hot-swappable, and this is implemented better than in modern RISC machines.



Hardware partitions let you divide the server into several parts, much as modern virtualization tools such as ESX or Hyper-V do. The main differences are that the hardware implementation gives complete electrical isolation between partitions, consumes no resources, and requires no licenses to create partitions. The software licensing scheme also changes.
The capabilities of hardware partitions are shown below:



This gives us fault tolerance implemented inside the server: a hardware problem inside one partition (failure of a system board, a processor, etc.) does not stop the server or the other partitions. Extended partitions let you split a given hardware partition into further sub-partitions and allocate resources with fine granularity.
Partitions can also be used with classic software hypervisors, such as ESX from VMware or Hyper-V from Microsoft.

If you run a large enterprise application that works only on the x86 platform, for example SAP HANA, you get the opportunity to increase its fault tolerance at the hardware level.



Currently, dynamic reconfiguration is supported only under Red Hat 7.

Support for hardware partitions can save a significant amount on software licenses. As an example, take Oracle Database, the most widely used software of this kind.

As an illustration, take the Oracle 10g OLTP test results, in tpm:
Power 750, POWER7 +, 4000 MHz, 2 MB SLC (8-core 80M TLC per DCM) - 3,995,403
PRIMEQUEST-2800E, Xeon E7-8890 v2, 2800 MHz, 3.75 MB SLC (15-core 37.5M TLC) - 9,400,000

Many will immediately notice that the PRIMEQUEST has 120 cores, while the Power 750 has only 32. This is true; scaling in proportion to the Power machine's result, we need about 48 cores to reach the same performance. So we compare a 48-core hardware partition with 32 cores from IBM. Now recall the licensing rules (remember that PPAR hardware partitions are recognized by Oracle): the core factor for Intel Xeon is 0.5, while for POWER7+ it is 1. Then simple math:



The total difference is about $400,000. In addition, every year you pay a support subscription of another 22% of the license cost, which is another $264,000 of savings over 3 years.
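The arithmetic behind the license difference can be sketched as follows. The core counts (48 and 32) and the core factors (0.5 and 1) come from the text above, and the price of $49,600 is the per-core figure the article uses in its later calculation. The function and variable names are purely illustrative.

```python
# Sketch of the license arithmetic; names are illustrative, figures are the article's.
PRICE_PER_LICENSE = 49_600  # USD, the per-core price quoted later in the article


def license_cost(cores: int, core_factor: float, price: int = PRICE_PER_LICENSE) -> float:
    """Cost = physical cores x core factor x price per license."""
    return cores * core_factor * price


power750 = license_cost(32, 1.0)    # 32 POWER7+ cores, core factor 1
primequest = license_cost(48, 0.5)  # 48-core Xeon hardware partition, core factor 0.5
difference = power750 - primequest

print(f"Power 750:  ${power750:,.0f}")
print(f"PRIMEQUEST: ${primequest:,.0f}")
print(f"Difference: ${difference:,.0f}")
```

Under these assumptions the sketch gives $1,587,200 against $1,190,400, a difference of $396,800, which matches the "about $400,000" above.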
Such partitions are supported by Oracle: whereas a regular x86 server must be licensed in full, with PRIMEQUEST only the partition where Oracle runs needs a license. Also, if an error occurs on a hardware partition, Oracle support treats it as an error on a physical machine. With software virtualization (ESX, Hyper-V, etc.), Oracle support will ask you to reproduce the problem on a physical machine.
Intel's lineup also includes models with fewer cores but a higher clock frequency, for example the E7-8893 v2.



The results are 25% higher in Oracle 10g OLTP, and expressed in the language of money:
PRIMEQUEST-2800E, 48 cores: factor 0.5 * 48 * 49,600 (price per core) = $1,190,400, of which 25% is about $297.5k

That is, the total savings can reach about $700k on the initial purchase and more than $562k in subscription over 3 years... A very convincing argument, I think.
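The initial-purchase figure can be checked with a short sketch. It assumes, as the article's calculation implies, that a 25% better result translates into 25% fewer licenses, and it reuses the article's own numbers (48 and 32 cores, core factors 0.5 and 1, $49,600 per core). The variable names are illustrative only.

```python
# Sketch of the total initial-purchase saving; figures are the article's, names are illustrative.
PRICE_PER_LICENSE = 49_600  # USD per core, as quoted in the article

power750_cost = 32 * 1.0 * PRICE_PER_LICENSE    # POWER7+, core factor 1
primequest_cost = 48 * 0.5 * PRICE_PER_LICENSE  # Xeon hardware partition, core factor 0.5

# Saving from the lower core factor on x86.
platform_saving = power750_cost - primequest_cost

# Assumption: a 25% better result means 25% fewer licenses are needed.
high_clock_saving = 0.25 * primequest_cost

total_saving = platform_saving + high_clock_saving

print(f"Platform saving:   ${platform_saving:,.0f}")
print(f"High-clock saving: ${high_clock_saving:,.0f}")
print(f"Total saving:      ${total_saving:,.0f}")
```

With these inputs the two components are $396,800 and $297,600, for a total of $694,400, in line with the figure of about $700k.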

Today I tried to highlight the main features of PRIMEQUEST servers, though the list could go on for a long time: the modular design alone (the server uses PCI switches, and the input/output subsystem is expanded with external PCI expansion boxes, up to 56 units) deserves close attention, although anyone closely familiar with RISC servers will recognize an identical approach.

To summarize, this server is currently unique on the market and is poised to significantly change views on fault tolerance, on the functionality of x86 platforms, and on total cost of ownership.

Source: https://habr.com/ru/post/238857/

