
Basics of HPC technology

Definition of high-loaded systems and methods for their construction

Server load is an important indicator of how server hardware is used. A hit is a client request to the server for information. Server load is defined as the ratio of the number of client requests (hits) to time, expressed in hits per second. According to Microsoft research from 2010, a server with a load of 100-150 hits per second can be considered high-loaded.
The literature uses concepts such as HPC system, high-load system, high-loaded cluster, highload system, and supercomputer, sometimes as synonyms. In this article, a high-loaded system is understood as a site with a load of at least 150 hits per second.
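As a back-of-the-envelope check of this definition, here is a tiny Python sketch (the numbers are invented for illustration):

    # Server load = hits / time. 9000 hits observed over 60 seconds
    # gives 150 hits per second, right at the high-load threshold above.
    hits, seconds = 9000, 60
    load = hits / seconds
    print(f"{load:.0f} hits/s, high-loaded: {load >= 150}")  # 150 hits/s, high-loaded: True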
A cluster is a group of computers that work together and form a single unified computing resource. Each node runs its own copy of the operating system; Linux and BSD are used most often.
To understand how the tasks performed by a cluster are distributed among its nodes, we first need to define scalability. Scalability is the ability of a system to cope with an increase in workload (to increase its productivity) when resources are added. A system is called scalable if it can increase performance in proportion to the additional resources. Scalability can be estimated as the ratio of the increase in system performance to the increase in resources used; the closer this ratio is to one, the better. Scalability also means the ability to add resources without structural changes to the central node of the system. Scaling the architecture of a high-load system can be horizontal or vertical. Vertical scaling increases system performance by increasing the capacity of a server. Its main disadvantage is that it is limited: hardware parameters cannot be increased infinitely. In practice, however, a vertical component is almost always present, and purely horizontal scaling as such does not exist. Horizontal scaling increases system performance by connecting additional servers, and it is now the de facto standard. The term diagonal scaling is also known; it means using both approaches simultaneously.
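Since scalability is defined above as a ratio, here is a minimal Python sketch of that estimate (the performance and resource figures are invented):

    # Scaling efficiency = (performance gain) / (resource gain).
    # Doubling the servers (4 -> 8) raised throughput from 1000 to 1800 req/s.
    perf_gain = 1800 / 1000        # 1.8x performance
    resource_gain = 8 / 4          # 2.0x resources
    efficiency = perf_gain / resource_gain
    print(f"scaling efficiency: {efficiency:.2f}")  # 0.90, close to the ideal 1.0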
Finally, we need to state the basic principle used in the construction of any cluster architecture: the three-tier structure of the system (Fig. 1). The three tiers are the frontend, the backend, and the data storage. Each tier performs its own functions, is responsible for different stages of request processing, and is scaled differently. A request first arrives at the frontend. Frontends are responsible, as a rule, for serving static files, performing initial processing of the request, and passing it on. The second tier, which receives the request already pre-processed by the frontend, is the backend. The backend performs the computations; as a rule, the business logic of the project is implemented on the backend side. The next tier involved in processing a request is the data storage, with which the backend works. It can be a database or a file system.

Fig. 1. Three-tier cluster architecture

Overview of software and hardware for building a cluster HPC system

When building a cluster, the problem arises of how to distribute the load between the servers. Load balancing is used for this purpose; besides the distribution itself, it performs a number of other tasks, for example improving fault tolerance (if one of the servers fails, the system keeps working) and protecting against some types of attacks (such as SYN flood).

Frontend balancing and protection
One method used to scale frontends is DNS Round Robin balancing. Its essence is that several DNS A records are created for the system's domain on the DNS server, and the DNS server issues these records in alternating cyclic order. In the simplest case, DNS Round Robin works by responding to requests not with a single IP address but with a list of addresses of several servers providing the same service. With each answer, the sequence of IP addresses changes. As a rule, simple clients try to connect to the first address in the list, so different clients are given the addresses of different servers, which distributes the total load between them. Any DNS server, for example BIND, is suitable for implementing the method. The disadvantage of this method is that some providers' DNS servers cache the records for a long time.

Figure: DNS Round Robin

The next method is balancing at the second layer of the protocol stack. Balancing is performed by a router so that the frontends accept connections arriving at the IP address of the system and respond to them, but do not respond to ARP requests for this address. The most common software tool for this method is LVS (Linux Virtual Server), a Linux kernel module; this balancing method is also called Direct Routing. The main terminology here is as follows: the Director is the node that actually performs the routing; a Realserver is a node of the server farm; the VIP (Virtual IP) is the IP address of our virtual server, assembled from a set of real ones; the DIP and RIPs are the IP addresses of the director and the real servers. On the director, the IPVS (IP Virtual Server) module is enabled, the packet-forwarding rules are configured, and the VIP is brought up, usually as an alias on the external interface. Clients connect through the VIP. Packets arriving at the VIP are forwarded by the selected method to one of the Realservers and are processed there as usual. To the client it looks as if it is working with a single machine.
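Both methods above hand out servers in turn. To make the cyclic order concrete, here is a minimal Python sketch of round-robin selection, the same rotation a DNS server applies to A records (the addresses are hypothetical):

    import itertools

    # Hypothetical frontend addresses published as A records for one domain.
    FRONTENDS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
    pool = itertools.cycle(FRONTENDS)

    def next_frontend():
        # Each call returns the next address in cyclic order,
        # spreading successive clients across the servers.
        return next(pool)

    for _ in range(6):
        print(next_frontend())  # .10, .11, .12, .10, .11, .12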
Another method is balancing at the third layer of the protocol stack, that is, at the IP level. It works as follows: when a connection to the system's IP address arrives, Destination NAT is performed on the balancer, that is, the destination IP address in the packets is replaced with the address of the chosen frontend. For responses, the packet headers are rewritten back. This is done using netfilter, which is part of the Linux kernel.

Figure: Destination NAT

Since it is the frontends that receive requests from users, the main task of protecting the cluster falls on the frontends (or on the frontend balancer, depending on the architecture). It is necessary to protect against all kinds of attacks (for example, SYN flood and DDoS). A firewall is mainly used for protection (it is also known by the German term Brandmauer; both words literally mean "fire wall"). A firewall blocks malicious traffic using packet-filtering rules, and it can also perform actions on traffic such as caching, address translation, and forwarding. GNU/Linux has a built-in firewall, netfilter, which is part of the Linux kernel.
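Returning to the Destination NAT balancing above: netfilter rewrites destination addresses inside the kernel, but as a rough userspace analogue of the idea, the Python sketch below accepts connections addressed to the balancer and relays them to a backend chosen per connection (the addresses are hypothetical; this only illustrates the principle, not how netfilter itself works):

    import socket
    import threading

    # Hypothetical real servers hiding behind the balancer's address.
    BACKENDS = [("10.0.0.2", 8080), ("10.0.0.3", 8080)]

    def pipe(src, dst):
        # Relay bytes one way until either side closes.
        try:
            while True:
                data = src.recv(4096)
                if not data:
                    break
                dst.sendall(data)
        except OSError:
            pass
        finally:
            dst.close()

    def serve(port=8000):
        srv = socket.socket()
        srv.bind(("", port))
        srv.listen()
        n = 0
        while True:
            client, _ = srv.accept()
            # "Rewrite the destination": wire this client to the next backend.
            backend = socket.create_connection(BACKENDS[n % len(BACKENDS)])
            n += 1
            threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
            threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

    # serve()  # run the balancer (blocks forever)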
Scaling backends
When building high-loaded websites, light and heavy HTTP requests are distinguished. Light requests are requests for static web pages and images. Heavy requests invoke a program that generates content dynamically. Dynamic web pages are generated by a program or script written in a high-level language, most often PHP, ASP.NET, Perl, or Java. The combination of these programs is called the business logic: the set of rules, principles, and dependencies of the behavior of domain objects, implementing the rules and restrictions of automated operations. The business logic resides on the backends. Two schemes are used: in the first, the frontend web server handles light requests itself and proxies heavy ones to the backends; in the second, the frontend acts purely as a proxy, sending light requests to one group of servers and heavy requests to another.
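As a toy illustration of the first scheme, here is a Python sketch that decides by file extension whether a request is light (served by the frontend itself) or heavy (proxied to a backend); the extension list is illustrative, not exhaustive:

    # Light requests: static content the frontend can serve itself.
    # Heavy requests: dynamic content generated by the business logic.
    DYNAMIC_EXT = {".php", ".cgi", ".jsp"}

    def route(path):
        ext = "." + path.rsplit(".", 1)[-1] if "." in path else ""
        return "backend" if ext in DYNAMIC_EXT else "frontend"

    assert route("/index.php") == "backend"   # heavy: generated dynamically
    assert route("/logo.png") == "frontend"   # light: static file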
Apache is often used as the web server on the backends. Apache is the most popular HTTP server; it has a built-in virtual-host mechanism and provides various multi-processing modules (MPMs) for different working environments. The prefork model, the most popular on Linux, creates a certain number of Apache processes at startup and manages them in a pool. An alternative model, worker, uses multiple threads instead of processes. Although threads are lighter than processes, they cannot be used until your entire server is thread-safe. The prefork model has its own problem: each process takes a lot of memory. Highly loaded sites process thousands of files simultaneously while being limited in memory and in the maximum number of threads or processes.

In 2003, the German developer Jan Kneschke became interested in this problem and decided that, by focusing on the right techniques, he could write a web server faster than Apache. He designed the Lighttpd server as a single process with one thread and non-blocking I/O. For scaling, Lighttpd and Apache are combined so that Lighttpd serves static content to the client, while requests ending in, for example, .cgi or .php are passed to Apache.

Nginx is another popular server for solving scaling problems. Nginx is an HTTP server and reverse proxy, as well as a mail proxy server. As a proxy server, Nginx is placed on the frontend; when there are several backends, Nginx works as a load balancer. This model saves system resources: Nginx accepts the request, forwards it to Apache, and quickly receives the response, after which Apache frees its memory while Nginx continues interacting with the client. Nginx responds to simple requests itself, since it was written to serve static content to a large number of clients with low consumption of system resources. Under Microsoft Server, IIS is used as the backend web server, and the business logic is written in ASP.NET.
Another backend scaling tool is a scalable application server, applicable when the business logic is written in server-side Java. Such applications are called servlets, and the server is called a servlet container or application server. There are many open-source servlet containers (Apache Tomcat, Jetty, JBoss Application Server, GlassFish) and proprietary ones (Oracle Application Server, Borland Application Server). Many application servers support clustering, provided the application is designed and developed according to well-defined tiers. In addition, to handle critical application failures, the Oracle Application Server supports "cluster islands": sets of servers at the J2EE level within which session state can be replicated much more easily, providing transparent redirection of a client's request to another component that can serve it if some J2EE component fails.

Scaling DBMS
Finally, in describing the software used to create clustered HPC systems, the means of scaling data storage must be mentioned. General-purpose databases are used as data stores for the web; the most common are MySQL and PostgreSQL.
The main scaling technique for a DBMS is sharding; strictly speaking, it would be more correct to call sharding not scaling but partitioning the data across machines. The essence of the method is that as the amount of data grows, new shard servers are added: a new shard is added when the existing ones fill up to a certain limit.
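A minimal Python sketch of key-based shard selection, assuming hypothetical shard addresses, shows the core of the idea:

    # Each record's key determines the shard that stores it.
    SHARDS = [
        "mysql://shard0.example/db",
        "mysql://shard1.example/db",
        "mysql://shard2.example/db",
    ]

    def shard_for(user_id):
        # A simple modulo mapping; real systems often use ranges or
        # a lookup table so that shards can be rebalanced.
        return SHARDS[user_id % len(SHARDS)]

    print(shard_for(42))  # mysql://shard0.example/db, since 42 % 3 == 0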
When scaling a DBMS, the replication technique comes to the rescue. Replication is a means of communication between database servers: using replication, data can be transferred from one server to another or duplicated on two servers. Replication is used in the "virtual shard" scaling technique: data is distributed so that each backend server works with its own virtual shard, while a relevance table stores the information about where each shard is physically located. Another replication-based scaling method relies on a common property of database workloads: rare update operations and frequent reads. Each backend server works with its own database server, called a slave, which handles read operations (SELECT queries). Writes to a table (INSERT and UPDATE queries) go to the master server, and from there the data is replicated to all the slaves.
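A sketch of such master/slave query routing in Python (the host names are hypothetical; a real setup would keep a connection pool per server):

    import random

    MASTER = "db-master.example"
    SLAVES = ["db-slave1.example", "db-slave2.example"]

    def pick_server(sql):
        # Reads are spread across the slaves; all writes go to the master,
        # which then replicates them to the slaves.
        if sql.lstrip().upper().startswith("SELECT"):
            return random.choice(SLAVES)
        return MASTER

    print(pick_server("SELECT * FROM users"))          # one of the slaves
    print(pick_server("UPDATE users SET name = 'x'"))  # db-master.example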
MySQL uses different storage engines, most commonly MyISAM and InnoDB. There is also the NDB storage engine, used in a dedicated MySQL scaling tool called MySQL Cluster. The cluster part of MySQL Cluster is currently configured independently of the MySQL servers. In MySQL Cluster, each part of the cluster is called a node, and nodes are actually processes; there may be any number of nodes on one computer. A minimal MySQL Cluster configuration has at least three nodes: the management node (MGM node), which manages the other nodes within MySQL Cluster, providing configuration data, starting and stopping nodes, performing backups, and so on; the database node (DB node), which directly manages and stores the data, with as many DB nodes as there are replicas times fragments (for example, two replicas of two fragments each require four DB nodes); and the client node (API node), the user node that accesses the cluster. In the case of MySQL Cluster, the user node is a traditional MySQL server that uses the NDB Cluster storage engine, giving access to the clustered tables.
Figure: MySQL Cluster

Distributed computing as an alternative solution

Sometimes, instead of building one's own high-loaded system on a cluster architecture, it is easier and more profitable for the client to use distributed-computing Internet services. Distributed computing is a way of solving time-consuming computational problems using several computers, most often combined into a parallel computing system. The history of distributed computing dates back to 1999, when Shawn Fanning, then a freshman at Northeastern University, wrote a system for exchanging MP3 files between users. The project was called Napster. Following Napster's example, a whole class of decentralized P2P (peer-to-peer) networks evolved. In P2P file sharing, the user downloads files not from a server but from the computers of other users of the file-sharing network, whose IP addresses are obtained from a specialized server called a tracker or hub. Files are downloaded from all peers (participants of the peer-to-peer network) simultaneously, with simultaneous uploading back, so the peer-to-peer network is a kind of distributed file storage.
Distributed-computing technology evolved, and P2P principles were used to create not only distributed file storages: distributed databases, streams, and processors appeared.
Grid computing (from grid, a lattice or network) is a form of distributed computing in which a "virtual supercomputer" is represented as network-connected clusters working together to perform a huge number of tasks. A grid is a system that coordinates distributed resources through standard, open, universal protocols and interfaces to provide a non-trivial quality of service. The main idea embodied in the concept of grid computing is the centralized remote provision of the resources needed to solve various kinds of computational problems: the user can start any computing task from any computer, and the resources for this computation should be provided automatically on remote high-performance servers, regardless of the type of task. The resource sharing that grid developers are interested in is not file sharing but direct access to computers, software, data, and other resources required for jointly solving tasks, along with strategies for managing those resources. The following levels of grid architecture are distinguished: the fabric level (contains the various resources, such as computers, storage devices, networks, sensors, and so on); the connectivity level (defines communication and authentication protocols); the resource level (implements the protocols for interacting with the resources of the distributed computing system and managing them); the collective level (resource catalog management, diagnostics, monitoring); and the application level (tools for working with grids and user applications).

Figure: Grid structure
The next step in the evolution of distributed computing was cloud computing: a technology of distributed data processing in which computing resources and capacity are provided to the user as an Internet service. The user gets access to resources on demand, paying only for what is actually consumed, without managing the underlying infrastructure. Three service models of cloud computing are distinguished:
SaaS (Software as a Service) delivers applications over the web, accessed through a web browser. SaaS applications are written in web technologies such as PHP, Java, or .NET and run entirely on the provider's side, so the user installs and maintains nothing. Gmail is a characteristic example: mail is stored and processed on Google's servers, and the user works with it through a browser, which makes Gmail a typical SaaS service.
PaaS (Platform as a Service) provides a platform for developing and deploying web applications. An example is Google App Engine, where applications are written in Python and use Google's APIs.
IaaS (Infrastructure as a Service) provides the computing infrastructure itself: virtual servers, storage, and networks. The user receives "bare" infrastructure and deploys the operating system and software on it, so IaaS requires the most administration of the three models.
As of 2012, the leading cloud platforms were Amazon Web Services, Windows Azure, and Google App Engine.
Amazon Web Services (AWS) is Amazon's cloud platform, providing infrastructure services. Amazon Elastic Compute Cloud (Amazon EC2) is the "computing" core of the Amazon cloud: through an API and web services, virtual servers are created and managed in Amazon's data centers. EC2 lets the user launch and stop instances on demand, paying only for the capacity actually used. Besides the raw API and web services, Amazon offers several tools for managing EC2: the Amazon web interface (Amazon Web Services Console), the ElasticFox extension for Firefox, and the Amazon Command Line tools.
Windows Azure is a cloud platform from Microsoft. Windows Azure applications run in Microsoft's data centers: the developer writes the code, while deployment, scaling, and fault tolerance are handled by Windows Azure itself. Windows Azure is complemented by Windows Azure AppFabric, a set of middleware services that connect applications with Windows Azure. Windows Azure follows the PaaS model, that is, it provides a ready platform for applications (together with the AppFabric services).
In conclusion, cluster HPC systems and cloud services can be compared:

Figure: Comparison of approaches to building an HPC system


Source: https://habr.com/ru/post/172567/

