Technopark Lectures. Term 3. Design of High-Load Systems

Our regular “Technopark Lectures” column is back. This time we invite you to look through the materials of the course “Design of High-Load Systems.” The goal of the course is to give students the skills to design high-performance software systems.

Lecture 1. Introduction to Highload

The lecture opens with a definition: what counts as a high-load system and in what units load is measured. It explains the distinguishing features of such systems and briefly mentions the Slashdot effect. Criteria for high site availability are given in terms of downtime per month and per year. The lecture then describes various web server architectures, the structure of a typical website, and the LAMP stack, followed by the methods for serving dynamic content: CGI, FastCGI, uWSGI, mod_perl, mod_php, custom-written modules, node.js, content_by_lua. Finally, it covers the concept of blocking operations and methods of non-blocking processing.
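To make the availability criteria concrete, here is a minimal sketch that converts an availability target into allowed downtime. The helper name and the chosen targets are illustrative assumptions and are not taken from the lecture; a month is approximated as one twelfth of a 365-day year.

```python
# Allowed downtime for a given availability target (illustrative helper).
HOURS_PER_YEAR = 365 * 24               # 8760, leap years ignored
HOURS_PER_MONTH = HOURS_PER_YEAR / 12   # 730, an average month


def allowed_downtime_hours(availability_percent: float, period_hours: float) -> float:
    """Return how many hours of downtime the target permits over the period."""
    return period_hours * (1 - availability_percent / 100)


# Example targets; these particular values are assumptions, not from the lecture.
for target in (99.0, 99.9, 99.99):
    minutes_per_month = allowed_downtime_hours(target, HOURS_PER_MONTH) * 60
    hours_per_year = allowed_downtime_hours(target, HOURS_PER_YEAR)
    print(f"{target}%: {minutes_per_month:.1f} min/month, {hours_per_year:.2f} h/year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why every step up in availability tends to be much harder to achieve than the previous one.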

Lecture 2. Network subsystem

The lecture begins by explaining which factors affect system throughput: network latency, the speed of light and the distance between data centers, the TCP handshake, packet loss and TCP retransmits. It explains how to identify throughput bottlenecks. Looking Glass is considered as one of the tools for diagnosing such problems. Next, the OSI model is mapped onto TCP/IP and routing nuances are discussed. The lecture then covers possible network problems and various ways to solve them (UDP, multicast, jumbo frames, socket per process, multi-threaded network cards). A substantial part of the lecture is devoted to tuning the TCP/IP stack for high loads.
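The link between distance, latency and throughput can be sketched with two back-of-the-envelope formulas. This is only an illustration: the function names are hypothetical, the propagation speed in fiber is taken as roughly 200,000 km/s (an approximation), and the throughput bound assumes a single TCP connection limited by its window, with no packet loss.

```python
# Rough propagation speed of light in optical fiber (about 2/3 of c in vacuum).
SPEED_IN_FIBER_KM_S = 200_000


def min_rtt_seconds(distance_km: float) -> float:
    """Lower bound on round-trip time imposed by propagation delay alone."""
    return 2 * distance_km / SPEED_IN_FIBER_KM_S


def max_tcp_throughput_bps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one connection: at most one window in flight per round trip."""
    return window_bytes * 8 / rtt_seconds


# Hypothetical example: two data centers 1000 km apart, 64 KiB window without scaling.
rtt = min_rtt_seconds(1000)
print(f"min RTT: {rtt * 1000:.1f} ms")
print(f"throughput bound: {max_tcp_throughput_bps(65535, rtt) / 1e6:.1f} Mbit/s")
```

Real round-trip times are higher because of routing, queuing and processing delays, so this bound is optimistic. It still shows why a larger window or a shorter path matters more than raw link capacity on long routes.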

Lecture 3. HTTP protocol and web optimization

The lecture lists a number of web optimization rules and discusses the browser features used to optimize page load time. It touches on gzip compression, reducing the number of requests, minimizing DNS lookups, and caching static script and CSS files. It explains how to analyze the information obtained with Conditional GET. After that, it covers ways to optimize redirects and the use of CSS sprites, followed by keep-alive, chunked transfer encoding, and the proper use of cookies. The advantages of opening several connections to a domain and of moving long-running requests out to AJAX or an iframe are explained.
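As an illustration of Conditional GET, the sketch below shows the server-side decision in its simplest form: if the validator sent by the client matches the current one, the server answers 304 Not Modified with an empty body. The function name and header handling are simplified assumptions; a real server also has to handle lists of ETags, weak validators and If-Modified-Since.

```python
def conditional_get(request_headers: dict, current_etag: str, body: bytes):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    if request_headers.get("If-None-Match") == current_etag:
        # The client's cached copy is still valid: send no body at all.
        return 304, {"ETag": current_etag}, b""
    # First request or stale cache: send the full representation.
    return 200, {"ETag": current_etag}, body


# Hypothetical resource and ETag value.
css = b"body { margin: 0 }"
print(conditional_get({}, '"v1"', css)[0])
print(conditional_get({"If-None-Match": '"v1"'}, '"v1"', css)[0])
```

The saving comes from the second case: the round trip still happens, but the response carries only headers.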

Lecture 4. Load scaling

The lecture first defines load scaling and its types (vertical and horizontal). It then goes into detail on load balancing algorithms (random, round-robin, weighted round-robin, least connections, least response time, load-based). Finally, balancing tools are considered: Round-Robin DNS, xixi DNS, L4 balancers (Cisco CSS, LVS), L7 balancers (Cisco ACE, LVS, nginx).
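Three of the listed algorithms can be sketched in a few lines. This is a simplified illustration, not how any of the named balancers is implemented; the server names and function names are hypothetical.

```python
import itertools


def round_robin(servers):
    """Yield servers in a fixed rotating order."""
    return itertools.cycle(servers)


def weighted_round_robin(weights):
    """Naive variant: repeat each server as many times as its integer weight."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return itertools.cycle(expanded)


def least_connections(active_connections):
    """Pick the server that currently has the fewest open connections."""
    return min(active_connections, key=active_connections.get)


rr = round_robin(["app1", "app2"])
print([next(rr) for _ in range(4)])

wrr = weighted_round_robin({"app1": 2, "app2": 1})
print([next(wrr) for _ in range(6)])

print(least_connections({"app1": 12, "app2": 3, "app3": 7}))
```

The naive weighted variant sends consecutive requests to the same server in bursts; production balancers usually interleave the choices more evenly, but the long-run proportions are the same.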

Lecture 5. RAM

The lecture examines the hardware configuration of a typical server and explains how physical memory is organized and why overall system performance can degrade. It describes how caching is organized at the hardware level, as well as practical ways to speed up memory access on a server (sequential reads with a margin, reads without jumps between rows, prefetching).

Lecture 6. Databases and disk subsystem

The lecture first covers the evolution of hard drives and their current performance under linear, random and concurrent access. The features, advantages and disadvantages of different types of disk arrays, including software ones, are compared. The Ext4 and XFS file systems are then considered, and LVM (Logical Volume Manager) is mentioned as the third level of hard drive virtualization. The second part of the lecture is devoted to databases. It starts with a detailed look at the advantages and disadvantages of MySQL and PostgreSQL, then analyzes the cost structure of executing a query and how the query itself is planned. It also describes how to speed up database-backed systems: tuning, replication, sharding, minimizing network latency, NoSQL, and writing a specialized database.
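Of the techniques listed, sharding is the easiest to show in code. The sketch below routes a key to a shard by hashing it; the function name, the key format and the choice of MD5 are assumptions made for illustration. A stable hash is used because Python's built-in `hash()` for strings varies between processes.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical user keys spread over four shards.
for user_key in ("user:1", "user:2", "user:3"):
    print(user_key, "->", shard_for(user_key, 4))
```

The drawback of plain modulo sharding is that changing the number of shards remaps most keys; schemes such as consistent hashing or a lookup table of key ranges are common ways to limit that data movement.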

Lecture 7. Typical architectural solutions

The lecture begins by explaining the difference between frontend and backend servers and considers the creation of specialized server groups by load type (by function, by importance, by stability, by shard). It lists the criteria for the complexity and reliability of various architectural solutions and gives advice on the choice of components, technologies and programming languages. It then describes ways to optimize a system (replacing hardware, using a different algorithm, rewriting code, parallelizing tasks across different servers, etc.). After that, it discusses ways of handling errors during query execution and methods of caching data to reduce load during peak hours. Finally, it covers writing and processing logs and monitoring the load and operation of both the whole system and its individual components.
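Caching to reduce peak load can be illustrated with a small time-to-live cache. This is a minimal single-process sketch under assumed names (`TTLCache`, `get_or_compute`), not a method prescribed by the lecture; it has no size limit and no locking.

```python
import time


class TTLCache:
    """Keep computed values for a fixed number of seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested without sleeping
        self._entries = {}       # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]      # still fresh: the backend is not touched
        value = compute()        # missing or expired: do the expensive work once
        self._entries[key] = (now + self._ttl, value)
        return value


cache = TTLCache(ttl_seconds=30)
# Hypothetical expensive query replaced by a lambda.
print(cache.get_or_compute("top_posts", lambda: ["post-1", "post-2"]))
```

With a TTL of 30 seconds, the backend sees at most one computation per key every 30 seconds from this process, no matter how many requests arrive, at the cost of serving data that may be up to 30 seconds old.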

Subscribe to Technopark's YouTube channel!

Source: https://habr.com/ru/post/254843/