Greetings, fellow Habr readers. Recently the name Teradata has started to come up in various discussions here on Habré. Seeing the potential interest, we decided to tell you a little about what the Teradata DBMS is, first-hand. We plan to prepare a small series of articles on what we consider the most interesting technical features of the DBMS and of working with it. If you have experience with Teradata, or your company uses our platform and you have questions, post them and we will either answer in the comments or prepare a proper full-length article. Let's start with a short overview. By way of introduction, so to speak.
Parallelism
The Teradata DBMS was created in 1976, and many of the concepts laid into its foundation are still relevant today. Teradata implemented a design that differed from the mainframe architecture dominant at the time: a network of computer modules working in parallel. A pleasant side effect of this architecture was a significant reduction in system cost.
The idea of parallel processing gave Teradata the ability to scale performance, the volume of processed data, and the number of users linearly. So what is "parallel processing"?
At ten o'clock on a Saturday evening, two friends had finished dinner and were drinking beer when one of them said he had some business to attend to and had to leave. When his friend asked what it was, he replied: "Laundry." The friend was indignant: "How can you... leave in the evening... to do laundry?" The explanation followed: "If I go do my laundry tomorrow, I'll be lucky if even one of the ten machines at the laundromat is free. I have ten loads of laundry, and it would take all day. If I go now, while everyone is relaxing, I can use all the washing machines at once and finish in an hour and a half."
This story describes what is called parallel processing. Teradata loads, archives, and processes data in parallel. Depending on the configuration, Teradata can perform hundreds or even thousands of operations simultaneously.
Logical representation of the Teradata architecture
A choir director was rehearsing for a concert. Suddenly he stopped and addressed the choir: "Have I told you that a few years ago I led another choir? We were working on this very piece, and they made exactly the same mistake you are making now. Do you know what that mistake is?" A voice from the choir answered: "The same director!"
Many data-warehouse projects are implemented on architectures that were never designed for the task. Companies are often surprised when such projects fail, although they never had a chance of success in the first place. Company management needs the kind of understanding that will let them choose decision-support technologies that match the scale of the business.
The diagram below shows the logical architecture of Teradata.
The main purposes of the components shown in the diagram:
- PE (Parsing Engine) — responsible for session control and handling user requests;
- AMP (Access Module Processor) — responsible for retrieving data from its associated disk;
- BYNET — the messaging environment between system components.
Teradata belongs to the class of MPP (Massively Parallel Processing) systems and has a shared-nothing architecture, in which individual nodes of the system share no resources. The DBMS itself is relational.
At the top level, the process for executing a query in Teradata is as follows:
- A user connecting to the Teradata system establishes a session with a Parsing Engine (PE). Once the connection is established, the user can execute SQL queries.
- The PE checks the syntax of the received SQL query and the user's access rights to the data, and builds a query execution plan to be carried out by the Access Module Processors (AMPs).
- The PE sends the plan's steps to the AMPs over BYNET.
- The AMPs retrieve the necessary data from their associated disks and return it to the PE via BYNET. The AMPs do their work in parallel.
- The PE returns the result to the user.
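The steps above can be sketched in a few lines of code. This is a toy illustration, not Teradata code: a "PE" broadcasts a plan step to every "AMP" in parallel, each AMP scans only its own slice of the rows, and the PE merges the partial answers. All names and the data layout here are our own, purely for illustration.

```python
# Toy sketch of the PE -> BYNET -> AMPs -> PE request flow (illustrative only).
from concurrent.futures import ThreadPoolExecutor

N_AMPS = 4

# Each AMP "owns" its own slice of the rows (its associated disk).
amp_storage = {amp: [] for amp in range(N_AMPS)}
for row_id in range(20):
    amp_storage[row_id % N_AMPS].append(row_id)  # evenly distributed

def amp_execute(amp, predicate):
    """One AMP applies a plan step to its own rows only."""
    return [row for row in amp_storage[amp] if predicate(row)]

def pe_execute(predicate):
    """The PE sends the step to all AMPs in parallel and merges the answers."""
    with ThreadPoolExecutor(max_workers=N_AMPS) as bynet:
        partials = list(bynet.map(lambda a: amp_execute(a, predicate),
                                  range(N_AMPS)))
    return sorted(row for part in partials for row in part)

print(pe_execute(lambda row: row % 5 == 0))  # → [0, 5, 10, 15]
```

The key point the sketch captures is that no AMP ever looks at another AMP's rows; parallelism comes from the even distribution of the data.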
PEs and AMPs are collectively referred to as VPROCs, or virtual processors.
For queries to be processed as efficiently as possible, the data must be distributed as evenly as possible across the AMPs. There are exceptions, but we will cover them in one of the following articles. Data is distributed using a proprietary hashing algorithm applied to the columns that make up the table's Primary Index. Moreover, the hashing algorithm is implemented in such a way that the hash value determines the number of the AMP on which the row resides. Besides its main purpose, this also speeds up searches by Primary Index value.
Technical details on the composition of the hash: in modern versions of the Teradata platform, the result of hashing is a 32-bit value. It identifies both the AMP (the system's unit of parallelism) and the row itself. Suppose we need to process a record with the value 7202. The scheme for computing its hash and determining which AMP the row is placed on is as follows.
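To make the idea concrete, here is a minimal sketch of hash-based row placement. The real Teradata hash function and its bucket-to-AMP hash maps are internal to the product, so everything below — CRC32 as the hash, the modulo as the bucket map, the function names — is our own stand-in, chosen only to show the mechanism: the same Primary Index value always hashes to the same AMP.

```python
# Illustrative sketch of hash-based row distribution (NOT the real
# Teradata hash function; CRC32 and the modulo map are stand-ins).
import zlib

N_AMPS = 10

def row_hash(pi_value) -> int:
    """A 32-bit hash of the primary-index value, as described above."""
    return zlib.crc32(str(pi_value).encode()) & 0xFFFFFFFF

def target_amp(pi_value) -> int:
    # In the real system, part of the hash selects a "hash bucket" that
    # is looked up in a bucket-to-AMP map; a simple modulo stands in
    # for that map here.
    return row_hash(pi_value) % N_AMPS

# The record 7202 always lands on the same AMP, which is why a search
# by Primary Index value can be answered by a single AMP.
print(target_amp(7202) == target_amp(7202))  # → True
```

Because placement is a pure function of the Primary Index value, the PE can route a single-row lookup straight to the owning AMP instead of asking all of them.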
Thus, the AMPs are the workhorses, each responsible for its own piece of the data. But there are often tasks where data must be redistributed to satisfy a query. An example is a join of two tables that are distributed on different keys. That is where BYNET comes into play. BYNET is responsible for communication between system components, including data transfer between AMPs, and it can move arrays of data between nodes at remarkable speed. It is not hard to guess what load falls on BYNET, so the technology's founding fathers built considerable headroom into it and made it their know-how. On the Data Warehouse Appliance platforms, this component can be implemented in software over Ethernet. On the more serious Active Enterprise Data Warehouse platforms, it is a separate hardware module, since Ethernet is no longer effective at such volumes.
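The join example above can be sketched as follows. This is a toy model under our own assumptions (table shapes, the stand-in hash, the names): one table's rows are re-hashed on the join column and "shipped" to the AMP that owns that value, after which each AMP can join purely locally.

```python
# Toy sketch of redistribution before a join (illustrative only).
from collections import defaultdict

N_AMPS = 4

def amp_of(key: int) -> int:
    # Stand-in for the hash-based placement described earlier.
    return key % N_AMPS

# "orders" is distributed by order_id, "customers" by customer_id.
orders = [(oid, oid * 7 % 5) for oid in range(12)]          # (order_id, customer_id)
customers = [(cid, f"customer-{cid}") for cid in range(5)]  # (customer_id, name)

orders_on_amp = defaultdict(list)
for oid, cid in orders:
    orders_on_amp[amp_of(oid)].append((oid, cid))

# To join on customer_id, each AMP re-hashes its order rows on the join
# column and sends them (over the interconnect) to the owning AMP.
redistributed = defaultdict(list)
for amp, rows in orders_on_amp.items():
    for oid, cid in rows:
        redistributed[amp_of(cid)].append((oid, cid))

# Matching rows are now co-located, so each AMP joins independently.
names = dict(customers)
joined = sorted((oid, cid, names[cid])
                for rows in redistributed.values()
                for oid, cid in rows)
print(len(joined))  # → 12: every order found its customer locally
```

The volume of rows crossing the interconnect in this step is exactly why the article stresses BYNET's throughput.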
Facing the users is the PE, the Parsing Engine. This is the component (or rather, a set of components) whose functions include the initial processing of incoming requests, execution control, and sending responses. It consists of three modules: the parser, the optimizer, and the dispatcher.
As you might guess, the parser is responsible for parsing the query, checking access rights, and other semantic control. The optimizer evaluates the possible ways of executing a query and selects the most efficient one. Rumor has it that the optimizer code runs to more than a million lines, and that somewhere in the Rancho Bernardo area there is a group of people who do nothing but improve the optimizer's logic from version to version. And yes, it must be mentioned that Teradata has no hints. So if a developer believes his plan is better than the one the optimizer has chosen, he will have to prove it, rather than simply hold a gun to the optimizer's head with a "+ use_hash". And as practice shows, the optimizer is right.
And the dispatcher is responsible for breaking the chosen plan into steps, controlling the sequential or parallel execution of those steps, and issuing instructions to the AMPs via BYNET.
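The dispatcher's job can be illustrated with a small scheduling sketch. The plan below is entirely hypothetical (the step names and dependencies are invented for the example); the point is only the mechanism: steps whose prerequisites are complete run in parallel, and dependent steps wait.

```python
# Toy illustration of a dispatcher: run plan steps in dependency order,
# executing independent steps in parallel (plan contents are invented).
from concurrent.futures import ThreadPoolExecutor

plan = {
    "scan_orders": [],                          # no prerequisites
    "scan_customers": [],                       # no prerequisites
    "join": ["scan_orders", "scan_customers"],  # needs both scans
    "sort": ["join"],                           # needs the join
}

def run_plan(plan):
    done, waves = set(), []
    while len(done) < len(plan):
        ready = [s for s, deps in plan.items()
                 if s not in done and all(d in done for d in deps)]
        with ThreadPoolExecutor() as pool:
            # Real steps would be dispatched to the AMPs here.
            list(pool.map(lambda step: None, ready))
        done.update(ready)
        waves.append(sorted(ready))
    return waves

print(run_plan(plan))
# → [['scan_customers', 'scan_orders'], ['join'], ['sort']]
```

The two scans form one parallel wave, while the join and the sort each wait for their inputs, mirroring the sequence/parallelism control described above.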
That, in brief, is how the system works at the logical level. In the following posts we plan to talk about working with statistics in the DBMS, perhaps touch on physical modeling, and cover the tools for automatic load balancing. What topics would you find interesting?