Relational algebra is based on set theory and is the basis of the logic of database operation.
When I first studied the database and SQL device, a preliminary acquaintance with relational algebra greatly helped further knowledge to properly fit in my head, and I will try to make this article have a similar effect.
So if you are going to start your training in this area or you just became interested, I ask under the cat.
Relational database
')
To begin with, we introduce the concept of a relational database in which we will perform all the actions.
A relational database is a set of relations containing all the information that should be stored in the database. In this definition, we are interested in the term relation, but for the time being we will leave it without a strict definition.
Better imagine a table of products.
PRODUCTS table
ID | NAME | COMPANY | PRICE |
123 | Cookies | LLC āThe Dark Sideā | 190 |
156 | Tea | LLC āThe Dark Sideā | 60 |
235 | Pineapples | JSC āFruitsā | 100 |
623 | Tomatoes | LLC āVegetablesā | 130 |
A table consists of 4 rows, a row in a table is a tuple in relational theory. The set of ordered tuples is called a relation.
Before we define the relationship, we introduce another term - the domain. Domains in relation to a table are columns.
For clarity, we now introduce a strict definition of a relationship.
Let N be the sets D1, D2, .... Dn (domains), the relation R over these sets is the set of ordered N-tuples of the form <d1, d1, ... dn>, where d1 belongs to D1 and so on. The sets D1, D2, .. Dn are called the domains of the relation R.
Each element of a tuple is the value of one of the attributes corresponding to one of the domains.
Keys in a relationship
In relation to the requirement is that all tuples must differ. To uniquely identify a tuple, there is a primary key. The primary key is an attribute or a set of the minimum number of attributes that uniquely identifies a particular tuple and does not contain additional attributes.
It is understood that all attributes in the primary key must be necessary and sufficient to identify a specific tuple, and the exclusion of any of the attributes in the key will make it insufficient for identification.
For example, in such a table, the key will be a combination of attributes from the first and second columns.
DRIVERS table
COMPANY | DRIVER |
LLC āThe Dark Sideā | Vladimir |
LLC āThe Dark Sideā | Michael |
JSC āFruitsā | Ruslan |
LLC āVegetablesā | Vladimir |
It can be seen that there can be several drivers in an organization, and in order to unambiguously identify a driver, it is necessary to have a value from the column āName of organizationā and from āDriverās nameā. Such a key is called composite.
In a relational database, tables are interrelated and correlated with each other as main and subordinate. The connection of the main and subordinate tables is carried out through the primary key (primary key) of the main table and the foreign key (foreign key) of the subordinate table.
A foreign key is an attribute or set of attributes that is the primary key in the main table.
This preparatory theory will be enough to become familiar with the basic operations of relational algebra.
Relational algebra operations
The basic eight operations of relational algebra were proposed by
E. Codd .
- Union
- Intersection
- Subtraction
- Cartesian product
- Sample
- Projection
- Compound
- Division
The first half of operations is similar to the same operations on sets. Some operations can be expressed through other operations. Consider most of the operations with examples.
For understanding, it is important to remember that the result of any operation of algebra on relations is another relation, which can then also be used in other operations.
Create another table, which is useful to us in the examples.
table SELLERS
ID | SELLER |
123 | OOO āDartā |
156 | OJSC āVedroā |
235 | CJSC āVegetable Baseā |
623 | OJSC āFirmā |
We agree that in this table ID is the foreign key associated with the primary key of the PRODUCTS table.
To begin, consider the simplest operation - the name of the relationship. Its result will be the same relationship, that is, performing the PRODUCTS operation, we will receive a copy of the PRODUCTS relationship.
Projection
A projection is an operation in which attributes are selected from a relationship only from specified domains, that is, only necessary columns are selected from a table, and if there are several identical tuples, then only one instance of such a tuple remains in the resulting ratio.
For example, let's make a projection on the PRODUCTS table by selecting from it ID and PRICE.
Operation syntax:
Ļ (ID, PRICE) PRODUCTS
As a result of this operation, we obtain the relation:
ID | PRICE |
123 | 190 |
156 | 60 |
235 | 100 |
623 | 130 |
Sample
A sample is an operation that selects a set of rows in a table that satisfy specified conditions. The condition can be any logical expression.
For example, let's make a selection from a table with a price greater than 90.
Operation syntax:
Ļ (PRICE> 90) PRODUCTS
ID | NAME | COMPANY | PRICE |
123 | Cookies | LLC āThe Dark Sideā | 190 |
235 | Pineapples | JSC āFruitsā | 100 |
623 | Tomatoes | LLC āVegetablesā | 130 |
In the sample condition, we can use any logical expression. Let's make another sample with a price of more than 90 and product ID less than 300:
Ļ (PRICE> 90 ^ ID <300) PRODUCTS
ID | NAME | COMPANY | PRICE |
123 | Cookies | LLC āThe Dark Sideā | 190 |
235 | Pineapples | JSC āFruitsā | 100 |
Let's combine the projection and sampling operators. We can do this because any of the operators returns a relation as a result and also uses the relation as arguments.
From the table with products we will select all companies selling products that are cheaper than 110.
Ļ COMPANY Ļ (PRICE <100) PRODUCTS
COMPANY |
LLC āThe Dark Sideā |
JSC āFruitsā |
Multiplication
Multiplication or Cartesian product is an operation performed on two relations, as a result of which we obtain a relationship with all domains from two initial relations. Tuples in these domains will be all possible combinations of tuples from the initial relationship. The example will be clearer.
We obtain the Cartesian product of the PRODUCTS and SELLERS tables.
Operation syntax:
PRODUCTS Ć SELLERS
You may notice that these two tables have the same domain ID. In this situation, domains with the same name get a prefix in the form of the name of the corresponding relationship, as shown below.
For brevity, we multiply not complete relations, but samples with the condition ID <235
(the same tuples are highlighted)
PRODUCTS.ID | NAME | COMPANY | PRICE | SELLERS.ID | SELLER |
123 | Cookies | LLC āThe Dark Sideā | 190 | 123 | OOO āDartā |
156 | Tea | LLC āThe Dark Sideā | 60 | 156 | OJSC āVedroā |
123 | Cookies | LLC āThe Dark Sideā | 190 | 156 | OJSC āVedroā |
156 | Tea | LLC āThe Dark Sideā | 60 | 123 | OOO āDartā |
For an example of using this operation, let us imagine the need to choose sellers with prices less than 90. Without a product, you would first need to get product IDs from the first table, then use the second table to get the necessary SELLER names, and using the product you will receive such a query:
Ļ (SELLER) Ļ (RODUCTS.ID = SELLERS.ID ^ PRICE <90) PRODUCTS Ć SELLERS
As a result of this operation, we obtain the relation:
Compound and natural compound
The join operation is the reverse of the projection operation and creates a new relation from the two already existing ones. A new relation is obtained by concatenating the tuples of the first and second relations, and the concatenations are subjected to relations in which the values āāof the given attributes match. In particular, if you combine the PRODUCTS and SELLERS relationships, these attributes will be the attributes of the ID domains.
Also, for clarity, you can imagine the connection as the result of two operations. First, the product of the original tables is taken, and then from the obtained relation, we make a selection with the condition of equality of attributes from the same domains. In this case, the condition is equality of PRODUCTS.ID and SELLERS.ID.
Let's try to connect the relationship PRODUCTS and SELLERS and get the relationship.
PRODUCTS.ID | NAME | COMPANY | PRICE | SELLERS.ID | SELLER |
123 | Cookies | LLC āThe Dark Sideā | 190 | 123 | OOO āDartā |
156 | Tea | LLC āThe Dark Sideā | 60 | 156 | OJSC āVedroā |
235 | Pineapples | JSC āFruitsā | 100 | 235 | CJSC āVegetable Baseā |
623 | Tomatoes | LLC āVegetablesā | 130 | 623 | OJSC āFirmā |
A natural connection gets a similar relationship, but if we have a correctly configured scheme in the database (in this case, the primary key of the PRODUCTS ID table is associated with the foreign key of the SELLERS ID table), then in the resulting relationship there is only one domain ID.
Operation syntax:
PRODUCTS ā SELLERS;
It turns out this attitude:
PRODUCTS.ID | NAME | COMPANY | PRICE | SELLER |
123 | Cookies | LLC āThe Dark Sideā | 190 | OOO āDartā |
156 | Tea | LLC āThe Dark Sideā | 60 | OJSC āVedroā |
235 | Pineapples | JSC āFruitsā | 100 | CJSC āVegetable Baseā |
623 | Tomatoes | LLC āVegetablesā | 130 | OJSC āFirmā |
Intersection and subtraction.
The result of the intersection will be a relation consisting of tuples that are fully included in both relations.
The result of the subtraction will be a relation consisting of tuples that are tuples of the first relation and are not tuples of the second relation.
These operations are similar to the same operations on sets, so I think there is no need to describe them in detail.
Information sources
- Basics of using and designing databases - V. M. Ilyushechkin
- lecture course Introduction to Databases - Jennifer Widom, Stanford University
I will be grateful for the reasoned comments