📜 ⬆️ ⬇️

SQL Language Tutorial (DDL, DML) on the example of MS SQL Server dialect. Part four

Previous parts




In this part we will look at


Multi-query queries:

Working with subqueries:


Add some new data


For demonstration purposes, add a few departments and positions:

SET IDENTITY_INSERT Departments ON INSERT Departments(ID,Name) VALUES(4,N'  ') INSERT Departments(ID,Name) VALUES(5,N'') SET IDENTITY_INSERT Departments OFF 

')
 SET IDENTITY_INSERT Positions ON INSERT Positions(ID,Name) VALUES(5,N'') INSERT Positions(ID,Name) VALUES(6,N'') INSERT Positions(ID,Name) VALUES(7,N'') SET IDENTITY_INSERT Positions OFF 


JOIN joins - horizontal data join operations


Here we will greatly benefit from knowledge of the database structure, i.e. what tables are there in it, what data is stored in these tables and which fields of the table are related to each other. First of all, always thoroughly study the database structure, since A normal query can be written only when you know where it is coming from. Our structure consists of 3 tables Employees, Departments and Positions. I will give here a diagram from the first part:



If the essence of the RDB is divide and conquer, then the essence of the operations of unions is again to glue together the data, divided into tables. bring them back to the human form.

Simply put, the operations of horizontal joining the table with other tables are used to get the missing data from them. Recall the example of the weekly report for the director when, when querying from the Employees table, we lacked the “Department Name” field in the Departments table to get the final result.

Let's start with the theory. There are five types of connections:
  1. JOIN - left_table JOIN right_table ON conditions_connection
  2. LEFT JOIN - left_table LEFT JOIN right_table ON conditions_connection
  3. RIGHT JOIN - left_table RIGHT JOIN right_table ON conditions_connection
  4. FULL JOIN - left_table FULL JOIN right_table ON conditions_connection
  5. CROSS JOIN - left_table CROSS JOIN right_table_

Short syntaxFull syntaxDescription (It’s not always clear to everyone. So, if it’s not clear, just return here after considering the examples.)
JOININNER JOINFrom the rows of the left table and the right table, only those rows are returned, according to which join conditions are fulfilled.
LEFT JOINLEFT OUTER JOINAll rows of the left_table are returned (keyword LEFT). Only those rows of the left_table that satisfy the conditions of joining are added to the data of the right_table. For missing data, NULL values ​​are inserted instead of rows in the right_table.
RIGHT JOINRIGHT OUTER JOINReturns all rows in right_table (keyword RIGHT). Only those rows of the right_table that satisfy the conditions of joining are added to the data of the left_table. For missing data, NULL values ​​are inserted instead of the left-table rows.
FULL JOINFULL OUTER JOINAll rows from left_table and right_table are returned. If join_line conditions are met for the left_table and right_table, then they are combined into a single line. For rows that do not have join_conditions, NULL values ​​are inserted in place of the left table or in the place of right table, depending on the data of which table in the row is not available.
CROSS JOIN-Combining each row of the left_table with all rows of the right_table. This type of compound is sometimes called Cartesian product.

As the table shows, the full syntax from the short one differs only in the presence of the words INNER or OUTER.

Personally, I always use short syntax when writing queries, for the reason:
  1. This is shorter and does not litter the query with unnecessary words;
  2. According to LEFT, RIGHT, FULL and CROSS, and so it’s clear what kind of mix it is, just as in the case of JOIN;
  3. I consider the words INNER and OUTER in this case to be unnecessary rudiments, which more confuse beginners.

But of course, this is my personal preference, maybe someone likes to write long, and he sees this as his charms.

Understanding each type of compound is very important, because from the use of one or another type, the result of the query may differ. Compare the results of the same query using a different type of connection, just try to see the difference for now and move on (we'll come back here):

 -- JOIN  5  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp JOIN Departments dep ON emp.DepartmentID=dep.ID 

IDNameDepartmentIDIDName
1000Ivanov I.I.oneoneAdministration
1001Petrov P.P.33IT
1002Sidorov S.S.22Accounting
1003Andreev A.A.33IT
1004Nikolaev N.N.33IT

 -- LEFT JOIN  6  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID 

IDNameDepartmentIDIDName
1000Ivanov I.I.oneoneAdministration
1001Petrov P.P.33IT
1002Sidorov S.S.22Accounting
1003Andreev A.A.33IT
1004Nikolaev N.N.33IT
1005Aleksandrov A.A.NullNullNull

 -- RIGHT JOIN  7  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp RIGHT JOIN Departments dep ON emp.DepartmentID=dep.ID 

IDNameDepartmentIDIDName
1000Ivanov I.I.oneoneAdministration
1002Sidorov S.S.22Accounting
1001Petrov P.P.33IT
1003Andreev A.A.33IT
1004Nikolaev N.N.33IT
NullNullNullfourMarketing and advertising
NullNullNullfiveLogistics

 -- FULL JOIN  8  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp FULL JOIN Departments dep ON emp.DepartmentID=dep.ID 

IDNameDepartmentIDIDName
1000Ivanov I.I.oneoneAdministration
1001Petrov P.P.33IT
1002Sidorov S.S.22Accounting
1003Andreev A.A.33IT
1004Nikolaev N.N.33IT
1005Aleksandrov A.A.NullNullNull
NullNullNullfourMarketing and advertising
NullNullNullfiveLogistics

 -- CROSS JOIN  30  - (6   Employees) * (5   Departments) SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp CROSS JOIN Departments dep 

IDNameDepartmentIDIDName
1000Ivanov I.I.oneoneAdministration
1001Petrov P.P.3oneAdministration
1002Sidorov S.S.2oneAdministration
1003Andreev A.A.3oneAdministration
1004Nikolaev N.N.3oneAdministration
1005Aleksandrov A.A.NulloneAdministration
1000Ivanov I.I.one2Accounting
1001Petrov P.P.32Accounting
1002Sidorov S.S.22Accounting
1003Andreev A.A.32Accounting
1004Nikolaev N.N.32Accounting
1005Aleksandrov A.A.Null2Accounting
1000Ivanov I.I.one3IT
1001Petrov P.P.33IT
1002Sidorov S.S.23IT
1003Andreev A.A.33IT
1004Nikolaev N.N.33IT
1005Aleksandrov A.A.Null3IT
1000Ivanov I.I.onefourMarketing and advertising
1001Petrov P.P.3fourMarketing and advertising
1002Sidorov S.S.2fourMarketing and advertising
1003Andreev A.A.3fourMarketing and advertising
1004Nikolaev N.N.3fourMarketing and advertising
1005Aleksandrov A.A.NullfourMarketing and advertising
1000Ivanov I.I.onefiveLogistics
1001Petrov P.P.3fiveLogistics
1002Sidorov S.S.2fiveLogistics
1003Andreev A.A.3fiveLogistics
1004Nikolaev N.N.3fiveLogistics
1005Aleksandrov A.A.NullfiveLogistics


It’s time to think about table aliases


The time has come to recall the pseudonyms of the tables, which I described at the beginning of the second part.

In multi-query queries, the pseudonym helps us explicitly specify from which table the field is taken. Let's look at an example:

 SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp JOIN Departments dep ON emp.DepartmentID=dep.ID 


In it, the fields with the names ID and Name are in both tables in both Employees and Departments. And in order to distinguish them, we preface the field name with an alias and a period, i.e. "Emp.ID", "emp.Name", "dep.ID", "dep.Name".

We recall why it is more convenient to use short pseudonyms - because, without pseudonyms, our query would look like this:

 SELECT Employees.ID,Employees.Name,Employees.DepartmentID,Departments.ID,Departments.Name FROM Employees JOIN Departments ON Employees.DepartmentID=Departments.ID 


For me, it became very long and worse to read, because field names are visually lost among duplicate table names.

In multi-query queries, although you can specify a name without a pseudonym, if the name is not duplicated in the second table, I would recommend always using pseudonyms when connecting, because nobody guarantees that a field with the same name will not be added to the second table over time, and then your query will simply break, swearing that it cannot understand to which table this field belongs.

Only by using pseudonyms can we make the connection of the table with ourselves. Suppose there is a task to get for each employee, the data of the employee who was accepted right before him (the personnel number differs by one less). Suppose that our personnel numbers are issued sequentially and without holes, then we can do it in the following way:

 SELECT e1.ID EmpID1, e1.Name EmpName1, e2.ID EmpID2, e2.Name EmpName2 FROM Employees e1 LEFT JOIN Employees e2 ON e1.ID=e2.ID+1 --     


Those. here is one table Employees, we gave the alias "e1", and the second "e2".

Parse each type of horizontal connection.


For this purpose, we consider 2 small abstract tables, which we call LeftTable and RightTable:

 CREATE TABLE LeftTable( LCode int, LDescr varchar(10) ) GO CREATE TABLE RightTable( RCode int, RDescr varchar(10) ) GO INSERT LeftTable(LCode,LDescr)VALUES (1,'L-1'), (2,'L-2'), (3,'L-3'), (5,'L-5') INSERT RightTable(RCode,RDescr)VALUES (2,'B-2'), (3,'B-3'), (4,'B-4') 


Let's see what's in these tables:

 SELECT * FROM LeftTable 

LCodeLdescr
oneL-1
2L-2
3L-3
fiveL-5

 SELECT * FROM RightTable 

RCodeRDescr
2B-2
3B-3
fourB-4


JOIN


 SELECT l.*,r.* FROM LeftTable l JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
2L-22B-2
3L-33B-3

Here string returns were returned for which the condition was met (l.LCode = r.RCode)



LEFT JOIN


 SELECT l.*,r.* FROM LeftTable l LEFT JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
oneL-1NullNull
2L-22B-2
3L-33B-3
fiveL-5NullNull

Here, all LeftTable strings were returned, which were supplemented with data from the RightTable strings, for which the condition was satisfied (l.LCode = r.RCode)



RIGHT JOIN


 SELECT l.*,r.* FROM LeftTable l RIGHT JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
2L-22B-2
3L-33B-3
NullNullfourB-4

Here all the RightTable strings were returned, which were supplemented with the data strings from the LeftTable, for which the condition was met (l.LCode = r.RCode)



In fact, if we rearrange LeftTable and RightTable in some places, we will get a similar result using the left connection:

 SELECT l.*,r.* FROM RightTable r LEFT JOIN LeftTable l ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
2L-22B-2
3L-33B-3
NullNullfourB-4

I noticed with myself that I use LEFT JOIN more often, i.e. I first think which table data is important to me, and then I think which table / tables will play the role of a complementary table.

FULL JOIN is essentially a simultaneous LEFT JOIN + RIGHT JOIN


 SELECT l.*,r.* FROM LeftTable l FULL JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
oneL-1NullNull
2L-22B-2
3L-33B-3
fiveL-5NullNull
NullNullfourB-4

All rows from LeftTable and RightTable are returned. The strings for which the condition was met (l.LCode = r.RCode) were combined into one string. Missing in the line data on the left or right side are filled with NULL-values.



CROSS JOIN


 SELECT l.*,r.* FROM LeftTable l CROSS JOIN RightTable r 

LCodeLdescrRCodeRDescr
oneL-12B-2
2L-22B-2
3L-32B-2
fiveL-52B-2
oneL-13B-3
2L-23B-3
3L-33B-3
fiveL-53B-3
oneL-1fourB-4
2L-2fourB-4
3L-3fourB-4
fiveL-5fourB-4

Each LeftTable line connects to the data of all RightTable lines.



Go back to the Employees and Departments tables.


I hope you understand the principle of horizontal joints. If this is the case, then return to the beginning of the “JOIN-connections - horizontal data-joining operations” section and try to understand the examples yourself by combining the Employees and Departments tables, and then come back again, discuss it together.

Let's try to summarize together for each request:
RequestSummary
 -- JOIN  5  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp JOIN Departments dep ON emp.DepartmentID=dep.ID 
In essence, this query will return only employees who have the DepartmentID value.
Those. we can use this connection in the case when we need data on employees belonging to some department (excluding external correspondents).
 -- LEFT JOIN  6  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID 
Will return all employees. For those employees who do not have the DepartmentID specified, the “dep.ID” and “dep.Name” fields will contain NULL.
Remember that NULL values ​​can be processed, for example, with the help of ISNULL (dep. Name, 'out of state'), for example.
This type of connection can be used when it is important for us to obtain data on all employees, for example, to get a list for the calculation of the salary.
 -- RIGHT JOIN  7  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp RIGHT JOIN Departments dep ON emp.DepartmentID=dep.ID 
Here we got holes on the left, i.e. there is a department, but there are no employees in this department.
Such a connection can be used, for example, when you need to find out which departments are occupied and who we are and who have not yet been formed. This information can be used to search and receive new employees from which the department will be formed.
 -- FULL JOIN  8  SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp FULL JOIN Departments dep ON emp.DepartmentID=dep.ID 
This query is important when we need to get all the data on employees and all the data on the existing departments. Accordingly, we get holes (NULL-values) either by employees or by departments (freelancers).
This request, for example, can be used to check whether all employees are in the right departments, because maybe some employees who are listed as freelancers simply forgot to specify a department.
 -- CROSS JOIN  30  - (6   Employees) * (5   Departments) SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp CROSS JOIN Departments dep 
In this form, it is even difficult to think of where this can be applied, so I will show the example with CROSS JOIN below.

Note that in case of repeating the DepartmentID values ​​in the Employees table, each such row was connected to a row from the Departments table with the same ID, that is, the Departments data merged with all the records for which the condition was fulfilled (emp.DepartmentID = dep.ID):



In our case, everything turned out right, i.e. we supplemented the Employees table with data from the Departments table. I specifically drew attention to this, because there are times when we do not need this behavior. For the demonstration, we set the task - for each department to bring out the last employee who has been received, if there are no employees, then simply print the name of the department. Perhaps such a solution suggests itself - just take the previous query and change the connection condition to RIGHT JOIN, plus rearrange the fields in some places:

 SELECT dep.ID,dep.Name,emp.ID,emp.Name FROM Employees emp RIGHT JOIN Departments dep ON emp.DepartmentID=dep.ID 

IDNameIDName
oneAdministration1000Ivanov I.I.
2Accounting1002Sidorov S.S.
3IT1001Petrov P.P.
3IT1003Andreev A.A.
3IT1004Nikolaev N.N.
fourMarketing and advertisingNullNull
fiveLogisticsNullNull

But for the IT department, we got three lines, when we needed only a line with the last employee we took, i.e. Nikolayev N.N.

A task of this kind can be solved, for example, by using a subquery:

 SELECT dep.ID,dep.Name,emp.ID,emp.Name FROM Employees emp /*      ( - MAX(ID))      (GROUP BY DepartmentID) */ JOIN ( SELECT MAX(ID) MaxEmployeeID FROM Employees GROUP BY DepartmentID ) lastEmp ON emp.ID=lastEmp.MaxEmployeeID RIGHT JOIN Departments dep ON emp.DepartmentID=dep.ID --   Departments 

IDNameIDName
oneAdministration1000Ivanov I.I.
2Accounting1002Sidorov S.S.
3IT1004Nikolaev N.N.
fourMarketing and advertisingNullNull
fiveLogisticsNullNull

With the help of pre-combining Employees with subquery data, we were able to leave only the employees we need to connect to Departments.

Here we smoothly turn to the use of subqueries. I think using them in this form should be clear to you on an intuitive level. That is, the subquery is substituted into the place of the table and plays its role, nothing complicated. We will return to the topic of subqueries separately.

See separately what the subquery returns:

 SELECT MAX(ID) MaxEmployeeID FROM Employees GROUP BY DepartmentID 

MaxEmployeeID
1005
1000
1002
1004

Those. he returned only the identifiers of the last employees he received, in the context of the departments.

Connections are performed sequentially from top to bottom, building up like a snowball that rolls down a mountain. First, the “Employees emp JOIN (Subquery) lastEmp” is connected, forming a new output set:



Then comes the combination of the set obtained by “Employees emp JOIN (Subquery) lastEmp” (let's call it conditionally “LastResult”) with Departments, i.e. “Last Result RIGHT JOIN Departments dep”:



Independent work to secure the material


If you are a beginner, then you definitely need to work through each JOIN-design, until you understand 100% how each type of connection works and correctly represent the result of what kind will be obtained as a result.

To consolidate the material about the JOIN-connection, we will do the following:

 --   LeftTable  RightTable TRUNCATE TABLE LeftTable TRUNCATE TABLE RightTable GO --       INSERT LeftTable(LCode,LDescr)VALUES (1,'L-1'), (2,'L-2a'), (2,'L-2b'), (3,'L-3'), (5,'L-5') INSERT RightTable(RCode,RDescr)VALUES (2,'B-2a'), (2,'B-2b'), (3,'B-3'), (4,'B-4') 


Let's see what is in the tables:

 SELECT * FROM LeftTable 

LCodeLdescr
oneL-1
2L-2a
2L-2b
3L-3
fiveL-5

 SELECT * FROM RightTable 

RCodeRDescr
2B-2a
2B-2b
3B-3
fourB-4


Now try to figure out for yourself how each line of the query turned out with each type of connection (Excel to help you):

 SELECT l.*,r.* FROM LeftTable l JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
2L-2a2B-2a
2L-2a2B-2b
2L-2b2B-2a
2L-2b2B-2b
3L-33B-3


 SELECT l.*,r.* FROM LeftTable l LEFT JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
oneL-1NullNull
2L-2a2B-2a
2L-2a2B-2b
2L-2b2B-2a
2L-2b2B-2b
3L-33B-3
fiveL-5NullNull


 SELECT l.*,r.* FROM LeftTable l RIGHT JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
2L-2a2B-2a
2L-2b2B-2a
2L-2a2B-2b
2L-2b2B-2b
3L-33B-3
NullNullfourB-4


 SELECT l.*,r.* FROM LeftTable l FULL JOIN RightTable r ON l.LCode=r.RCode 

LCodeLdescrRCodeRDescr
oneL-1NullNull
2L-2a2B-2a
2L-2a2B-2b
2L-2b2B-2a
2L-2b2B-2b
3L-33B-3
fiveL-5NullNull
NullNullfourB-4


 SELECT l.*,r.* FROM LeftTable l CROSS JOIN RightTable r 

LCodeLdescrRCodeRDescr
oneL-12B-2a
2L-2a2B-2a
2L-2b2B-2a
3L-32B-2a
fiveL-52B-2a
oneL-12B-2b
2L-2a2B-2b
2L-2b2B-2b
3L-32B-2b
fiveL-52B-2b
oneL-13B-3
2L-2a3B-3
2L-2b3B-3
3L-33B-3
fiveL-53B-3
oneL-1fourB-4
2L-2afourB-4
2L-2bfourB-4
3L-3fourB-4
fiveL-5fourB-4


Once again about JOIN connections


Another example using multiple sequential join operations. Here, the repetition was not intentional, it turned out - not to throw away the same material. ;) But nothing "repetition - the mother of learning."

If several join operations are used, then they are applied sequentially from top to bottom. Roughly speaking, after each connection a new set is created and the next connection is already happening with this extended set. Consider a simple example:

 SELECT e.ID, e.Name EmployeeName, p.Name PositionName, d.Name DepartmentName FROM Employees e LEFT JOIN Departments d ON e.DepartmentID=d.ID LEFT JOIN Positions p ON e.PositionID=p.ID 


First of all, we selected all the entries in the Employees table:

 SELECT e.* FROM Employees e -- 1 


Next was a connection to the Departments table:

 SELECT e.*, --   Employees d.* --   (e.DepartmentID=d.ID)  Departments FROM Employees e -- 1 LEFT JOIN Departments d ON e.DepartmentID=d.ID -- 2 


Next is the connection of this set with the table Positions:

 SELECT e.*, --   Employees d.*, --   (e.DepartmentID=d.ID)  Departments p.* --   (e.PositionID=p.ID)  Positions FROM Employees e -- 1 LEFT JOIN Departments d ON e.DepartmentID=d.ID -- 2 LEFT JOIN Positions p ON e.PositionID=p.ID -- 3 


Those. it looks like this:



And last but not least, we are returning the data that we ask to display:

 SELECT e.ID, -- 1.   e.Name EmployeeName, -- 2.   p.Name PositionName, -- 3.   d.Name DepartmentName -- 4.   FROM Employees e LEFT JOIN Departments d ON e.DepartmentID=d.ID LEFT JOIN Positions p ON e.PositionID=p.ID 


Accordingly, the WHERE filter and ORDER BY sorting can be applied to all of this resulting set:

 SELECT e.ID, -- 1.   e.Name EmployeeName, -- 2.   p.Name PositionName, -- 3.   d.Name DepartmentName -- 4.   FROM Employees e LEFT JOIN Departments d ON e.DepartmentID=d.ID LEFT JOIN Positions p ON e.PositionID=p.ID WHERE d.ID=3 --     ID  Departments AND p.ID=3 --     ID  Positions ORDER BY e.Name --     Name  Employees 

IDEmployeeNamePositionNameDepartmentName
1004Nikolaev N.N.ProgrammerIT
1001Petrov P.P.ProgrammerIT

That is, the last set received is the same table over which you can perform a basic query:

 SELECT [DISTINCT] _  * FROM  WHERE  ORDER BY _ 


That is, if earlier only one table acted as a source, now we simply substitute our expression for this place:

 Employees e LEFT JOIN Departments d ON e.DepartmentID=d.ID LEFT JOIN Positions p ON e.PositionID=p.ID 


As a result, we get the same basic query:

 SELECT e.ID, e.Name EmployeeName, p.Name PositionName, d.Name DepartmentName FROM /*  -  */ Employees e LEFT JOIN Departments d ON e.DepartmentID=d.ID LEFT JOIN Positions p ON e.PositionID=p.ID /*  -  */ WHERE d.ID=3 AND p.ID=3 ORDER BY e.Name 


And now, apply grouping:

 SELECT ISNULL(dep.Name,'') DepName, COUNT(DISTINCT emp.PositionID) PositionCount, COUNT(*) EmplCount, SUM(emp.Salary) SalaryAmount, AVG(emp.Salary) SalaryAvg --     FROM /*  -  */ Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID /*  -  */ GROUP BY emp.DepartmentID,dep.Name ORDER BY DepName 


You see, we are still spinning around the base structures, now I hope it’s clear why it’s very important to understand them first of all.

And as we have seen, in a query, a subquery can be in place of any table. In turn, subqueries can be nested in subqueries. And all these subqueries are basic constructs too. That is, the basic design, these are the bricks from which any query is built.

CROSS JOIN promised example


Let's use the CROSS JOIN connection to calculate how many employees, in which department, and in which positions. For each department we will list all existing posts:

 SELECT d.Name DepartmentName, p.Name PositionName, e.EmplCount FROM Departments d CROSS JOIN Positions p LEFT JOIN ( /*           (DepartmentID,PositionID) */ SELECT DepartmentID,PositionID,COUNT(*) EmplCount FROM Employees GROUP BY DepartmentID,PositionID ) e ON e.DepartmentID=d.ID AND e.PositionID=p.ID ORDER BY DepartmentName,PositionName 




In this case, the connection was first performed using a CROSS JOIN, and then a connection was made to the received set with the data from the subquery using the LEFT JOIN. Instead of a table in the LEFT JOIN, we used a subquery.

, «e». , :

 SELECT DepartmentID,PositionID,COUNT(*) EmplCount FROM Employees GROUP BY DepartmentID,PositionID 

DepartmentIDPositionIDEmplCount
NullNullone
2oneone
one2one
332
3fourone

«e» DepartmentID, PositionID EmplCount. , . , ,
, , .

WHERE-


JOIN-:

 SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp JOIN Departments dep ON emp.DepartmentID=dep.ID --    WHERE emp.DepartmentID=3 --    


WHERE- :

 SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp, Departments dep WHERE emp.DepartmentID=dep.ID --    AND emp.DepartmentID=3 --    


, (emp.DepartmentID=dep.ID) (emp.DepartmentID=3).

, CROSS JOIN:

 SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp CROSS JOIN Departments dep --   (  ) WHERE emp.DepartmentID=3 --    


WHERE- :

 SELECT emp.ID,emp.Name,emp.DepartmentID,dep.ID,dep.Name FROM Employees emp, Departments dep WHERE emp.DepartmentID=3 --    


Those. Employees Departments. ? , - « , (emp.DepartmentID=dep.ID)» , , . , .. CROSS JOIN. , , , , CROSS JOIN.

For the query optimizer, it can be no matter how you implement the connection (using WHERE or JOIN), it can do them exactly the same. But for reasons of understanding the code, I would recommend in modern DBMS to try never to make the connection of tables using WHERE conditions. To use WHERE conditions for the connection, in the event that the JOIN constructions are implemented in the DBMS, I would now call it the moveton. WHERE conditions are used to filter the set, and there is no need to mix the conditions used for the connection with the conditions responsible for filtering. But if you have come to the conclusion that you cannot do without implementing a connection via WHERE, then of course the priority for the task is solved and “to hell with all foundations”.

UNION unions - operations of vertical unification of query results


, .. , .

:

 SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=1 --    SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=2 --    SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=3 --     SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID IS NULL --        


, , , , UNION ALL, :

 SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=1 --    UNION ALL SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=2 --    UNION ALL SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID=3 --     UNION ALL SELECT '' Info, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount FROM Employees WHERE DepartmentID IS NULL --        




Those. UNION ALL , .

, , .. , , ..

Some theory


MS SQL :
OperationDescription
UNION ALL. (A+B)
UNION. DISTINCT(A+B)
EXCEPT, . 2- . DISTINCT(AB)
INTERSECT, . 2- . DISTINCT(A&B)

.

2 :

 CREATE TABLE TopTable( T1 int, T2 varchar(10) ) GO CREATE TABLE BottomTable( B1 int, B2 varchar(10) ) GO INSERT TopTable(T1,T2)VALUES (1,'Text 1'), (1,'Text 1'), (2,'Text 2'), (3,'Text 3'), (4,'Text 4'), (5,'Text 5') INSERT BottomTable(B1,B2)VALUES (2,'Text 2'), (3,'Text 3'), (6,'Text 6'), (6,'Text 6') 


:

 SELECT * FROM TopTable 

T1T2
oneText 1
oneText 1
2Text 2
3Text 3
fourText 4
fiveText 5

 SELECT * FROM BottomTable 

B1B2
2Text 2
3Text 3
6Text 6
6Text 6


UNION ALL


 SELECT T1 x,T2 y FROM TopTable UNION ALL SELECT B1,B2 FROM BottomTable 




UNION


 SELECT T1 x,T2 y FROM TopTable UNION SELECT B1,B2 FROM BottomTable 


UNION , UNION ALL, DISTINCT:



EXCEPT


 SELECT T1 x,T2 y FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable 




INTERSECT


 SELECT T1 x,T2 y FROM TopTable INTERSECT SELECT B1,B2 FROM BottomTable 




UNION-


That's basically all that concerns vertical joins, it is much easier than JOIN joins.

Most often, UNION ALL is used in my practice, but other types of vertical unions also find their application.

With several operations vertically merging, it is not guaranteed that they will be performed sequentially from top to bottom. Create another table and consider it with an example:

 CREATE TABLE NextTable( N1 int, N2 varchar(10) ) GO INSERT NextTable(N1,N2)VALUES (1,'Text 1'), (4,'Text 4'), (6,'Text 6') 


For example, if we write simply:

 SELECT T1 x,T2 y FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable INTERSECT SELECT N1,N2 FROM NextTable 


Then we get:
xy
oneText 1
2Text 2
3Text 3
fourText 4
fiveText 5


Those. INTERSECT, EXCEPT. , .. -.

, , , , , , , EXCEPT, INTERSECT:

 ( SELECT T1 x,T2 y FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable ) INTERSECT SELECT N1,N2 FROM NextTable 

xy
oneText 1
fourText 4

, .

, :

 SELECT x,y FROM ( SELECT T1 x,T2 y FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable ) q INTERSECT SELECT N1,N2 FROM NextTable 


ORDER BY :

 SELECT T1 x,T2 y FROM TopTable UNION ALL SELECT B1,B2 FROM BottomTable UNION ALL SELECT B1,B2 FROM BottomTable ORDER BY x DESC 


, .

UNION- , UNION- .

Note. Oracle , EXCEPT, MINUS.



, .. . .

FROM. , . , . 2- :

 SELECT q1.x1,q1.y1,q2.x2,q2.y2 FROM ( SELECT T1 x1,T2 y1 FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable ) q1 JOIN ( SELECT T1 x2,T2 y2 FROM TopTable EXCEPT SELECT N1,N2 FROM NextTable ) q2 ON q1.x1=q2.x2 


, , . Those. , «q1», , «q2», JOIN «q1» «q2».

WITH


.

Compare:

 SELECT q1.x1,q1.y1,q2.x2,q2.y2 FROM ( SELECT T1 x1,T2 y1 FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable ) q1 JOIN ( SELECT T1 x2,T2 y2 FROM TopTable EXCEPT SELECT N1,N2 FROM NextTable ) q2 ON q1.x1=q2.x2 


The same is written with WITH:

 WITH q1 AS( SELECT T1 x1,T2 y1 FROM TopTable EXCEPT SELECT B1,B2 FROM BottomTable ), q2 AS( SELECT T1 x2,T2 y2 FROM TopTable EXCEPT SELECT N1,N2 FROM NextTable ) --      SELECT q1.x1,q1.y1,q2.x2,q2.y2 FROM q1 JOIN q2 ON q1.x1=q2.x2 


As you can see, large subqueries are rendered and named in the WITH block, which made it possible to unload the text of the main query and make it understandable.

Let us also recall the example from the previous part, where the ViewEmployeesInfo view was used:

 CREATE VIEW ViewEmployeesInfo AS SELECT emp.*, --     Employees dep.Name DepartmentName, --      Name   Departments pos.Name PositionName --     Name   Positions FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID LEFT JOIN Positions pos ON emp.PositionID=pos.ID 


And the query that used this presentation:

 SELECT DepartmentName, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary) SalaryAvg FROM ViewEmployeesInfo emp GROUP BY DepartmentID,DepartmentName ORDER BY DepartmentName 


In essence, WITH allows us to place text from the view directly in the request, i.e. the meaning is the same:

 WITH cteEmployeesInfo AS( SELECT emp.*, --     Employees dep.Name DepartmentName, --      Name   Departments pos.Name PositionName --     Name   Positions FROM Employees emp LEFT JOIN Departments dep ON emp.DepartmentID=dep.ID LEFT JOIN Positions pos ON emp.PositionID=pos.ID ) SELECT DepartmentName, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary) SalaryAvg FROM cteEmployeesInfo emp GROUP BY DepartmentID,DepartmentName ORDER BY DepartmentName 


Only in the case of a created view, we can use it from different requests, since The view is created at the database level. Whereas the subquery issued in the WITH block is visible only within the framework of this query.

WITH - CTE-:
(CTE — Common Table Expressions) , . CTE , , .

CTE , .


. ( , Employees , ).

 WITH cteEmpl AS( SELECT ID,CAST(Name AS nvarchar(300)) Name,1 EmpLevel FROM Employees WHERE ManagerID IS NULL --       UNION ALL SELECT emp.ID,CAST(SPACE(cte.EmpLevel*5)+emp.Name AS nvarchar(300)),cte.EmpLevel+1 FROM Employees emp JOIN cteEmpl cte ON emp.ManagerID=cte.ID ) SELECT * FROM cteEmpl 

IDNameEmpLevel
1000Ivanov I.I.one
1002_____ ..2
1003_____ ..2
1005_____ ..2
1001__________ ..3
1004__________ ..3

.

, , . , , . , , . , , .


Let's now look at how you can still use subqueries, and also pass parameters to them using an alias from the main query.

Here I will no longer go deeply into the explanation, since By this stage you should have already learned to think and understand the principle of working with data. Be sure to practice, follow the examples and try and understand the result. To understand, you need to feel each example itself.

The subquery can be used in the SELECT block


Let's return to our report:

 SELECT DepartmentID, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary) SalaryAvg --     FROM Employees GROUP BY DepartmentID 


Here the department name can also be obtained using a subquery with the parameter:

 SELECT /*                 */ (SELECT Name FROM Departments dep WHERE dep.ID=emp.DepartmentID) DepartmentName, COUNT(DISTINCT PositionID) PositionCount, COUNT(*) EmplCount, SUM(Salary) SalaryAmount, AVG(Salary) SalaryAvg FROM Employees emp --   GROUP BY DepartmentID ORDER BY DepartmentName 


(SELECT Name FROM Departments dep WHERE dep.ID=emp.DepartmentID) 4 , .. emp.DepartmentID

. , TOP, - , . , ID :

 SELECT ID, Name, --  1 -  ID  (SELECT MAX(ID) FROM Employees emp WHERE emp.DepartmentID=dep.ID) LastEmpID_var1, --  1 -  ID  (SELECT TOP 1 ID FROM Employees emp WHERE emp.DepartmentID=dep.ID ORDER BY ID DESC) LastEmpID_var2, --  2 -    (SELECT TOP 1 Name FROM Employees emp WHERE emp.DepartmentID=dep.ID ORDER BY ID DESC) LastEmpName FROM Departments dep 


? Since 4 ( ), 12 .

, , .. , .. . .

APPLY


MS SQL :

 SELECT ID, Name, --  1 -  ID  (SELECT TOP 1 ID FROM Employees emp WHERE emp.DepartmentID=dep.ID ORDER BY ID DESC) LastEmpID, --  2 -    (SELECT TOP 1 Name FROM Employees emp WHERE emp.DepartmentID=dep.ID ORDER BY ID DESC) LastEmpName FROM Departments dep 


APPLY, 2 – CROSS APPLY OUTER APPLY.

APPLY , , ID Name :

 SELECT ID, Name, empInfo.LastEmpID, empInfo.LastEmpName FROM Departments dep CROSS APPLY ( SELECT TOP 1 ID LastEmpID,Name LastEmpName FROM Employees emp WHERE emp.DepartmentID=dep.ID ORDER BY emp.ID DESC ) empInfo 

IDNameLastEmpIDLastEmpName
oneAdministration1000Ivanov I.I.
2Accounting1002..
3IT1004..

CROSS APPLY Departments. , .

, Departments, OUTER APPLY:

 SELECT ID, Name, empInfo.LastEmpID, empInfo.LastEmpName FROM Departments dep OUTER APPLY ( SELECT TOP 1 ID LastEmpID,Name LastEmpName FROM Employees emp WHERE emp.DepartmentID=dep.ID ORDER BY emp.ID DESC ) empInfo 

IDNameLastEmpIDLastEmpName
oneAdministration1000Ivanov I.I.
2Accounting1002..
3IT1004..
fourNullNull
fiveLogisticsNullNull

, , . , .. , , . APPLY, , , , . , , :

 SELECT dep.ID,dep.Name,pos.PositionID,pos.PositionName FROM Departments dep CROSS APPLY ( SELECT ID PositionID,Name PositionName FROM Positions ) pos 


WHERE


, :

 SELECT * FROM Departments dep WHERE (SELECT COUNT(*) FROM Employees emp WHERE emp.DepartmentID=dep.ID)>2 


, , .. SELECT.

EXISTS and NOT EXISTS constructs


Allow to check whether there are records corresponding to the condition in the subquery:

 --         SELECT * FROM Departments dep WHERE EXISTS(SELECT * FROM Employees emp WHERE emp.DepartmentID=dep.ID) 


 --        SELECT * FROM Departments dep WHERE NOT EXISTS(SELECT * FROM Employees emp WHERE emp.DepartmentID=dep.ID) 


Everything is simple here - EXISTS returns True if the subquery returns at least one row, and False if the subquery does not return rows. NOT EXISTS - inversion of the result.

IN and NOT IN subquery construction


Before that, we looked at IN with a listing of values. You can also use it with a subquery that returns a list of these values:

 --     SELECT * FROM Departments WHERE ID IN(SELECT DISTINCT DepartmentID FROM Employees WHERE DepartmentID IS NOT NULL) 


 --     SELECT * FROM Departments WHERE ID NOT IN(SELECT DISTINCT DepartmentID FROM Employees WHERE DepartmentID IS NOT NULL) 


Note that I excluded the NULL value using the condition (DepartmentID IS NOT NULL) in the subquery. NULL values ​​in this case are just as dangerous - see this in the description of the IN construct in the second part.

Group comparison operations ALL and ANY


, . , IN EXISTS.

ALL ANY , , . , EXISTS .

, . ALL:

 SELECT ID,Name,DepartmentID,Salary FROM Employees e1 WHERE e1.Salary>ALL( SELECT e2.Salary FROM Employees e2 WHERE e2.DepartmentID=e1.DepartmentID --       AND e2.ID<>e1.ID --        AND e2.Salary IS NOT NULL --  NULL  ) 

IDNameDepartmentIDSalary
1000Ivanov I.I.one5000
1002..22500
1003..32000
1005..Null2000

, e1.Salary e2.Salary, .

, , ? – , , . ))) .

, , ALL NOT EXISTS:

 SELECT ID,Name,DepartmentID,Salary FROM Employees e1 WHERE NOT EXISTS( SELECT * FROM Employees e2 WHERE e2.DepartmentID=e1.DepartmentID --       AND e2.Salary>e1.Salary --        ) 


Those. « ».

ALL .

, ALL NULL- , . ALL AND, .. (Salary>1000 AND Salary>1500 AND Salary>NULL) NULL.

ANY ( SOME) -:

 SELECT ID,Name,DepartmentID,Salary FROM Employees e1 WHERE e1.Salary>ANY( -- ANY = SOME SELECT e2.Salary FROM Employees e2 WHERE e2.DepartmentID=e1.DepartmentID --       AND e2.ID<>e1.ID --        ) 

IDNameDepartmentIDSalary
1003..32000

C ANY , , . Since , -, .., . Those. , .

, ANY EXISTS:

 SELECT ID,Name,DepartmentID,Salary FROM Employees e1 WHERE EXISTS( SELECT * FROM Employees e2 WHERE e2.DepartmentID=e1.DepartmentID --       AND e2.Salary<e1.Salary --            ) 


« - ».

, ANY , .

NULL- , .. . ANY OR, .. (Salary>1000 OR Salary>1500 OR Salary>NULL) .

ANY , IN:

 SELECT * FROM Departments WHERE ID=ANY(SELECT DISTINCT DepartmentID FROM Employees) 


, . :

 SELECT * FROM Departments WHERE ID IN(SELECT DISTINCT DepartmentID FROM Employees) 


ALL ANY . , . Those. « »:

 SELECT * FROM Employees e1 WHERE e1.Salary>ALL(SELECT e2.Salary FROM Employees e2 WHERE e2.ID<>e1.ID AND e2.Salary IS NOT NULL) 


« »:

 SELECT * FROM Employees e1 WHERE NOT EXISTS(SELECT * FROM Employees e2 WHERE e2.Salary>e1.Salary) 


, SQL , .


, , HAVING, CASE. , .

SELECT, , , .

. , SELECT , .. , , , ( -), . , , , , .

, – , .. . , , . , .

Conclusion


SELECT. , , , .

, SQL, 10 , ( Paradox). , SQL . , , -. , - , . , , .

In the next part, I will discuss in general terms the data modification operators. In general terms, since this information, like knowledge of DDL, is not needed by everyone (mostly IT professionals) - most people learn SQL precisely in order to learn how to sample data using the SELECT statement. I think the next part will be final. All knowledge gained up to this point will also be useful to us in the next part, since for the correct writing of complex structures for data modification, you need to confidently use the constructions of the SELECT operator. For example, before deleting or modifying a group of rows of a table, we must correctly select this data. Therefore, the next part will also contain SELECT constructs and I think it will be of interest to those people who learn SQL precisely because of the SELECT statement.

, , . «SQL-EX.RU – SQL», , . SQL. , , .

, :

« 1 – SELECT» « 2 – OVER ». , MSDN.

.

— habrahabr.ru/post/256169

Source: https://habr.com/ru/post/256045/


All Articles