📜 ⬆️ ⬇️

A magic keyword - VALUES ...

The INSERT syntax may seem rather trivial, since the T-SQL standard considered the VALUES keyword only in the context of data insertion - INSERT INTO ... VALUES ....

With the release of SQL Server 2008 , the syntax of T-SQL has been significantly expanded, making it possible to use the multi-line VALUES construct, and not only in the context of an insert.

This topic will consider the comparative efficiency of using the VALUES construction in various typical situations. To give an objective assessment of the results obtained, for each example, its implementation plan will be considered.

For clarity, we will create and fill in a test data table, which will contain information about the average ball among students in the context of quarters.
')
IF OBJECT_ID('dbo.GradePointAverage', 'U') IS NOT NULL DROP TABLE dbo.GradePointAverage GO CREATE TABLE dbo.GradePointAverage ( StudentID INT , I SMALLINT NOT NULL , II SMALLINT NOT NULL , III SMALLINT NOT NULL , IV SMALLINT NOT NULL , CONSTRAINT PK_GradePointAverage PRIMARY KEY (StudentID) ) INSERT INTO dbo.GradePointAverage (StudentID, I, II, III, IV) SELECT sv.number, sv.number % 94, sv.number % 83, sv.number % 72, sv.number % 61 FROM [master].dbo.spt_values sv WHERE sv.type = 'P' AND sv.number BETWEEN 1 AND 2000 

Suppose you want to know the minimum and maximum score for each of the students.

To make the comparison more interesting, each of the proposed approaches will be performed in different situations: 1) when the table has a primary key and 2) when the table is an unordered heap.

First, we give the most unfortunate example of implementation:

 SELECT StudentID , MaxGradePoint = MAX(GradePoint) , MinGradePoint = MIN(GradePoint) FROM ( SELECT StudentID, GradePoint = I FROM dbo.GradePointAverage UNION ALL SELECT StudentID, II FROM dbo.GradePointAverage UNION ALL SELECT StudentID, III FROM dbo.GradePointAverage UNION ALL SELECT StudentID, IV FROM dbo.GradePointAverage ) t GROUP BY StudentID 

Instead of reading the data once, the reference to the source table occurs 4 times - this, to put it mildly, is not rational:



Let's try to get rid of repeated readings using the UNPIVOT construct:

 SELECT StudentID , MaxGradePoint = MAX(GradePoint) , MinGradePoint = MIN(GradePoint) FROM ( SELECT * FROM dbo.GradePointAverage UNPIVOT ( GradePoint FOR Grade IN (I, II, III, IV) ) unpvt ) t GROUP BY StudentID 

Repeated readings are gone, but the plan became more complicated:



In this case, in the case when the table does not have a clustered index, the server has to use sorting to organize the data.

Let's see how the VALUES construction behaves:

 SELECT gpa.StudentID , t.MaxGradePoint , t.MinGradePoint FROM dbo.GradePointAverage gpa CROSS APPLY ( SELECT MaxGradePoint = MAX(GradePoint) , MinGradePoint = MIN(GradePoint) FROM ( VALUES (I), (II), (III), (IV) ) t (GradePoint) ) t 

When conditions change, the plan remains very simple and unchanged:



The Query Cost value obtained from SSMS also clearly confirms the advantages of the VALUES construction:





The use of the VALUES construct is not limited to the task of converting rows to columns. Another effective application of this design is formatted query output.

Immediately I will clarify that I do not consider this a good practice, since this task should be performed on the client. However, such tasks were often useful, for example, in developing reports.

Suppose for each source line you need to output data in the following form:



You can solve this problem by reading the data several times using UNION ALL with sorting:

 SELECT StudentID , GradePoint , AverageGradePoint FROM ( SELECT ID = StudentID, StudentID, GradePoint = I, AverageGradePoint = (I + II + III + IV) / 4., RN = 1 FROM dbo.GradePointAverage UNION ALL SELECT StudentID, NULL, II, NULL, 2 FROM dbo.GradePointAverage UNION ALL SELECT StudentID, NULL, III, NULL, 3 FROM dbo.GradePointAverage UNION ALL SELECT StudentID, NULL, IV, NULL, 4 FROM dbo.GradePointAverage ) t ORDER BY ID, RN 

Again we get repeated readings. In this case, pay attention to the sorting, in the case when the table does not have a clustered index:



Alternatively, you can return to the construction of UNPIVOT , while checking the line number:

 SELECT StudentID = CASE WHEN RN = 1 THEN StudentID END , GradePoint , AverageGradePoint = CASE WHEN RN = 1 THEN AverageGradePoint END FROM ( SELECT StudentID , GradePoint , AverageGradePoint , RN = ROW_NUMBER() OVER (PARTITION BY StudentID ORDER BY 1/0) FROM ( SELECT *, AverageGradePoint = (I + II + III + IV) / 4. FROM dbo.GradePointAverage ) gpa UNPIVOT ( GradePoint FOR Grade IN (I, II, III, IV) ) unpvt ) t 

Repeated readings are gone, but the sorting using the heap has not disappeared anywhere:



Apply the VALUES construct by writing a more elegant query:

 SELECT t.* FROM dbo.GradePointAverage OUTER APPLY ( VALUES (StudentID, I, (I + II + III + IV) / 4.) , (NULL, II, NULL) , (NULL, III, NULL) , (NULL, IV, NULL) ) t (StudentID, GradePoint, AverageGradePoint) 

We got a simple and effective execution plan:



According to Query Cost , the VALUES design once again demonstrates its effectiveness in comparison with its rivals:





Brief conclusions:

The VALUES design is not a complete replacement for UNPIVOT , however, in some situations, it can be very useful - allowing you to significantly simplify queries.

I hope that I succeeded, clearly, to demonstrate this.

Source: https://habr.com/ru/post/197904/


All Articles