📜 ⬆️ ⬇️

Undocumented Microsoft SQL Server features: STATISTICS_ONLY, DBCC AUTOPILOT and SET AUTOPILOT

As you know, the SQL Server query optimizer uses a cost estimate to build an optimal query execution plan. SQL Server builds and evaluates multiple plans and selects a minimum cost plan among them.

One of the problems that we occasionally encounter is that in order to understand how a new index will affect the execution of a particular query, we need to create this index. Sometimes, especially when the table is very large, the process of creating an index is so slow that it turns into a real nightmare. Moreover, after 20 minutes of waiting, we may well find that the newly created index, when the query is executed, is not used at all.

Actually, the question is how to create a “hypothetical” index? Just to check whether such an index will be useful when performing a query.

WITH STATISTICS_ONLY

To create a hypothetical index, we can use the undocumented feature of the CREATE INDEX command. For example:
')
USE AdventureWorksDW GO CREATE INDEX ix_FirstName ON DimCustomer(FirstName) WITH STATISTICS_ONLY = -1 GO 

As a result, statistics on this index will be created (a histogram has been constructed and density calculated) and an entry will appear in sys.indexes. You can check this with sp_helpindex and DBCC SHOWSTATISTICS:

 sp_HelpIndex DimCustomer 



 DBCC SHOW_STATISTICS(DimCustomer, ix_FirstName) 


Benjamin Nevares describes these hypothetical indices here .

PS If you create an index using WITH STATISTICS_ONLY = 0, SQL Server will not generate statistics. Only hypothetical index.

DBCC AUTOPILOT and SET AUTOPILOT

Now we have a hypothetical index, how do we use it?

You can try to specify it explicitly using the hint:
 SELECT * FROM DimCustomer WITH(index=ix_FirstName) WHERE FirstName = N'Eugene' 

and get the error:
Msg 308, Level 16, State 1, Line 1
Index 'ix_FirstName' on table 'DimCustomer' (specified in the FROM clause) does not exist.


And if you specify the Index ID?
 SELECT * FROM DimCustomer WITH(index=5) WHERE FirstName = N'Eugene' 

Same:
Msg 307, Level 16, State 1, Line 1
Index ID 5 on table 'DimCustomer' (specified in the FROM clause) does not exist.


So how can we create a query plan that takes this index into account?

This is where the fun begins.

DBCC AUTOPILOT is used to tell the optimizer that when drawing up a plan it is necessary to take into account the existence of a certain index. This DBCC, together with the SET AUTOPILOT ON flag, allows us to use this index.

Let's look at the syntax of this command:

 SET AUTOPILOT ON|OFF /* DBCC TRACEON (2588) DBCC HELP('AUTOPILOT') */ DBCC AUTOPILOT (typeid [, dbid [, {maxQueryCost | tabid [, indid [, pages [, flag [, rowcounts]]]]} ]]) 

We are testing

So, let's see how it all works.

 -- Current Cost = 0,762133 -- Clustered Index Scan on pk SELECT * FROM DimCustomer WHERE FirstName = N'Eugene' GO 



 -- creating the index -- DROP INDEX ix_FirstName ON DimCustomer CREATE INDEX ix_FirstName ON DimCustomer(FirstName) WITH STATISTICS_ONLY = -1 GO -- Looking at the info necessary in the DBCC AUTOPILOT comand SELECT name, id, Indid, Dpages, rowcnt FROM sysindexes WHERE id = object_id('DimCustomer') GO 



 DBCC AUTOPILOT (5, 9, 0, 0, 0, 0, 0) -- Starting with the TypeID 5 DBCC AUTOPILOT (6, 9, 37575172, 1, 0, 0, 0) -- Clustered Index with TypeID 6 DBCC AUTOPILOT (0, 9, 37575172, 2, 0, 0, 0) -- All other index with TypeID 0 DBCC AUTOPILOT (0, 9, 37575172, 3, 0, 0, 0) -- All other index with TypeID 0 DBCC AUTOPILOT (0, 9, 37575172, 5, 0, 0, 0) -- All other index with TypeID 0 GO SET AUTOPILOT ON GO -- Query to create the estimated execution plan with the cost = 0,0750712 SELECT * FROM dbo.DimCustomer WHERE FirstName = N'Eugene' OPTION (RECOMPILE) GO SET AUTOPILOT OFF GO 

approx. of the translator: What the TypeID parameter is, of course, is unknown, in another of his articles, the link to which is below, the same author writes that to use a particular index in the autopilot mode, you need to specify 0


You can also fool the optimizer by passing arbitrary values ​​to DBCC AUTOPILOT as the Pages and RowCount parameters. If you pass them equal to zero, the values ​​used are the same as for the cluster index.

findings

There are enough white spots in the description of these features, but I am sure that this post will be a good starting point for your own tests.

I'm still playing with this thing, so you can calmly ask me questions, or share my discoveries.

And you don’t need to say that you don’t have to use all this on production servers? This is an undocumented opportunity, so no one can tell you exactly what and how it does, until the guys from Microsoft make it officially public and documented.

From the translator:
Joining the author, I remind you that the use of undocumented functionality in a working environment can lead to undesirable consequences.
Also, I want to add another link to the same author: Hypothetical Indexes on SQL Server . This is a newer, enhanced version of the same post. Why did I not translate it? Because they have the same essence, the main difference is that in the new version, it offers the CLR assembly, for more simple use of the DBCC AUTOPILOT, which, in my opinion, is not significant.
All the code and all the screenshots were taken by me from the author, on my SQL Server 2005 SP4 it required minimal doping (for example, using WITH STATISTICS_ONLY instead of WITH STATISTICS_ONLY = -1), so everything should work on SQL Server 2005 and later (namely, starting from 2005 server, used by Database Tuning Advisor, which, presumably, uses this functionality).
As usual, any suggestions and corrections for translation and style are welcome.

Source: https://habr.com/ru/post/168597/


All Articles