📜 ⬆️ ⬇️

New in Caché 2013.1 DBMS: Adding and Generating Indices on Live Classes

Suppose that you have a table with a large number of records and you need to add one or more indexes to it with the following conditions:

  1. their generation should be as fast as possible
  2. so that generation can be done in portions.
    For example, if there is a table for 300M records and working with it can only be done during off-hours, then the whole process can be broken up into three nights of 100M records.
  3. the emergence of new indexes and the process of their generation should not interfere with the current work with the class / table

For this, we could use the already known % BuildIndices () method, but in this case it will not satisfy our conditions.

What is the way out?

Theory


The new class % Library.IndexBuilder with the one, but powerful % ConstructIndicesParallel () method has been added to the Caché 2013.1 database version.
From the name it is already becoming clear that generation will occur in parallel with the involvement of all processor cores.
')
So, consider the parameters of this method in more detail:

ClassMethod % ConstructIndicesParallel ( pTaskId = "" , pStartId As% Integer = 0 , pEndId As% Integer = -1 , pSortBegin As% Integer = 1 , pDroneCount As% Integer = 0 , pLockFlag As% Integer = 1 , pJournalFlag%% 1 ) as% Status



Practice


Now consider an example of applying a new class.

To begin with, we will create an educational class in the USER area, fill it with 1M records with variable length lines [1-100] and construct an index using the classic % BuildIndices () , so that we can compare:

Class demo.test Extends% Persistent
{

Index idxn On n As SQLUPPER ( 6 );

Property n As% String ( MAXLEN = 100 );

ClassMethod Fill ( n As% Integer = 10,000,000 )
{
set data = $ Replace ( $ Justify ( "" , 100), "" , "a" )
set time = $ ZHorolog
do DISABLE ^% NOJRN
do .. % KillExtent ()
set ^ demo.testD = n
set ^ demo.testD (1) = $ ListBuild ( "" , $ Extract ( data , 1, $ Random (100) +1))
for i = 2: 1: n set ^ ( i ) = $ ListBuild ( "" , $ Extract ( data , 1, $ Random (100) +1))
do ENABLE ^% NOJRN
write "insert =" , $ ZHorolog - time , "sec." !
}

ClassMethod BIndex ()
{
set time = $ ZHorolog
do .. % BuildIndices (, 1,1)
write "reindex =" , $ ZHorolog - time , "sec." !
}

}

My results:
USER>do ##class(demo.test).Fill()
= 9.706935 .

USER>do ##class(demo.test).BIndex()
= 71.966953 .

Now use the new % IndexBuilder class. To do this, follow these steps:

  1. First, clear the index data from the previous test using the % PurgeIndices () method (optional step)
  2. inherit our class from % IndexBuilder
  3. we list the indices separated by commas in the INDEXBUILDERFILTER parameter.
    If this parameter is left empty, all indexes will be regenerated.
  4. we will make our index invisible for SQL, so that the optimizer does not use an index that is not yet ready for work.
    For this, we use the $ SYSTEM.SQL.SetMapSelectability () method:

    ClassMethod SetMapSelectability ( pTablename As% Library.String = "" , pMapname As% Library.String = "" , pValue As% Boolean = "" ) as% Library.String

    Argument Description:
    • pTablename - table name
    • pMapname - index name
    • pValue - 0/1 flag, defining the visibility (1) or invisibility (0) of the index for the SQL optimizer
    Note: You can make the index invisible long before it is added to the class.
  5. call the % ConstructIndicesParallel () method
  6. make our index visible for SQL
  7. Profit!

As a result, our class will take the following form:

Class demo.test Extends ( % Persistent , % IndexBuilder )
{

Parameter INDEXBUILDERFILTER = "idxn" ;

Parameter BITMAPCHUNKINMEMORY = 0 ;

Index idxn On n As SQLUPPER ( 6 );

Property n As% String ( MAXLEN = 100 );
...
ClassMethod FastBIndex ()
{
do .. % PurgeIndices ( $ ListBuild ( "idxn" ))
do $ SYSTEM .SQL . SetMapSelectability ( $ classname (), "idxn" , $$$ NO )
do .. % ConstructIndicesParallel (,,, 1,, 2.0)
do $ SYSTEM .SQL . SetMapSelectability ( $ classname (), "idxn" , $$$ YES )
}
}

My results:
USER>do ##class(demo.test).FastBIndex()

Building 157 chunks and will use parallel build algorithm with 4 drone processes.
SortBegin is requested.
Started drone process: 3812
Started drone process: 4284
Started drone process: 7004
Started drone process: 7224
Expected time to complete is 43 secs to build 157 chunks of 64,000 objects using 4 processes.
Waiting for processes to complete....done.
Elapsed time using 4 processes was 34.906643.

As you can see, the speed has doubled.

On your hardware and on your data, the results can get even better.

Even faster?


But is it possible to further accelerate the regeneration of indices?
If you have a lot of RAM left, then yes.

In the process of generating indices by the designer for internal needs, so-called bitmap blocks are temporarily formed. By default, they are written to private globals, but using the BITMAPCHUNKINMEMORY boolean parameter you can specify that they are formed in RAM. To do this, assign the parameter to 1.
Note that if RAM is allocated a little, and the indices are large, then you can get the error <STORE> .
By default, BITMAPCHUNKINMEMORY is 0.

Source: https://habr.com/ru/post/173739/


All Articles