In the wake of
this post. I hope the author will have nothing against my broadening his conclusions.
Inside the description (pros / cons) of the transfer methods, the table and the comparison chart. The following parameter transfer methods are considered:
- Xml (Openxml)
- Xml (xquery)
- Line
- Blob
- Intermediate table
- Table type
For those who are too lazy to read, output and the resulting graph below.
Test
According to the conditions of the test, it is required to transfer the list by some method to a stored procedure, where you should convert it into a usable t-sql data structure (temporary table). The boundary conditions are simplified to the maximum:
- The stored procedure only converts the list (no application logic)
- The list is not empty
- All list items have the same length and type.
')
Guid (uniqueidentifier) ​​was taken as the data type. Test data are lists (guid arrays) with lengths of 1, 10, 50, 100, 1000, 5000, 10000, 50000, 100000, 250000, 500000, 1000000. The result of the test is the time spent executing the stored procedure, the functionality of generating ADO control objects .NET and list conversion to a suitable form for transfer to the procedure. The result of the first array (length 1) is discarded as a runner.
Description of methods
Xml (Openxml)
The array is converted to an xml structure and passed to the stored procedure as an xml data type. Inside the stored procedure is inserted into a temporary table from the openxml function:
declare @idoc int
exec sp_xml_preparedocument @idoc OUTPUT, @xml
insert into @table(id)
select value
from openxml(@idoc, '/list/id', 1) with (value uniqueidentifier)
Pros:
- The ability to cast to the desired type of data directly in the stored procedure.
- Flexibility in parameter queries, i.e. there is an opportunity to transfer the data connected through hierarchy of elements xml (with all that it implies)
Minuses:
- Data redundancy in the form of hml tags
Xml (XQuery)
Everything is the same as Xml (Openxml), but the data structure expands to tsql types using xquery:
declare @t table(id uniqueidentifier)
insert into @t(id)
select list.Id.value('@val','nvarchar(255)') from @xml.nodes('/list/id[@val]') as list(Id)
The pros and cons are the same as in the previous method. It should be noted that the programmer must have some skills in building XPath \ XQuery queries.
Line
The "classic" way to transfer data is by sticking it together with a string to some separator:
-- ,
declare @next int = 1;
while @next > 0
begin
if (@next > 1) set @next = @next + 1; --
insert into @table(id) values(cast(substring(@data, @next, @elementLength) as uniqueidentifier));
set @next = CHARINDEX(@delimiter, @data, @next + 1); -- todo: @elementLength ?
end
Pros:
- The easiest algorithm in terms of implementation
- A diverse number of development approaches
Minuses:
- Classics of the genre: “It passes the lists to the stored procedure through the line?!?! Fu !!! Lamer! "
- Item bypass data
- More data transferred
Blob
Same as a string, but instead of a string, the element is encoded by a sequence of bytes. The bytes are combined into a set and passed to the stored procedure as varbinary (max):
while @cnt < @len
begin
insert into @table(id) values(CAST(SUBSTRING(@data, @cnt, @elementLen) as uniqueidentifier))
set @cnt = @cnt + @elementLen;
end
Pros:
- Compared to a string, less data is transferred.
- Faster data conversion
Minuses:
- Difficulty in implementation
Intermediate table
An intermediate table table (setId, id) is created in the database, into which through SqlBulkCopy an array of elements and its identifier (setId) are inserted.
DataTable data = new DataTable();
data.Columns.Add("SetId", typeof(Guid));
data.Columns.Add("Id", typeof(Guid));
Guid setId = Guid.NewGuid();
foreach (Guid x in _array)
{
var row = data.NewRow();
row["SetId"] = setId;
row["Id"] = x;
data.Rows.Add(row);
}
using (SqlBulkCopy bulk = new SqlBulkCopy(Settings.Default.ConnectionString, SqlBulkCopyOptions.CheckConstraints))
{
bulk.BulkCopyTimeout = 60000;
bulk.DestinationTableName = "dbo.bulktable";
bulk.WriteToServer(data);
}
According to the setId identifier from the intermediate table, the necessary list is “pulled out”:
insert into @table
select Id
from dbo.BulkTable
where SetId = @setId;
Pros:
- Bulk operations are very fast.
- “Simplicity” of the stored procedure
Minuses:
- All cons of a global variable in the form of an intermediate table
- The method will slow down as the table is filled with parallel queries.
- The need to maintain and maintain an index for quick search
- Two actions instead of one: once - we write to the intermediate, two - we call the storage. The gap between these actions is not controlled and can cause problems (see concurrency, etc)
Table type
In the database, the user defined type is declared in the form of a table with the required data type. It can be passed to the stored procedure via the System.Data.DataTable object. In the stored procedure, this type will be visible as a t-sql table to which queries can be made.
CREATE TYPE [dbo].[guidlist] AS TABLE([id] [uniqueidentifier] NULL)
CREATE procedure [dbo].[PassTableParam]
@data guidlist readonly
as
begin
set nocount on;
declare @table table(id uniqueidentifier)
insert into @table(id)
select id
from @data
end
Pros:
- The easiest way to pass a parameter to a stored procedure
Minuses:
- Works only from MS SQL 2008 version.
Test results
It was performed on dual-core Xeon and 8GB of RAM.
In the form of a table (in the first column the length of the list, in the rest of the execution time of the stored procedure in milliseconds). Estimated is the gain in the execution time of the method relative to the execution time of other methods for the specified list.
Legend:
- Xml - Openxml
- Xml - Xquery
- string - string
- binary - Blob
- bulk - Intermediate table
- table - table type
count
| binary
| bulk
| openxml
| string
| table
| xquery
|
ten
| one
| five
| 2
| one
| one
| one
|
50
| 2
| five
| 2
| 3
| one
| one
|
100
| four
| 6
| five
| five
| 2
| 2
|
1000
| thirty
| 24
| 46
| 44
| 65
| sixteen
|
5000
| 144
| 85
| 228
| 216
| 35
| 78
|
10,000
| 289
| 167
| 476
| 633
| 67
| 165
|
50,000
| 1928
| 727
| 2482
| 3170
| 398
| 698
|
100,000
| 3886
| 1630
| 4803
| 6388
| 891
| 1467
|
250,000
| 9893
| 3757
| 12074
| 16649
| 2082
| 3698
|
500,000
| 19357
| 8427
| 24265
| 33398
| 4460
| 7099
|
1,000,000
| 38838
| 20832
| 53508
| 67825
| 9040
| 14444
|
In the form of a graph:

findings
If you have MS SQL 2008, then the lists should be transferred in the form of table variables. If the server version is less than 2008, then the xml \ xquery approach should be used.
The method of the intermediate table and bulk is inefficient in view of its minuses and the test results. Even though in this example the approach with a string can be optimized, it and the rest of the methods are ineffective.