Sometimes it becomes necessary to aggregate strings in a SQL query, that is, for such a data set:
Groupid | Item |
---|
one | AAA |
2 | IS |
five | OMG |
2 | WHAT |
2 | THE |
one | This |
get something like this:
Groupid | Itemlist |
---|
one | AAA, This |
2 | IS, WHAT, THE |
five | OMG |
MySQL, for example, has the built-in
GROUP_CONCAT () function for such purposes:
SELECT GroupId, GROUP_CONCAT (Item SEPARATOR ",") AS ItemList
FROM Items
In MS SQL Server, there is no such function, so you have to pervert. Before proceeding, we will create a script to create a test table:
CREATE TABLE Items (GroupId INT, Item NVARCHAR (10))
INSERT INTO Items (GroupId, Item)
SELECT 1 AS GroupId, 'AAA' AS Item
UNION ALL
SELECT 2, 'IS'
UNION ALL
SELECT 5, 'OMG'
UNION ALL
SELECT 2, 'WHAT'
UNION ALL
SELECT 2, 'THE'
UNION ALL
SELECT 1, 'This'
So, let's begin.
The stupidest straightforward way is
to create a temporary table and collect intermediate aggregation results into it by running the cursor over the Items table. This method works very slowly and its code is terrible. Enjoy:
DECLARE @Aggregated TABLE (GroupId INT, ItemList NVARCHAR (100))
DECLARE ItemsCursor CURSOR READ_ONLY
FOR SELECT GroupId, Item
FROM Items
DECLARE @CurrentGroupId INT
DECLARE @CurrentItem NVARCHAR (10)
DECLARE @CurrentItemList NVARCHAR (100)
OPEN ItemsCursor
FETCH NEXT FROM ItemsCursor
INTO @CurrentGroupId, @CurrentItem
WHILE @@ FETCH_STATUS = 0
BEGIN
SET @CurrentItemList = (SELECT ItemList
FROM @Aggregated
WHERE GroupId = @CurrentGroupId)
IF @CurrentItemList IS NULL
INSERT INTO @Aggregated (GroupId, ItemList)
VALUES (@CurrentGroupId, @CurrentItem)
ELSE
UPDATE @Aggregated
SET ItemList = ItemList + ',' + @CurrentItem
WHERE GroupId = @CurrentGroupId
FETCH NEXT FROM ItemsCursor
INTO @CurrentGroupId, @CurrentItem
END
CLOSE ItemsCursor
DEALLOCATE ItemsCursor
SELECT GroupId, ItemList
FROM @Aggregated
There is a more beautiful way, not using temporary tables. It is based on the
SELECT trick
var = var + ',' + col FROM smwhere . Yes, it can and it works:
CREATE FUNCTION ConcatItems (@GroupId INT)
RETURNS NVARCHAR (100)
AS
BEGIN
DECLARE @ItemList varchar (8000)
SET @ItemList = ''
SELECT @ItemList = @ItemList + ',' + Item
FROM Items
WHERE GroupId = @GroupId
RETURN SUBSTRING (@ItemList, 2, 100)
END
GO
SELECT GroupId, dbo.ConcatItems (GroupId) ItemList
FROM Items
GROUP BY GroupId
A bit better, but still a crutch. In the case when we know that the
maximum number of aggregated rows is limited , we can use the following method (this query is based on the assumption that there is no group with more than four elements in it):
SELECT GroupId,
CASE Item2 WHEN '' THEN Item1
ELSE CASE Item3 WHEN '' THEN Item1 + ',' + Item2
ELSE CASE Item4 WHEN '' THEN Item1 + ',' + Item2 + ',' + Item3
ELSE Item1 + ',' + Item2 + ',' + Item3 + ',' + Item4
END END EN AS ItemList
FROM (
SELECT GroupId,
MAX (CASE ItemNo WHEN 1 THEN Item ELSE '' END) AS Item1,
MAX (CASE ItemNo WHEN 2 THEN Item ELSE '' END) AS Item2,
MAX (CASE ItemNo WHEN 3 THEN Item ELSE '' END) AS Item3,
MAX (CASE ItemNo WHEN 4 THEN Item ELSE '' END) AS Item4
FROM (
SELECT GroupId,
Item,
ROW_NUMBER () OVER (PARTITION BY GroupId ORDER BY Item) ItemNo
FROM Items
) AS OrderedItems
GROUP BY GroupId
) AS AlmostAggregated
Yes, a lot of code. But on the other hand, not a single extra object in the database is just one pure select. This is sometimes important.
However, there is a way to circumvent the size of the group, while remaining within the same request. We will
collect all the elements of the group in an XML field , which we then convert to string type and replace the tags between the elements with commas:
SELECT GroupId,
REPLACE (SUBSTRING (ItemListWithTags, 4, LEN (ItemListWithTags) -7),
'<a>',
',') AS ItemList
FROM (
SELECT GroupId,
CAST (XmlItemList AS NVARCHAR (200)) ItemListWithTags
FROM (
SELECT GroupId,
(SELECT Item AS A
FROM Items ii
WHERE ii.GroupId = GroupIds.GroupId
FOR XML PATH ('')) AS XmlItemList
FROM (SELECT DISTINCT GroupId FROM Items) AS GroupIds
) AS subq1
) AS subq2
In general, it does not work very fast, but always. And, of course, you need a SQL Server not lower than 2000.
Yes, there is still a way to aggregate strings through the
CLR Aggregate Functions , but this is generally a dark horror, because death is slow and incompatible with the complexity of the task. If there is sufficient demand for such an article, I will write it later.
')
I look forward to comments and criticism. And another thing: if someone knows how to make such a highlight of the code, as I did
at home , tell
me . I, besides pasting screenshots, have no other way yet.