📜 ⬆️ ⬇️

5 Ways to Aggregate Strings in MS SQL

Sometimes it becomes necessary to aggregate strings in a SQL query, that is, for such a data set:
GroupidItem
oneAAA
2IS
fiveOMG
2WHAT
2THE
oneThis
get something like this:
GroupidItemlist
oneAAA, This
2IS, WHAT, THE
fiveOMG
MySQL, for example, has the built-in GROUP_CONCAT () function for such purposes:
  SELECT GroupId, GROUP_CONCAT (Item SEPARATOR ",") AS ItemList
 FROM Items 

In MS SQL Server, there is no such function, so you have to pervert. Before proceeding, we will create a script to create a test table:
  CREATE TABLE Items (GroupId INT, Item NVARCHAR (10))

 INSERT INTO Items (GroupId, Item)
 SELECT 1 AS GroupId, 'AAA' AS Item
   UNION ALL
 SELECT 2, 'IS'
   UNION ALL
 SELECT 5, 'OMG'
   UNION ALL
 SELECT 2, 'WHAT'
   UNION ALL
 SELECT 2, 'THE'
   UNION ALL
 SELECT 1, 'This' 

So, let's begin.

The stupidest straightforward way is to create a temporary table and collect intermediate aggregation results into it by running the cursor over the Items table. This method works very slowly and its code is terrible. Enjoy:
  DECLARE @Aggregated TABLE (GroupId INT, ItemList NVARCHAR (100))

 DECLARE ItemsCursor CURSOR READ_ONLY
 FOR SELECT GroupId, Item
     FROM Items

 DECLARE @CurrentGroupId INT
 DECLARE @CurrentItem NVARCHAR (10)
 DECLARE @CurrentItemList NVARCHAR (100)

 OPEN ItemsCursor

 FETCH NEXT FROM ItemsCursor
 INTO @CurrentGroupId, @CurrentItem

 WHILE @@ FETCH_STATUS = 0
 BEGIN
     SET @CurrentItemList = (SELECT ItemList
                             FROM @Aggregated
                             WHERE GroupId = @CurrentGroupId)

     IF @CurrentItemList IS NULL
         INSERT INTO @Aggregated (GroupId, ItemList)
         VALUES (@CurrentGroupId, @CurrentItem)
     ELSE
         UPDATE @Aggregated
         SET ItemList = ItemList + ',' + @CurrentItem
         WHERE GroupId = @CurrentGroupId

     FETCH NEXT FROM ItemsCursor
     INTO @CurrentGroupId, @CurrentItem
 END

 CLOSE ItemsCursor
 DEALLOCATE ItemsCursor

 SELECT GroupId, ItemList
 FROM @Aggregated 

There is a more beautiful way, not using temporary tables. It is based on the SELECT trick var = var + ',' + col FROM smwhere . Yes, it can and it works:
  CREATE FUNCTION ConcatItems (@GroupId INT)
    RETURNS NVARCHAR (100)
 AS
 BEGIN
     DECLARE @ItemList varchar (8000)
     SET @ItemList = ''

     SELECT @ItemList = @ItemList + ',' + Item
     FROM Items
     WHERE GroupId = @GroupId

     RETURN SUBSTRING (@ItemList, 2, 100)
 END

 GO

 SELECT GroupId, dbo.ConcatItems (GroupId) ItemList
 FROM Items
 GROUP BY GroupId 

A bit better, but still a crutch. In the case when we know that the maximum number of aggregated rows is limited , we can use the following method (this query is based on the assumption that there is no group with more than four elements in it):
  SELECT GroupId,
        CASE Item2 WHEN '' THEN Item1
            ELSE CASE Item3 WHEN '' THEN Item1 + ',' + Item2
            ELSE CASE Item4 WHEN '' THEN Item1 + ',' + Item2 + ',' + Item3
            ELSE Item1 + ',' + Item2 + ',' + Item3 + ',' + Item4
        END END EN AS ItemList
 FROM (
   SELECT GroupId,
        MAX (CASE ItemNo WHEN 1 THEN Item ELSE '' END) AS Item1,
        MAX (CASE ItemNo WHEN 2 THEN Item ELSE '' END) AS Item2,
        MAX (CASE ItemNo WHEN 3 THEN Item ELSE '' END) AS Item3,
        MAX (CASE ItemNo WHEN 4 THEN Item ELSE '' END) AS Item4
   FROM (
     SELECT GroupId,
          Item,
          ROW_NUMBER () OVER (PARTITION BY GroupId ORDER BY Item) ItemNo
     FROM Items
   ) AS OrderedItems
   GROUP BY GroupId
 ) AS AlmostAggregated 

Yes, a lot of code. But on the other hand, not a single extra object in the database is just one pure select. This is sometimes important.

However, there is a way to circumvent the size of the group, while remaining within the same request. We will collect all the elements of the group in an XML field , which we then convert to string type and replace the tags between the elements with commas:
  SELECT GroupId,
        REPLACE (SUBSTRING (ItemListWithTags, 4, LEN (ItemListWithTags) -7),
                '<a>',
                ',') AS ItemList
 FROM (
   SELECT GroupId,
        CAST (XmlItemList AS NVARCHAR (200)) ItemListWithTags
   FROM (
     SELECT GroupId,
          (SELECT Item AS A
         FROM Items ii
         WHERE ii.GroupId = GroupIds.GroupId
         FOR XML PATH ('')) AS XmlItemList
     FROM (SELECT DISTINCT GroupId FROM Items) AS GroupIds
   ) AS subq1
 ) AS subq2 

In general, it does not work very fast, but always. And, of course, you need a SQL Server not lower than 2000.

Yes, there is still a way to aggregate strings through the CLR Aggregate Functions , but this is generally a dark horror, because death is slow and incompatible with the complexity of the task. If there is sufficient demand for such an article, I will write it later.
')
I look forward to comments and criticism. And another thing: if someone knows how to make such a highlight of the code, as I did at home , tell me . I, besides pasting screenshots, have no other way yet.

Source: https://habr.com/ru/post/52504/


All Articles