Wednesday, June 10, 2020

SQL Server: STRING_AGG append columns to display sample values

Version: SQL Server 2019 Developer Edition
Sample Database: AdventureWorks2017

When validating data from aggregate queries, we want to see or display some sample related primary key, Id, or value. For example, IT manager requests the report of the list of internal computers which have the most suspicious outbound traffic, and see some sample destination address or host name like below format. 

PC Name 
( Field for group by) 
Count of suspicious traffic 
(Aggregate field)
Sample Destination
(Sample records)
 PC_1  300 IP1, IP2...
 PC_2  287 IP1, IP3...
 PC_3  266 IP1, IP4... 
 ... ...   ...

From BI area, maybe there are nice tools and way to display samples on the visualization level. From this page, I post the simple adhoc query with some concept:

SELECT StateProvinceName
       ,City
       ,COUNT(BusinessEntityID) AS CountId
       ,REPLICATE('<3', COUNT(BusinessEntityID)/3)
       ,STRING_AGG(IIF(RN < 3, FullName, null), N',') AS Samples FROM (SELECT StateProvinceName ,City        ,BusinessEntityID ,CONCAT(FirstName, ' ', LastName) AS FullName ,ROW_NUMBER() OVER( PARTITION BY StateProvinceName, City ORDER BY NEWID()) AS RN         FROM AdventureWorks2017.HumanResources.vEmployee) AS M GROUP BY StateProvinceName, City ORDER BY CountId DESC

  • In subquery, ROW_NUMBER() window functions assign a sequential integer each row. 
  • From TSQL, ORDER BY clause is required argument from ROW_NUMBER() function.  From this query, NEWID() function helps pick random samples each execution.
  • STRING_AGG() is aggregate function to help concatenate strings on rows into a single line string (not list). In this example, the query picking two samples (RN < 3) per each group.
  • About REPLICATE() function, it is just for fun. 😁

For more information, MS SQL Doc link is always helpful.



No comments:

Post a Comment