Friday, June 12, 2020

SQL Server: OPENJSON to explode array in JSON format

Version: SQL Server 2019 Developer Edition


Sometimes, I come across cases where list of values are stored in JSON(JavaScript Object Notation) format. Examples include: 
  • Values from multiple selected/checked items from forms.
  • Survey related data. (ie. questions and unstructured answers)
Since the beginning of SQL Server 2016, SQL Server has had capabilities to handle JSON data format. Today, I will post a memo about how to explode array in JSON values. I created a sample table to save into JSON format data. It includes the field("Skills) which allows only JSON format. I put CHECK constraint to make sure the target string has valid JSON format.

CREATE TABLE json_test (
	Id	         INTEGER IDENTITY(1,1) PRIMARY KEY,
	FirstName	 VARCHAR(64),
	LastName	 VARCHAR(64),
	Email		 VARCHAR(64),
	Skills		 VARCHAR(MAX) CHECK (ISJSON(Skills)>0)
)

Some samples inserted
INSERT INTO json_test VALUES
('Mike', 'Lee', 'ml@email.com', N'{"skills": ["Excel", "TSQL", "MS Word"]}'),
('Robert', 'Sean', 'rs@email.com', N'{"skills": ["TSQL", "Python", "Spark"]}'),
('Ken', 'Macy', 'km@email.com', N'{"skills": ["PL/SQL", "TSQL", "Access"]}'),
('Lisa', 'Pang', 'lp@email.com', N'{"skills": ["Excel", "PowerShell"]}')

JSON Sample








Let's assume that our goal to report about the frequency distribution of skills. The "Skills" field has a list of skills in array format in JSON. To accomplish our goal, we need to first explode this list of skills. 

OPENJSON() function helps with parsing JSON text. When we run the query below, the output table displays the objects and properties: "key", "value", and "type" fields.

SELECT *
  FROM json_test AS jt
 CROSS APPLY OPENJSON(jt.skills) 










According to MS Doc(link) "type = 4" means array data type. This means that OPENJSON() operator can expands the array of values and second arguments(path) can handle this task.

SELECT *
  FROM json_test AS jt
 CROSS APPLY OPENJSON(jt.skills, '$.skills') 

After we explode the array, we can add aggregate queries to reach our goal which is to report about the frequency distribution of skills like below.

SELECT value, COUNT(*)
  FROM json_test AS jt
 CROSS APPLY OPENJSON(jt.skills, '$.skills') 
 GROUP BY value
 ORDER BY COUNT(*) DESC















JSON is a popular data format used in web applications. Some NoSQL database use this JSON format to store unstructured data. It is common these days to see companies operating on both relational and NoSQL conceptual databases while, at the same time, managing data with both structured and complex structured ways. JSON functions in SQL Server provide developers with more flexible options to handle data.

No comments:

Post a Comment