Categories: TSQL

Parsing JSON in SQL Server with OPENJSON

Recently I had a piece of JSON of which I needed to extract data from to put into a table in Azure SQL DB. Unfortunately, the structure of the JSON itself was too complex to map automatically in Azure Data Factory. It might be possible though, but after a few tries to map it manually and every time ending up with NULL records, I decided I was just going to load the data as-is into a table and then use the OPENJSON function to parse it.

It’s the first time I had to use the function and while there are tons of examples online, must of them are quite straight forward. So I though to blog about a (more) complex example. The JSON had the following structure (which is already simplified):

[
	{
		"root": {
			"DataServices": {
				"Schema": [
					{
						"EntityType": [
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "String",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Int32",
										"@Nullable": true
									},
									{
										"@Name": "Created",
										"@Type": "DateTime",
										"@Nullable": false
									},
									{
										"@Name": "Creator",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableA"
							},
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableB"
							}
						],
						"Association": {
							"End": [
								{
									"@Role": "tableA",
									"@Type": "type1",
									"@Multiplicity": "*"
								},
								{
									"@Role": "tableB",
									"@Type": "type2",
									"@Multiplicity": "*"
								}
							],
							"@Name": "tableA_tableB"
						}
					},
					{
						"EntityContainer": {
							"EntitySet": [
								{
									"@Name": "tableA",
									"@EntityType": "type1"
								},
								{
									"@Name": "tableB",
									"@EntityType": "type2"
								}
							],
							"AssociationSet": {
								"End": [
									{
										"@Role": "tableA",
										"@EntitySet": "myTableA"
									},
									{
										"@Role": "tableB",
										"@EntitySet": "myTableB"
									}
								],
								"@Name": "somerandomname"
							}
						},
						"@Namespace": "myNamespace"
					}
				],
				"@DataServiceVersion": 1.0
			},
			"@Version": 1.0
		}
	}
];

The original JSON was the result set of a REST API call that extracted the metadata of that API endpoint. So it returns an array of tables, and each array contains the list of columns, their data types and if the column is nullable or not. Let’s tear it apart, piece by piece. The data we want are the name, type and nullable fields inside the Property array, which is in itself inside the EntityType array, which is again in the Schema array.

With OPENJSON, you can specify a WITH clause that allows you to specify which part of the JSON you want to access. You reference the part you want to extract by traversing the hierarchy in a dotted notation. You start with $. and then add the elements you want to traverse. So to get to the Schema node, we use $.”root”.”DataServices”.Schema. If there are any shenanigans in the names of nodes, you put double quotes around them. Let’s use this method to extract the Schema array:

DECLARE @myJSON NVARCHAR(MAX);
SET @myJSON = N'[
	{
		"root": {
			"DataServices": {
				"Schema": [
					{
						"EntityType": [
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "String",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Int32",
										"@Nullable": true
									},
									{
										"@Name": "Created",
										"@Type": "DateTime",
										"@Nullable": false
									},
									{
										"@Name": "Creator",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableA"
							},
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableB"
							}
						],
						"Association": {
							"End": [
								{
									"@Role": "tableA",
									"@Type": "type1",
									"@Multiplicity": "*"
								},
								{
									"@Role": "tableB",
									"@Type": "type2",
									"@Multiplicity": "*"
								}
							],
							"@Name": "tableA_tableB"
						}
					},
					{
						"EntityContainer": {
							"EntitySet": [
								{
									"@Name": "tableA",
									"@EntityType": "type1"
								},
								{
									"@Name": "tableB",
									"@EntityType": "type2"
								}
							],
							"AssociationSet": {
								"End": [
									{
										"@Role": "tableA",
										"@EntitySet": "myTableA"
									},
									{
										"@Role": "tableB",
										"@EntitySet": "myTableB"
									}
								],
								"@Name": "somerandomname"
							}
						},
						"@Namespace": "myNamespace"
					}
				],
				"@DataServiceVersion": 1.0
			},
			"@Version": 1.0
		}
	}
];';

SELECT *
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m;

This gives the following result:

[
	{
		"EntityType": [
			{
				"Property": [
					{
						"@Name": "column1",
						"@Type": "Double",
						"@Nullable": false
					},
					{
						"@Name": "column2",
						"@Type": "String",
						"@Nullable": false
					},
					{
						"@Name": "column3",
						"@Type": "Int32",
						"@Nullable": true
					},
					{
						"@Name": "Created",
						"@Type": "DateTime",
						"@Nullable": false
					},
					{
						"@Name": "Creator",
						"@Type": "Guid",
						"@Nullable": true
					}
				],
				"@Name": "tableA"
			},
			{
				"Property": [
					{
						"@Name": "column1",
						"@Type": "Double",
						"@Nullable": false
					},
					{
						"@Name": "column2",
						"@Type": "Double",
						"@Nullable": false
					},
					{
						"@Name": "column3",
						"@Type": "Guid",
						"@Nullable": true
					}
				],
				"@Name": "tableB"
			}
		],
		"Association": {
			"End": [
				{
					"@Role": "tableA",
					"@Type": "type1",
					"@Multiplicity": "*"
				},
				{
					"@Role": "tableB",
					"@Type": "type2",
					"@Multiplicity": "*"
				}
			],
			"@Name": "tableA_tableB"
		}
	},
	{
		"EntityContainer": {
			"EntitySet": [
				{
					"@Name": "tableA",
					"@EntityType": "type1"
				},
				{
					"@Name": "tableB",
					"@EntityType": "type2"
				}
			],
			"AssociationSet": {
				"End": [
					{
						"@Role": "tableA",
						"@EntitySet": "myTableA"
					},
					{
						"@Role": "tableB",
						"@EntitySet": "myTableB"
					}
				],
				"@Name": "somerandomname"
			}
		},
		"@Namespace": "myNamespace"
	}
]

Using CROSS APPLY, we can apply the OPENJSON function again on the resulting JSON fragment. This allows us to extract the EntityType array:

SELECT *
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m
CROSS APPLY OPENJSON(m.[Schema])
WITH(
     EntityType         NVARCHAR(MAX) '$.EntityType' AS JSON
--    ,EntityContainer    NVARCHAR(MAX) '$.EntityContainer' AS JSON
) s;

Two records are returned, one for the EntityType array and one for the EntityContainer (since it’s commented out, it returns NULL instead).

[
	{
		"Property": [
			{
				"@Name": "column1",
				"@Type": "Double",
				"@Nullable": false
			},
			{
				"@Name": "column2",
				"@Type": "String",
				"@Nullable": false
			},
			{
				"@Name": "column3",
				"@Type": "Int32",
				"@Nullable": true
			},
			{
				"@Name": "Created",
				"@Type": "DateTime",
				"@Nullable": false
			},
			{
				"@Name": "Creator",
				"@Type": "Guid",
				"@Nullable": true
			}
		],
		"@Name": "tableA"
	},
	{
		"Property": [
			{
				"@Name": "column1",
				"@Type": "Double",
				"@Nullable": false
			},
			{
				"@Name": "column2",
				"@Type": "Double",
				"@Nullable": false
			},
			{
				"@Name": "column3",
				"@Type": "Guid",
				"@Nullable": true
			}
		],
		"@Name": "tableB"
	}
]

Using CROSS APPLY again, we can extract the Property array:

SELECT *
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m
CROSS APPLY OPENJSON(m.[Schema])
WITH(
     EntityType         NVARCHAR(MAX) '$.EntityType' AS JSON
--    ,EntityContainer    NVARCHAR(MAX) '$.EntityContainer' AS JSON
) s
CROSS APPLY OPENJSON(s.EntityType)
WITH(
     PropertyList       NVARCHAR(MAX) '$.Property' AS JSON
    ,EntityTypeName     NVARCHAR(50) '$."@Name"'
) e

Because we use CROSS APPLY, the NULL line of EntityContainer will be filtered away. We get two rows, because there are two tables in our result set (tableA and tableB).

[
	{
		"@Name": "column1",
		"@Type": "Double",
		"@Nullable": false
	},
	{
		"@Name": "column2",
		"@Type": "String",
		"@Nullable": false
	},
	{
		"@Name": "column3",
		"@Type": "Int32",
		"@Nullable": true
	},
	{
		"@Name": "Created",
		"@Type": "DateTime",
		"@Nullable": false
	},
	{
		"@Name": "Creator",
		"@Type": "Guid",
		"@Nullable": true
	}
]

and 

[
	{
		"@Name": "column1",
		"@Type": "Double",
		"@Nullable": false
	},
	{
		"@Name": "column2",
		"@Type": "Double",
		"@Nullable": false
	},
	{
		"@Name": "column3",
		"@Type": "Guid",
		"@Nullable": true
	}
]

We need to apply OPENJSON one more time to extract the desired fields.

SELECT
     e.EntityTypeName
    ,p.PropertyName
    ,p.PropertyType
    ,p.PropertyNullable
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m
CROSS APPLY OPENJSON(m.[Schema])
WITH(
     EntityType         NVARCHAR(MAX) '$.EntityType' AS JSON
--    ,EntityContainer    NVARCHAR(MAX) '$.EntityContainer' AS JSON
) s
CROSS APPLY OPENJSON(s.EntityType)
WITH(
     PropertyList       NVARCHAR(MAX) '$.Property' AS JSON
    ,EntityTypeName     NVARCHAR(50) '$."@Name"'
) e
CROSS APPLY OPENJSON(e.PropertyList)
WITH(
     PropertyName       NVARCHAR(50) '$."@Name"'
    ,PropertyType       NVARCHAR(50) '$."@Type"'
    ,PropertyNullable   NVARCHAR(50) '$."@Nullable"'
) p;

The final result set:


------------------------------------------------
Do you like this blog post? You can thank me by buying me a beer 🙂
Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.

Recent Posts

Free webinar – Tackling the Gaps and Islands Problem with T-SQL Window Functions

I'm hosting a free webinar at MSSQLTips.com at the 19th of December 2024, 6PM UTC.…

5 days ago

dataMinds Connect 2024 – Session Materials

The slides and scripts for my session "Tackling the Gaps & Islands Problem with T-SQL…

4 weeks ago

Connect to Power BI as a Guest User in another Tenant

Sometimes your Microsoft Entra ID account (formerly known as Azure Active Directory) is added as…

2 months ago

How to use a Script Activity in ADF as a Lookup

In Azure Data Factory (ADF, but also Synapse Pipelines and Fabric Pipelines), you have a…

4 months ago

Database Build Error – Incorrect syntax near DISTINCT

I wrote a piece of SQL that had some new T-SQL syntax in it: IS…

4 months ago

Speaking at dataMinds Connect 2024

I'm very excited to announce I've been selected as a speaker for dataMinds Connect 2024,…

5 months ago