Categories: TSQL

Parsing JSON in SQL Server with OPENJSON

Recently I had a piece of JSON of which I needed to extract data from to put into a table in Azure SQL DB. Unfortunately, the structure of the JSON itself was too complex to map automatically in Azure Data Factory. It might be possible though, but after a few tries to map it manually and every time ending up with NULL records, I decided I was just going to load the data as-is into a table and then use the OPENJSON function to parse it.

It’s the first time I had to use the function and while there are tons of examples online, must of them are quite straight forward. So I though to blog about a (more) complex example. The JSON had the following structure (which is already simplified):

[
	{
		"root": {
			"DataServices": {
				"Schema": [
					{
						"EntityType": [
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "String",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Int32",
										"@Nullable": true
									},
									{
										"@Name": "Created",
										"@Type": "DateTime",
										"@Nullable": false
									},
									{
										"@Name": "Creator",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableA"
							},
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableB"
							}
						],
						"Association": {
							"End": [
								{
									"@Role": "tableA",
									"@Type": "type1",
									"@Multiplicity": "*"
								},
								{
									"@Role": "tableB",
									"@Type": "type2",
									"@Multiplicity": "*"
								}
							],
							"@Name": "tableA_tableB"
						}
					},
					{
						"EntityContainer": {
							"EntitySet": [
								{
									"@Name": "tableA",
									"@EntityType": "type1"
								},
								{
									"@Name": "tableB",
									"@EntityType": "type2"
								}
							],
							"AssociationSet": {
								"End": [
									{
										"@Role": "tableA",
										"@EntitySet": "myTableA"
									},
									{
										"@Role": "tableB",
										"@EntitySet": "myTableB"
									}
								],
								"@Name": "somerandomname"
							}
						},
						"@Namespace": "myNamespace"
					}
				],
				"@DataServiceVersion": 1.0
			},
			"@Version": 1.0
		}
	}
];

The original JSON was the result set of a REST API call that extracted the metadata of that API endpoint. So it returns an array of tables, and each array contains the list of columns, their data types and if the column is nullable or not. Let’s tear it apart, piece by piece. The data we want are the name, type and nullable fields inside the Property array, which is in itself inside the EntityType array, which is again in the Schema array.

With OPENJSON, you can specify a WITH clause that allows you to specify which part of the JSON you want to access. You reference the part you want to extract by traversing the hierarchy in a dotted notation. You start with $. and then add the elements you want to traverse. So to get to the Schema node, we use $.”root”.”DataServices”.Schema. If there are any shenanigans in the names of nodes, you put double quotes around them. Let’s use this method to extract the Schema array:

DECLARE @myJSON NVARCHAR(MAX);
SET @myJSON = N'[
	{
		"root": {
			"DataServices": {
				"Schema": [
					{
						"EntityType": [
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "String",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Int32",
										"@Nullable": true
									},
									{
										"@Name": "Created",
										"@Type": "DateTime",
										"@Nullable": false
									},
									{
										"@Name": "Creator",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableA"
							},
							{
								"Property": [
									{
										"@Name": "column1",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column2",
										"@Type": "Double",
										"@Nullable": false
									},
									{
										"@Name": "column3",
										"@Type": "Guid",
										"@Nullable": true
									}
								],
								"@Name": "tableB"
							}
						],
						"Association": {
							"End": [
								{
									"@Role": "tableA",
									"@Type": "type1",
									"@Multiplicity": "*"
								},
								{
									"@Role": "tableB",
									"@Type": "type2",
									"@Multiplicity": "*"
								}
							],
							"@Name": "tableA_tableB"
						}
					},
					{
						"EntityContainer": {
							"EntitySet": [
								{
									"@Name": "tableA",
									"@EntityType": "type1"
								},
								{
									"@Name": "tableB",
									"@EntityType": "type2"
								}
							],
							"AssociationSet": {
								"End": [
									{
										"@Role": "tableA",
										"@EntitySet": "myTableA"
									},
									{
										"@Role": "tableB",
										"@EntitySet": "myTableB"
									}
								],
								"@Name": "somerandomname"
							}
						},
						"@Namespace": "myNamespace"
					}
				],
				"@DataServiceVersion": 1.0
			},
			"@Version": 1.0
		}
	}
];';

SELECT *
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m;

This gives the following result:

[
	{
		"EntityType": [
			{
				"Property": [
					{
						"@Name": "column1",
						"@Type": "Double",
						"@Nullable": false
					},
					{
						"@Name": "column2",
						"@Type": "String",
						"@Nullable": false
					},
					{
						"@Name": "column3",
						"@Type": "Int32",
						"@Nullable": true
					},
					{
						"@Name": "Created",
						"@Type": "DateTime",
						"@Nullable": false
					},
					{
						"@Name": "Creator",
						"@Type": "Guid",
						"@Nullable": true
					}
				],
				"@Name": "tableA"
			},
			{
				"Property": [
					{
						"@Name": "column1",
						"@Type": "Double",
						"@Nullable": false
					},
					{
						"@Name": "column2",
						"@Type": "Double",
						"@Nullable": false
					},
					{
						"@Name": "column3",
						"@Type": "Guid",
						"@Nullable": true
					}
				],
				"@Name": "tableB"
			}
		],
		"Association": {
			"End": [
				{
					"@Role": "tableA",
					"@Type": "type1",
					"@Multiplicity": "*"
				},
				{
					"@Role": "tableB",
					"@Type": "type2",
					"@Multiplicity": "*"
				}
			],
			"@Name": "tableA_tableB"
		}
	},
	{
		"EntityContainer": {
			"EntitySet": [
				{
					"@Name": "tableA",
					"@EntityType": "type1"
				},
				{
					"@Name": "tableB",
					"@EntityType": "type2"
				}
			],
			"AssociationSet": {
				"End": [
					{
						"@Role": "tableA",
						"@EntitySet": "myTableA"
					},
					{
						"@Role": "tableB",
						"@EntitySet": "myTableB"
					}
				],
				"@Name": "somerandomname"
			}
		},
		"@Namespace": "myNamespace"
	}
]

Using CROSS APPLY, we can apply the OPENJSON function again on the resulting JSON fragment. This allows us to extract the EntityType array:

SELECT *
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m
CROSS APPLY OPENJSON(m.[Schema])
WITH(
     EntityType         NVARCHAR(MAX) '$.EntityType' AS JSON
--    ,EntityContainer    NVARCHAR(MAX) '$.EntityContainer' AS JSON
) s;

Two records are returned, one for the EntityType array and one for the EntityContainer (since it’s commented out, it returns NULL instead).

[
	{
		"Property": [
			{
				"@Name": "column1",
				"@Type": "Double",
				"@Nullable": false
			},
			{
				"@Name": "column2",
				"@Type": "String",
				"@Nullable": false
			},
			{
				"@Name": "column3",
				"@Type": "Int32",
				"@Nullable": true
			},
			{
				"@Name": "Created",
				"@Type": "DateTime",
				"@Nullable": false
			},
			{
				"@Name": "Creator",
				"@Type": "Guid",
				"@Nullable": true
			}
		],
		"@Name": "tableA"
	},
	{
		"Property": [
			{
				"@Name": "column1",
				"@Type": "Double",
				"@Nullable": false
			},
			{
				"@Name": "column2",
				"@Type": "Double",
				"@Nullable": false
			},
			{
				"@Name": "column3",
				"@Type": "Guid",
				"@Nullable": true
			}
		],
		"@Name": "tableB"
	}
]

Using CROSS APPLY again, we can extract the Property array:

SELECT *
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m
CROSS APPLY OPENJSON(m.[Schema])
WITH(
     EntityType         NVARCHAR(MAX) '$.EntityType' AS JSON
--    ,EntityContainer    NVARCHAR(MAX) '$.EntityContainer' AS JSON
) s
CROSS APPLY OPENJSON(s.EntityType)
WITH(
     PropertyList       NVARCHAR(MAX) '$.Property' AS JSON
    ,EntityTypeName     NVARCHAR(50) '$."@Name"'
) e

Because we use CROSS APPLY, the NULL line of EntityContainer will be filtered away. We get two rows, because there are two tables in our result set (tableA and tableB).

[
	{
		"@Name": "column1",
		"@Type": "Double",
		"@Nullable": false
	},
	{
		"@Name": "column2",
		"@Type": "String",
		"@Nullable": false
	},
	{
		"@Name": "column3",
		"@Type": "Int32",
		"@Nullable": true
	},
	{
		"@Name": "Created",
		"@Type": "DateTime",
		"@Nullable": false
	},
	{
		"@Name": "Creator",
		"@Type": "Guid",
		"@Nullable": true
	}
]

and 

[
	{
		"@Name": "column1",
		"@Type": "Double",
		"@Nullable": false
	},
	{
		"@Name": "column2",
		"@Type": "Double",
		"@Nullable": false
	},
	{
		"@Name": "column3",
		"@Type": "Guid",
		"@Nullable": true
	}
]

We need to apply OPENJSON one more time to extract the desired fields.

SELECT
     e.EntityTypeName
    ,p.PropertyName
    ,p.PropertyType
    ,p.PropertyNullable
FROM OPENJSON(@myJSON)
WITH(
    [Schema]            NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON
) m
CROSS APPLY OPENJSON(m.[Schema])
WITH(
     EntityType         NVARCHAR(MAX) '$.EntityType' AS JSON
--    ,EntityContainer    NVARCHAR(MAX) '$.EntityContainer' AS JSON
) s
CROSS APPLY OPENJSON(s.EntityType)
WITH(
     PropertyList       NVARCHAR(MAX) '$.Property' AS JSON
    ,EntityTypeName     NVARCHAR(50) '$."@Name"'
) e
CROSS APPLY OPENJSON(e.PropertyList)
WITH(
     PropertyName       NVARCHAR(50) '$."@Name"'
    ,PropertyType       NVARCHAR(50) '$."@Type"'
    ,PropertyNullable   NVARCHAR(50) '$."@Nullable"'
) p;

The final result set:


------------------------------------------------
Do you like this blog post? You can thank me by buying me a beer 🙂
Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.

Recent Posts

Book Review – Agile Data Warehouse Design

I recently read the book Agile Data Warehouse Design - Collaborative Dimensional Modeling, from Whiteboard…

1 month ago

Cloudbrew 2024 – Slides

You can find the slides for the session Building the €100 data warehouse with the…

1 month ago

Book Review – Microsoft Power BI Performance Best Practices

I was asked to do a review of the book Microsoft Power BI Performance Best…

2 months ago

Create a Numbers Table in Power Query

This is a quick blog post, mainly so I have the code available if I…

2 months ago

Microsoft finally adds Tenant Switcher for Fabric / Power BI

Praise whatever deity you believe in, because it's finally here, a tenant switcher for Microsoft…

2 months ago

Book Review – Humanizing Data Strategy by Tiankai Feng

This book was making its rounds on social media, and the concept seems interesting enough…

2 months ago