Recently I had a piece of JSON of which I needed to extract data from to put into a table in Azure SQL DB. Unfortunately, the structure of the JSON itself was too complex to map automatically in Azure Data Factory. It might be possible though, but after a few tries to map it manually and every time ending up with NULL records, I decided I was just going to load the data as-is into a table and then use the OPENJSON function to parse it.
It’s the first time I had to use the function and while there are tons of examples online, must of them are quite straight forward. So I though to blog about a (more) complex example. The JSON had the following structure (which is already simplified):
[ { "root": { "DataServices": { "Schema": [ { "EntityType": [ { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "String", "@Nullable": false }, { "@Name": "column3", "@Type": "Int32", "@Nullable": true }, { "@Name": "Created", "@Type": "DateTime", "@Nullable": false }, { "@Name": "Creator", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableA" }, { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "Double", "@Nullable": false }, { "@Name": "column3", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableB" } ], "Association": { "End": [ { "@Role": "tableA", "@Type": "type1", "@Multiplicity": "*" }, { "@Role": "tableB", "@Type": "type2", "@Multiplicity": "*" } ], "@Name": "tableA_tableB" } }, { "EntityContainer": { "EntitySet": [ { "@Name": "tableA", "@EntityType": "type1" }, { "@Name": "tableB", "@EntityType": "type2" } ], "AssociationSet": { "End": [ { "@Role": "tableA", "@EntitySet": "myTableA" }, { "@Role": "tableB", "@EntitySet": "myTableB" } ], "@Name": "somerandomname" } }, "@Namespace": "myNamespace" } ], "@DataServiceVersion": 1.0 }, "@Version": 1.0 } } ];
The original JSON was the result set of a REST API call that extracted the metadata of that API endpoint. So it returns an array of tables, and each array contains the list of columns, their data types and if the column is nullable or not. Let’s tear it apart, piece by piece. The data we want are the name, type and nullable fields inside the Property array, which is in itself inside the EntityType array, which is again in the Schema array.
With OPENJSON, you can specify a WITH clause that allows you to specify which part of the JSON you want to access. You reference the part you want to extract by traversing the hierarchy in a dotted notation. You start with $. and then add the elements you want to traverse. So to get to the Schema node, we use $.”root”.”DataServices”.Schema. If there are any shenanigans in the names of nodes, you put double quotes around them. Let’s use this method to extract the Schema array:
DECLARE @myJSON NVARCHAR(MAX); SET @myJSON = N'[ { "root": { "DataServices": { "Schema": [ { "EntityType": [ { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "String", "@Nullable": false }, { "@Name": "column3", "@Type": "Int32", "@Nullable": true }, { "@Name": "Created", "@Type": "DateTime", "@Nullable": false }, { "@Name": "Creator", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableA" }, { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "Double", "@Nullable": false }, { "@Name": "column3", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableB" } ], "Association": { "End": [ { "@Role": "tableA", "@Type": "type1", "@Multiplicity": "*" }, { "@Role": "tableB", "@Type": "type2", "@Multiplicity": "*" } ], "@Name": "tableA_tableB" } }, { "EntityContainer": { "EntitySet": [ { "@Name": "tableA", "@EntityType": "type1" }, { "@Name": "tableB", "@EntityType": "type2" } ], "AssociationSet": { "End": [ { "@Role": "tableA", "@EntitySet": "myTableA" }, { "@Role": "tableB", "@EntitySet": "myTableB" } ], "@Name": "somerandomname" } }, "@Namespace": "myNamespace" } ], "@DataServiceVersion": 1.0 }, "@Version": 1.0 } } ];'; SELECT * FROM OPENJSON(@myJSON) WITH( [Schema] NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON ) m;
This gives the following result:
[ { "EntityType": [ { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "String", "@Nullable": false }, { "@Name": "column3", "@Type": "Int32", "@Nullable": true }, { "@Name": "Created", "@Type": "DateTime", "@Nullable": false }, { "@Name": "Creator", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableA" }, { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "Double", "@Nullable": false }, { "@Name": "column3", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableB" } ], "Association": { "End": [ { "@Role": "tableA", "@Type": "type1", "@Multiplicity": "*" }, { "@Role": "tableB", "@Type": "type2", "@Multiplicity": "*" } ], "@Name": "tableA_tableB" } }, { "EntityContainer": { "EntitySet": [ { "@Name": "tableA", "@EntityType": "type1" }, { "@Name": "tableB", "@EntityType": "type2" } ], "AssociationSet": { "End": [ { "@Role": "tableA", "@EntitySet": "myTableA" }, { "@Role": "tableB", "@EntitySet": "myTableB" } ], "@Name": "somerandomname" } }, "@Namespace": "myNamespace" } ]
Using CROSS APPLY, we can apply the OPENJSON function again on the resulting JSON fragment. This allows us to extract the EntityType array:
SELECT * FROM OPENJSON(@myJSON) WITH( [Schema] NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON ) m CROSS APPLY OPENJSON(m.[Schema]) WITH( EntityType NVARCHAR(MAX) '$.EntityType' AS JSON -- ,EntityContainer NVARCHAR(MAX) '$.EntityContainer' AS JSON ) s;
Two records are returned, one for the EntityType array and one for the EntityContainer (since it’s commented out, it returns NULL instead).
[ { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "String", "@Nullable": false }, { "@Name": "column3", "@Type": "Int32", "@Nullable": true }, { "@Name": "Created", "@Type": "DateTime", "@Nullable": false }, { "@Name": "Creator", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableA" }, { "Property": [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "Double", "@Nullable": false }, { "@Name": "column3", "@Type": "Guid", "@Nullable": true } ], "@Name": "tableB" } ]
Using CROSS APPLY again, we can extract the Property array:
SELECT * FROM OPENJSON(@myJSON) WITH( [Schema] NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON ) m CROSS APPLY OPENJSON(m.[Schema]) WITH( EntityType NVARCHAR(MAX) '$.EntityType' AS JSON -- ,EntityContainer NVARCHAR(MAX) '$.EntityContainer' AS JSON ) s CROSS APPLY OPENJSON(s.EntityType) WITH( PropertyList NVARCHAR(MAX) '$.Property' AS JSON ,EntityTypeName NVARCHAR(50) '$."@Name"' ) e
Because we use CROSS APPLY, the NULL line of EntityContainer will be filtered away. We get two rows, because there are two tables in our result set (tableA and tableB).
[ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "String", "@Nullable": false }, { "@Name": "column3", "@Type": "Int32", "@Nullable": true }, { "@Name": "Created", "@Type": "DateTime", "@Nullable": false }, { "@Name": "Creator", "@Type": "Guid", "@Nullable": true } ] and [ { "@Name": "column1", "@Type": "Double", "@Nullable": false }, { "@Name": "column2", "@Type": "Double", "@Nullable": false }, { "@Name": "column3", "@Type": "Guid", "@Nullable": true } ]
We need to apply OPENJSON one more time to extract the desired fields.
SELECT e.EntityTypeName ,p.PropertyName ,p.PropertyType ,p.PropertyNullable FROM OPENJSON(@myJSON) WITH( [Schema] NVARCHAR(MAX) '$."root"."DataServices".Schema' AS JSON ) m CROSS APPLY OPENJSON(m.[Schema]) WITH( EntityType NVARCHAR(MAX) '$.EntityType' AS JSON -- ,EntityContainer NVARCHAR(MAX) '$.EntityContainer' AS JSON ) s CROSS APPLY OPENJSON(s.EntityType) WITH( PropertyList NVARCHAR(MAX) '$.Property' AS JSON ,EntityTypeName NVARCHAR(50) '$."@Name"' ) e CROSS APPLY OPENJSON(e.PropertyList) WITH( PropertyName NVARCHAR(50) '$."@Name"' ,PropertyType NVARCHAR(50) '$."@Type"' ,PropertyNullable NVARCHAR(50) '$."@Nullable"' ) p;
The final result set:
I recently read the book Agile Data Warehouse Design - Collaborative Dimensional Modeling, from Whiteboard…
You can find the slides for the session Building the €100 data warehouse with the…
I was asked to do a review of the book Microsoft Power BI Performance Best…
This is a quick blog post, mainly so I have the code available if I…
Praise whatever deity you believe in, because it's finally here, a tenant switcher for Microsoft…
This book was making its rounds on social media, and the concept seems interesting enough…