Cool Stuff in Snowflake – Part 12: IS DISTINCT FROM

I’m doing a little series on some of the nice features/capabilities in Snowflake (the cloud data warehouse). In each part, I’ll highlight something that I think is interesting enough to share. It might be some SQL function that I’d really like to see in SQL Server, it might be something else.

Often you want to check a column for NULL values. Because NULL is evil, that’s why. Or you want to do a join, and because NULL = NULL evaluates to unknown rather than true, you want to make sure those rows can match as well if you have a nullable column in your key. In SQL Server, this will leave you open to the villainy of NULL if the key2 column is nullable:

FROM tableA A
JOIN tableB B ON A.key1 = B.key1
             AND A.key2 = B.key2
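
To see the problem in action, here is a small self-contained sketch. The sample rows are made up for illustration; only the table and column names mirror the snippet above. It uses plain VALUES lists, so it should run on SQL Server and most likely on Snowflake as well.

```sql
-- Hypothetical sample data: both tables contain the same two rows,
-- one of which has a NULL in key2.
WITH tableA AS (
    SELECT * FROM (VALUES (1, 'x'), (2, NULL)) AS v(key1, key2)
),
tableB AS (
    SELECT * FROM (VALUES (1, 'x'), (2, NULL)) AS v(key1, key2)
)
SELECT A.key1, A.key2
FROM tableA A
JOIN tableB B ON A.key1 = B.key1
             AND A.key2 = B.key2;
-- Only the row with key1 = 1 is returned. For key1 = 2 the comparison
-- NULL = NULL evaluates to unknown, so the join drops that row.
```

Both tables hold identical rows, yet the join silently loses the one with the NULL key.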

Typically you resolve this with ISNULL:

FROM tableA A
JOIN tableB B ON A.key1 = B.key1
             AND ISNULL(A.key2,'') = ISNULL(B.key2,'')

If the values are NULL, they are replaced with an empty string, so you can actually compare them with each other. But what about integers? You can use -1, but what if this value is actually present in the data? And what about dates? Furthermore, wrapping key2 in a function makes the predicate non-sargable, so existing indexes on key2 might not be used (check out sargable predicates). In a WHERE clause, you can fix it like this:

WHERE   A.key1 = B.key1
    AND (
            (A.key2 IS NULL AND B.key2 IS NULL)
        OR  (A.key2 = B.key2)
        )
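
To make the difference between the two workarounds concrete, here is a minimal T-SQL sketch with made-up data: one row where key2 really is -1 and one row where key2 is NULL. The rows and the output column names (with_sentinel, explicit_check) are my own assumptions for illustration.

```sql
-- Hypothetical sample data: tableA holds a genuine -1, tableB holds a NULL.
WITH tableA AS (
    SELECT * FROM (VALUES (1, -1)) AS v(key1, key2)
),
tableB AS (
    SELECT * FROM (VALUES (1, CAST(NULL AS int))) AS v(key1, key2)
)
SELECT
    -- Sentinel approach: NULL is replaced by -1, which collides with the real -1.
    CASE WHEN ISNULL(A.key2, -1) = ISNULL(B.key2, -1)
         THEN 'match' ELSE 'no match' END AS with_sentinel,
    -- Explicit approach: matches only when both are NULL or both are equal.
    CASE WHEN (A.key2 IS NULL AND B.key2 IS NULL) OR (A.key2 = B.key2)
         THEN 'match' ELSE 'no match' END AS explicit_check
FROM tableA A
JOIN tableB B ON A.key1 = B.key1;
-- with_sentinel = 'match' (a false positive), explicit_check = 'no match'
```

The sentinel version reports a match between -1 and NULL, while the explicit NULL check gets it right.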

The explicit NULL check is correct, but of course it leads to a lengthier expression. In Snowflake, they have this interesting conditional expression: IS [NOT] DISTINCT FROM. Our join becomes:

FROM tableA A
JOIN tableB B ON A.key1 IS NOT DISTINCT FROM B.key1
             AND A.key2 IS NOT DISTINCT FROM B.key2

This single expression checks its operands for equality and also handles NULLs: two NULLs are treated as equal, and a NULL compared with a non-NULL value is treated as different. Awesome. A good habit would be to use IS [NOT] DISTINCT FROM instead of every = or <> in every expression (join clauses, WHERE clauses etc.) and you’ll never get burned by those pesky NULLs again!
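
If you want to see how the operators behave side by side, a quick sketch like the following works as a small truth table in Snowflake. The value pairs and the column aliases are arbitrary choices of mine.

```sql
-- Compare plain equality with IS [NOT] DISTINCT FROM for a few value pairs.
SELECT a,
       b,
       a = b                    AS equals,
       a IS NOT DISTINCT FROM b AS not_distinct,
       a IS DISTINCT FROM b     AS is_distinct
FROM (VALUES (1, 1), (1, 2), (1, NULL), (NULL, NULL)) AS t(a, b);
-- a     b     equals  not_distinct  is_distinct
-- 1     1     TRUE    TRUE          FALSE
-- 1     2     FALSE   FALSE         TRUE
-- 1     NULL  NULL    FALSE         TRUE
-- NULL  NULL  NULL    TRUE          FALSE
```

Note that the last two columns never return NULL, which is exactly what makes them safe to use in join and WHERE clauses.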

p.s.: want this in SQL Server? Vote here!


Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.
