Categories: Varia

Cool Stuff in Snowflake – Part 4: Aliasing all the things

I’m doing a little series on some of the nice features/capabilities in Snowflake (the cloud data warehouse). In each part, I’ll highlight something that I think it’s interesting enough to share. It might be some SQL function that I’d really like to be in SQL Server, it might be something else.

In this post I’ll talk about how you can easily re-use aliased expressions in Snowflake. For example, you can do something like this (using the StackOverflow sample database):

That’s right. I defined an expression – a concatenation of two fields – and used the alias of that expression in another expression. Furthermore, I used the alias of the second expression in the WHERE clause. Gasp.

In Snowflake, you can re-use your aliases in other expressions and in the WHERE, GROUP BY and HAVING clause. Awesome. In SQL Server, you will be greeted with the following error:

Boo. I submitted an idea on the feedback site. You can still vote for it. It has over 100 votes and it is under review. The main criticism is: but what if the alias is the same as one of the columns of the original table? Let’s test this out:

It seems the original column name has precedence over any alias defined. Unfortunately, it’s not all cookies and rainbows. You can’t use this technique with window functions. Suppose we have the following sample data:

If we want to filter out duplicate rows by keeping the oldest record (thus removing the record for Bob with ID = 4), we can use the ROW_NUMBER function to achieve this. In SQL Server, the query would look like this:

You might be tempted to use the aliasing technique in Snowflake to shorten your code. However, this errors out:

Basically, when you use the aliasing technique, the Snowflake compiler will just copy paste the aliased expression wherever referenced. But window functions can only appear in SELECT or ORDER BY clauses, so they will error out if used anywhere else. It seems we will have to keep using the common table expressions for readable code.

In conclusion, you can re-use aliased expressions directly in other clauses, which is really useful if you have a lot of expressions that build upon other expressions. It avoids the use of subqueries. But you can’t use it for window functions.

Other parts in this series:


------------------------------------------------
Do you like this blog post? You can thank me by buying me a beer 🙂
Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.

View Comments

  • You also need to declare the aliased column before anything that uses it.

    -- This works
    SELECT
    'Hello' AS Word
    ,Word || ', world!' AS Message;

    -- This doesn't
    SELECT
    Word || ', world!' AS Message
    ,'Hello' AS Word;

    Netezza has the same functionality, and as a lot of the SQL I need to refactor was written for Netezza, knowing that Snowflake has the same functionality means I have one less thing to worry about. Woohoo!

    By the way, Snowflake is FAR superior to Netezza.

  • Thanks Koen. Great article.

    As I've had to help others debug their code who have used thier functionality its worth a warning that the precedence of real column names works if you are joining to another table with a column of the same name. Imagine a scenario when you have a column called FirstChar in a table you are left outer joining too (i.e. the right sided table). The query will be filtered by the values in the tabel on the right rather than by the expression you have aliased.

    So although its a great piece of functionality it can lead to apparently strange results if you are not aware of the behaviour.

    Cheers

    Mike

    • Yes, this can be a real danger. Once you notice you're dealing with duplicate column names, it's time to think about new alias names.

  • Just be aware that the alias is for the function not for the value of the evaluated function. Ie. if you use a function that doesn't evaluate to the same value everytime, like random, you might be in for a surprice.

Recent Posts

Book Review – Agile Data Warehouse Design

I recently read the book Agile Data Warehouse Design - Collaborative Dimensional Modeling, from Whiteboard…

1 month ago

Cloudbrew 2024 – Slides

You can find the slides for the session Building the €100 data warehouse with the…

1 month ago

Book Review – Microsoft Power BI Performance Best Practices

I was asked to do a review of the book Microsoft Power BI Performance Best…

2 months ago

Create a Numbers Table in Power Query

This is a quick blog post, mainly so I have the code available if I…

2 months ago

Microsoft finally adds Tenant Switcher for Fabric / Power BI

Praise whatever deity you believe in, because it's finally here, a tenant switcher for Microsoft…

2 months ago

Book Review – Humanizing Data Strategy by Tiankai Feng

This book was making its rounds on social media, and the concept seems interesting enough…

2 months ago