I’m doing a little series on some of the nice features/capabilities in Snowflake (the cloud data warehouse). In each part, I’ll highlight something that I think it’s interesting enough to share. It might be some SQL function that I’d really like to be in SQL Server, it might be something else.
In this post I’ll talk about how you can easily re-use aliased expressions in Snowflake. For example, you can do something like this (using the StackOverflow sample database):
That’s right. I defined an expression – a concatenation of two fields – and used the alias of that expression in another expression. Furthermore, I used the alias of the second expression in the WHERE clause. Gasp.
In Snowflake, you can re-use your aliases in other expressions and in the WHERE, GROUP BY and HAVING clause. Awesome. In SQL Server, you will be greeted with the following error:
Boo. I submitted an idea on the feedback site. You can still vote for it. It has over 100 votes and it is under review. The main criticism is: but what if the alias is the same as one of the columns of the original table? Let’s test this out:
It seems the original column name has precedence over any alias defined. Unfortunately, it’s not all cookies and rainbows. You can’t use this technique with window functions. Suppose we have the following sample data:
If we want to filter out duplicate rows by keeping the oldest record (thus removing the record for Bob with ID = 4), we can use the ROW_NUMBER function to achieve this. In SQL Server, the query would look like this:
You might be tempted to use the aliasing technique in Snowflake to shorten your code. However, this errors out:
Basically, when you use the aliasing technique, the Snowflake compiler will just copy paste the aliased expression wherever referenced. But window functions can only appear in SELECT or ORDER BY clauses, so they will error out if used anywhere else. It seems we will have to keep using the common table expressions for readable code.
In conclusion, you can re-use aliased expressions directly in other clauses, which is really useful if you have a lot of expressions that build upon other expressions. It avoids the use of subqueries. But you can’t use it for window functions.
Other parts in this series:
I recently read the book Agile Data Warehouse Design - Collaborative Dimensional Modeling, from Whiteboard…
You can find the slides for the session Building the €100 data warehouse with the…
I was asked to do a review of the book Microsoft Power BI Performance Best…
This is a quick blog post, mainly so I have the code available if I…
Praise whatever deity you believe in, because it's finally here, a tenant switcher for Microsoft…
This book was making its rounds on social media, and the concept seems interesting enough…
View Comments
You also need to declare the aliased column before anything that uses it.
-- This works
SELECT
'Hello' AS Word
,Word || ', world!' AS Message;
-- This doesn't
SELECT
Word || ', world!' AS Message
,'Hello' AS Word;
Netezza has the same functionality, and as a lot of the SQL I need to refactor was written for Netezza, knowing that Snowflake has the same functionality means I have one less thing to worry about. Woohoo!
By the way, Snowflake is FAR superior to Netezza.
Thanks Andrew
For the windows function, they have since added a QUALIFY syntax that allows you to do this.
See. https://docs.snowflake.com/en/sql-reference/constructs/qualify.html
That's very interesting, thanks for the pointer!
Thanks KOEN . Very Useful
Thanks for posting this, Davis. Not having to use sub-queries to filter on row_number values will mean more understandable code, and that's always a good thing.
Sorry, "Davis" was a typo. Thank you, DAVID. :)
that's cool , Thanks David.
Thanks Koen. Great article.
As I've had to help others debug their code who have used thier functionality its worth a warning that the precedence of real column names works if you are joining to another table with a column of the same name. Imagine a scenario when you have a column called FirstChar in a table you are left outer joining too (i.e. the right sided table). The query will be filtered by the values in the tabel on the right rather than by the expression you have aliased.
So although its a great piece of functionality it can lead to apparently strange results if you are not aware of the behaviour.
Cheers
Mike
Yes, this can be a real danger. Once you notice you're dealing with duplicate column names, it's time to think about new alias names.
Just be aware that the alias is for the function not for the value of the evaluated function. Ie. if you use a function that doesn't evaluate to the same value everytime, like random, you might be in for a surprice.