I had a curious performance issue today. An inline table-valued function (iTVF) was performing poorly for some parameter sets, and quite fast for other parameter values. In short, this one is fast:
SELECT * FROM dbo.myFunction(a);
This one is slow:
SELECT * FROM dbo.myFunction(b);
Turns out SQL Server used a plan with a hash join in the fast query, and a nested loop in the slow query. Due to SQL Server also wildly using incorrect estimates, the nested loops performs really poorly. Quite similar to parameter sniffing with stored procedures. Erik Darling has written a great article about it: Inline Table Valued Functions: Parameter Snorting.
The thing is, in contrast to scalar functions or multi-statement table-valued functions, the iTVF should have better performance because it will be expanded into the calling query. This way, SQL Server can use “more correct” estimates and create a plan for each different parameter. Well, today was not that day.
During testing, I took the actual query the iTVF was running and supplied the slow parameters to it, in the following format:
DECLARE @myParam INT = b; -- remember, b was the slow value SELECT * FROM myTable WHERE myColumn = @myParam ;
But now, the query was actually fast! Wait, what? So the exact same query from inside the function, with the same parameter value was running faster than running the function with the parameter? How is that possible? Luckily, Brent Ozar made a quick video to explain why SQL Server can have different execution plans for both scenarios:
To summarize: executing a query with local variables forces SQL Server to not inspect the contents of the variable and use the correct estimates.
This is good:
SELECT * FROM myTable WHERE myColumn = b;
This is bad:
DECLARE @myParam INT = b; SELECT * FROM myTable WHERE myColumn = @myParam;
But in my case, the exact opposite was true! The plan optimized for a random variable value was faster (hash join) than the plan optimized for the parameter value b (nested loops). Even SQL Server can have an off-day. And this leads us to the title of this blog post. With stored procedures, you can specify the hint OPTIMIZE FOR UNKNOWN. The concept is explained in this blog post by Kendra Little: Optimize for… Mediocre?.
Unfortunately, you cannot specify OPTIMIZE FOR UNKNOWN for an iTVF. So how do we solve this? We just call the iTVF using local variables!
DECLARE @myParam INT = b; SELECT * FROM dbo.myFunction(@myParam);
Now, SQL Server doesn’t use the plan specific for the value b, but rather the general plan for an average value, which just happens to use the hash join. Hooray.
Keep in mind this is likely a short-term fix. As the data in the table changes, the distribution of the values might change as well and at some point in time SQL Server might – incorrectly – decide to go for the nested loops plan for the “average value”. This means I might have to refactor that iTVF into something that has more consistent performance.
p.s.: hat-tip to Brent for assisting with this issue
I recently read the book Agile Data Warehouse Design - Collaborative Dimensional Modeling, from Whiteboard…
You can find the slides for the session Building the €100 data warehouse with the…
I was asked to do a review of the book Microsoft Power BI Performance Best…
This is a quick blog post, mainly so I have the code available if I…
Praise whatever deity you believe in, because it's finally here, a tenant switcher for Microsoft…
This book was making its rounds on social media, and the concept seems interesting enough…