I had a curious performance issue today. An inline table-valued function (iTVF) was performing poorly for some parameter sets, and quite fast for other parameter values. In short, this one is fast:
SELECT * FROM dbo.myFunction(a);
This one is slow:
SELECT * FROM dbo.myFunction(b);
Turns out SQL Server used a plan with a hash join in the fast query, and a nested loop in the slow query. Due to SQL Server also wildly using incorrect estimates, the nested loops performs really poorly. Quite similar to parameter sniffing with stored procedures. Erik Darling has written a great article about it: Inline Table Valued Functions: Parameter Snorting.
The thing is, in contrast to scalar functions or multi-statement table-valued functions, the iTVF should have better performance because it will be expanded into the calling query. This way, SQL Server can use “more correct” estimates and create a plan for each different parameter. Well, today was not that day.
During testing, I took the actual query the iTVF was running and supplied the slow parameters to it, in the following format:
DECLARE @myParam INT = b; -- remember, b was the slow value SELECT * FROM myTable WHERE myColumn = @myParam ;
But now, the query was actually fast! Wait, what? So the exact same query from inside the function, with the same parameter value was running faster than running the function with the parameter? How is that possible? Luckily, Brent Ozar made a quick video to explain why SQL Server can have different execution plans for both scenarios:
To summarize: executing a query with local variables forces SQL Server to not inspect the contents of the variable and use the correct estimates.
This is good:
SELECT * FROM myTable WHERE myColumn = b;
This is bad:
DECLARE @myParam INT = b; SELECT * FROM myTable WHERE myColumn = @myParam;
But in my case, the exact opposite was true! The plan optimized for a random variable value was faster (hash join) than the plan optimized for the parameter value b (nested loops). Even SQL Server can have an off-day. And this leads us to the title of this blog post. With stored procedures, you can specify the hint OPTIMIZE FOR UNKNOWN. The concept is explained in this blog post by Kendra Little: Optimize for… Mediocre?.
Unfortunately, you cannot specify OPTIMIZE FOR UNKNOWN for an iTVF. So how do we solve this? We just call the iTVF using local variables!
DECLARE @myParam INT = b; SELECT * FROM dbo.myFunction(@myParam);
Now, SQL Server doesn’t use the plan specific for the value b, but rather the general plan for an average value, which just happens to use the hash join. Hooray.
Keep in mind this is likely a short-term fix. As the data in the table changes, the distribution of the values might change as well and at some point in time SQL Server might – incorrectly – decide to go for the nested loops plan for the “average value”. This means I might have to refactor that iTVF into something that has more consistent performance.
p.s.: hat-tip to Brent for assisting with this issue
I'm hosting a free webinar at MSSQLTips.com at the 19th of December 2024, 6PM UTC.…
The slides and scripts for my session "Tackling the Gaps & Islands Problem with T-SQL…
Sometimes your Microsoft Entra ID account (formerly known as Azure Active Directory) is added as…
In Azure Data Factory (ADF, but also Synapse Pipelines and Fabric Pipelines), you have a…
I wrote a piece of SQL that had some new T-SQL syntax in it: IS…
I'm very excited to announce I've been selected as a speaker for dataMinds Connect 2024,…