Optimize for Unknown for Inline Table-Valued Functions

I had a curious performance issue today. An inline table-valued function (iTVF) was performing poorly for some parameter sets, and quite fast for other parameter values. In short, this one is fast:

SELECT * FROM dbo.myFunction(a);

This one is slow:

SELECT * FROM dbo.myFunction(b);

Turns out SQL Server used a plan with a hash join in the fast query, and a nested loop in the slow query. Due to SQL Server also wildly using incorrect estimates, the nested loops performs really poorly. Quite similar to parameter sniffing with stored procedures. Erik Darling has written a great article about it: Inline Table Valued Functions: Parameter Snorting.

The thing is, in contrast to scalar functions or multi-statement table-valued functions, the iTVF should have better performance because it will be expanded into the calling query. This way, SQL Server can use “more correct” estimates and create a plan for each different parameter. Well, today was not that day.

During testing, I took the actual query the iTVF was running and supplied the slow parameters to it, in the following format:

DECLARE @myParam INT = b; -- remember, b was the slow value
SELECT *
FROM myTable
WHERE myColumn = @myParam ;

But now, the query was actually fast! Wait, what? So the exact same query from inside the function, with the same parameter value was running faster than running the function with the parameter? How is that possible? Luckily, Brent Ozar made a quick video to explain why SQL Server can have different execution plans for both scenarios:

To summarize: executing a query with local variables forces SQL Server to not inspect the contents of the variable and use the correct estimates.
This is good:

SELECT *
FROM myTable
WHERE myColumn = b;

This is bad:

DECLARE @myParam INT = b;
SELECT *
FROM myTable
WHERE myColumn = @myParam;

But in my case, the exact opposite was true! The plan optimized for a random variable value was faster (hash join) than the plan optimized for the parameter value b (nested loops). Even SQL Server can have an off-day. And this leads us to the title of this blog post. With stored procedures, you can specify the hint OPTIMIZE FOR UNKNOWN. The concept is explained in this blog post by Kendra Little: Optimize for… Mediocre?.

Unfortunately, you cannot specify OPTIMIZE FOR UNKNOWN for an iTVF. So how do we solve this? We just call the iTVF using local variables!

DECLARE @myParam INT = b;
SELECT * FROM dbo.myFunction(@myParam);

Now, SQL Server doesn’t use the plan specific for the value b, but rather the general plan for an average value, which just happens to use the hash join. Hooray.

Keep in mind this is likely a short-term fix. As the data in the table changes, the distribution of the values might change as well and at some point in time SQL Server might – incorrectly – decide to go for the nested loops plan for the “average value”. This means I might have to refactor that iTVF into something that has more consistent performance.

p.s.: hat-tip to Brent for assisting with this issue


------------------------------------------------
Do you like this blog post? You can thank me by buying me a beer 🙂

Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.