Categories: Varia

Cool Stuff in Snowflake – Part 13: Row-Pattern Recognition

I’m doing a little series on some of the nice features/capabilities in Snowflake (the cloud data warehouse). In each part, I’ll highlight something that I think it’s interesting enough to share. It might be some SQL function that I’d really like to be in SQL Server, it might be something else.

In the book T-SQL Window Functions – For data analysis and beyond, Itzik Ben-Gan explains the concept of row-pattern recognition (RPR) in a dedicated chapter (you can find a full book review here). It’s a concept that doesn’t exist in T-SQL, but is described in the SQL standard and is available in some other database systems. Snowflake has recently introduced support for RPR. Itzik explains there are two main features defined:

  • R010: RPR in the FROM clause
  • R020: RPR in the WINDOW clause

Oracle has implemented R010 and thus Itzik uses this RDBMS to illustrate the examples. Snowflake has also implemented R010. It seems there are no database vendors yet that have R020.

But what is RPR? Using this feature, you can return rows that match a certain pre-defined pattern. For example stock prices following a V or a W shape or a user following a specific sequence of pages on your website. If you’re interested in more details, I can definitely recommend you the T-SQL Window Functions book, or you can go through the Snowflake documentation.

Let’s illustrate the concept with a query that finds a V-shaped pattern in stock data. Here’s some sample data:

CREATE OR REPLACE TABLE dbo.StockPrice(CompanyName VARCHAR(50), PriceDate date, Price int);

INSERT INTO dbo.StockPrice(CompanyName, PriceDate, Price)
VALUES
    ('A', '2021-08-01', 25),
    ('A', '2021-08-02', 31),
    ('A', '2021-08-03', 36),
    ('A', '2021-08-04', 37),
    ('A', '2021-08-05', 35),
    ('A', '2021-08-06', 32),
    ('A', '2021-08-07', 29),
    ('A', '2021-08-08', 22),
    ('A', '2021-08-09', 17),
    ('A', '2021-08-10', 15),
    ('A', '2021-08-11', 18),
    ('A', '2021-08-12', 26),
    ('A', '2021-08-13', 29),
    ('A', '2021-08-14', 33),
    ('A', '2021-08-15', 39),
    ('A', '2021-08-16', 42),
    ('A', '2021-08-17', 45),
    ('A', '2021-08-18', 44),
    ('A', '2021-08-19', 41),
    ('A', '2021-08-20', 40);

On a line chart, we get the following:

The V-pattern starts on August the 4nd and ends on August the 17th. In Snowflake, we can use the following SQL query to retrieve the matching pattern:

SELECT *
FROM dbo.StockPrice
MATCH_RECOGNIZE
(
    PARTITION BY CompanyName
    ORDER BY PriceDate
    MEASURES
         match_number()     AS match_number
        ,first(PriceDate)   AS start_date
        ,last(PriceDate)    AS end_date
        ,count(*)           AS rows_in_sequence
        ,count(B.*)         AS num_decreases
        ,count(C.*)         AS num_increases
        ,max(Price)         AS max_price
        ,min(Price)         AS min_price
    ONE ROW PER MATCH
    AFTER MATCH SKIP TO LAST C
    PATTERN(A B+ C+)
    DEFINE
         B as price < lag(Price)
        ,C as price > lag(Price)
)
ORDER BY CompanyName, match_number;

For an explanation of all the subclauses, you can find them in the documentation. As you can see, RPR can be really powerful and it can be seen as the next step in the evolution of window functions. If you would like them in T-SQL as well, you can vote here.


------------------------------------------------
Do you like this blog post? You can thank me by buying me a beer 🙂
Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.

Recent Posts

Free webinar – Tackling the Gaps and Islands Problem with T-SQL Window Functions

I'm hosting a free webinar at MSSQLTips.com at the 19th of December 2024, 6PM UTC.…

5 days ago

dataMinds Connect 2024 – Session Materials

The slides and scripts for my session "Tackling the Gaps & Islands Problem with T-SQL…

4 weeks ago

Connect to Power BI as a Guest User in another Tenant

Sometimes your Microsoft Entra ID account (formerly known as Azure Active Directory) is added as…

2 months ago

How to use a Script Activity in ADF as a Lookup

In Azure Data Factory (ADF, but also Synapse Pipelines and Fabric Pipelines), you have a…

4 months ago

Database Build Error – Incorrect syntax near DISTINCT

I wrote a piece of SQL that had some new T-SQL syntax in it: IS…

4 months ago

Speaking at dataMinds Connect 2024

I'm very excited to announce I've been selected as a speaker for dataMinds Connect 2024,…

5 months ago