Book Review – Agile Data Warehouse Design

I recently read the book Agile Data Warehouse Design – Collaborative Dimensional Modeling, from Whiteboard to Star Schema (quite the title) by Lawrence Corr and Jim Stagnitto. The book was recommended to me years ago by a former colleague (the book is from 2011, with the latest revision from 2014) and it has been sitting on my “to-read” list all this time. Quite frankly, I had forgotten about it, but Johnny Winter mentioned the book at a precon workshop, so I decided to finally read it.

TL;DR: my only regret after reading this book is that I didn’t read it sooner.

The book is amazing. It’s very clearly written, and its goal is to help you become a better data warehouse modeler, in the context of designing a DWH using an agile methodology. The book has planted itself firmly in my top 5 technology books (alongside The Data Warehouse Toolkit and Star Schema: The Complete Reference).

The book is divided into two parts. The first part talks about modelstorming, a brainstorming technique for data warehouse modeling that is introduced in this book, alongside the BEAM* framework. BEAM stands for Business Event Analysis & Modeling. Using 7 questions (who, what, when, where, how many, why and how: the 7Ws), you work together with IT and business stakeholders to define, analyze, model and document business processes. The result is a BEAM table (which kind of looks like a fact or dimension table, but without surrogate keys). Each row contains an example of what an event (for example, a customer buys a product) looks like. Because you use examples, the data becomes clearer for the business stakeholders. The BEAM table can be used as documentation, and it serves as a good foundation for designing the logical and physical layers of your data warehouse. This technique is well suited to an agile context, as you can model one type of event (which probably corresponds to one star schema) at a time, for example within a single sprint.
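To make the idea of a BEAM table a bit more concrete, here is a minimal sketch in Python. This is my own illustration, not notation from the book: the event, the column names, the sample rows and the helper functions `uncovered_ws` and `as_rows` are all made up. It only shows the general shape: every column is tagged with one of the 7Ws and every row is an example of the event.

```python
# Hypothetical BEAM-style table for the event "customer buys product".
# Column names and example rows are invented for illustration only.
SEVEN_WS = ["who", "what", "when", "where", "how many", "why", "how"]

beam_table = {
    "event": "customer buys product",
    # Each column is a (name, W-type) pair.
    "columns": [
        ("customer", "who"),
        ("product", "what"),
        ("order date", "when"),
        ("store", "where"),
        ("quantity", "how many"),
        ("promotion", "why"),
        ("sales channel", "how"),
    ],
    # Each row is one example of the event, in column order.
    "examples": [
        ("Alice Jones", "Espresso machine", "2024-03-01", "Antwerp", 1, "Spring sale", "Web shop"),
        ("Bob Smith", "Coffee beans 1kg", "2024-03-02", "Ghent", 3, "None", "In store"),
    ],
}


def uncovered_ws(table):
    """Return the 7W questions that no column answers yet."""
    covered = {w_type for _, w_type in table["columns"]}
    return [w for w in SEVEN_WS if w not in covered]


def as_rows(table):
    """Turn the example tuples into dictionaries keyed by column name."""
    names = [name for name, _ in table["columns"]]
    return [dict(zip(names, row)) for row in table["examples"]]


if __name__ == "__main__":
    print("Still to ask:", uncovered_ws(beam_table))
    for row in as_rows(beam_table):
        print(row)
```

In a workshop you would of course do this on a whiteboard or in a spreadsheet, not in code. The point is that a check like `uncovered_ws` mirrors what the 7Ws do in the conversation: they tell you which questions you haven't asked the stakeholders yet.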

In short, the first part of the book will help you to do better and more efficient “requirements gathering & design workshops” with your stakeholders.

The second part talks about dimensional design patterns, and dives deeper into how you can design your dimensions and fact tables better (roughly one or two questions from the 7Ws correspond to one chapter). Some of the content can probably be found in the Kimball books as well, but it was great to have a refresher. I definitely picked up a few nice design techniques, such as the hierarchy map pattern for dealing with parent-child relationships. Quite a few exotic use cases are dealt with in this part of the book, and working through them will make you a better data warehouse developer.
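As a rough illustration of the hierarchy map idea, here is a small Python sketch. It reflects my own simplified understanding of the pattern, not the exact table layout from the book: starting from parent-child pairs, you store one row for every ancestor-descendant combination together with the number of levels between them. The function name `build_hierarchy_map` and the sample organization are hypothetical.

```python
def build_hierarchy_map(parent_of):
    """Expand a parent-child mapping into (ancestor, descendant, depth) rows.

    parent_of maps each member to its parent (None for the root).
    Every member is also listed as its own ancestor at depth 0.
    """
    rows = []
    for member in parent_of:
        ancestor, depth = member, 0
        seen = set()
        while ancestor is not None:
            if ancestor in seen:
                raise ValueError(f"cycle detected at {ancestor!r}")
            seen.add(ancestor)
            rows.append((ancestor, member, depth))
            # Walk one level up; members without an entry are treated as roots.
            ancestor = parent_of.get(ancestor)
            depth += 1
    return rows


if __name__ == "__main__":
    # Made-up reporting line, for illustration only.
    org = {"CEO": None, "VP Sales": "CEO", "Account Manager": "VP Sales"}
    for row in build_hierarchy_map(org):
        print(row)
```

With a table like this, rolling facts up to any level of a ragged hierarchy becomes a plain join on the ancestor column instead of a recursive query.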

In conclusion: if you want to be a better, more agile data warehouse developer, then I absolutely recommend this book. I do recommend that you already have a couple of years of experience under your belt, as it will make you appreciate the book more.


Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients gain insight into their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.
