Reblog: Star Schema The Complete Reference – Review

SQLKover update: 1.5 year later, I still stand by my position: Star Schema The Complete Reference is the best book on dimensional modelling around. Recommended/obligatory reading!

I finished my first book of this year and here’s the review! After I finished the newest edition of Ralph Kimball’s Data Warehouse Toolkit, a senior colleague of mine recommended the book Star Schema The Complete Reference by Christopher Adamson. To be honest, I had never heard about it. But was I wrong: this book is an absolute must read for every data warehouse professional who takes himself/herself seriously. The book is extremely well written and it explains the concepts of star schemas (hence the title of course) very thoroughly with great detailed examples. I would even go as far by proclaiming that I like this book better than the “Toolkit”. If I had to choose only one book to take with me on new projects, it would be this one.

One of the strengths of this book is that it is vendor independent and also architecture independent. The book begins by introducing how dimensional modelling is used in the methodologies used by Inmon and Kimball and it debunks quite a few myths that exist about both methodologies. It also explains the concept of stand-alone data marts and which advantages and especially which disadvantages this technique has. Throughout the book, the author explains how the different methodologies can have an impact on how certain concepts are implemented.

Another great piece was the explanation of drilling-across and the importance of it for cross analysis of different process. While reading Kimball’s book, I didn’t pick up the significance of this, but because of the attention Christopher gives to conformed dimensions – a whole chapter – I could really put it in perspective. The different types of conformity are well explained and are illustrated with clear examples. The book also gives different ways of drilling across and provides a few code samples.

There are the usual chapters that you expect from a book like this: dimension design, slowly changing attributes, snowflaking, different kinds of fact tables et cetera. But there is the occasional deep dive: multi-valued dimensions/ attributes and how they relate to bridge tables, type-specific stars, recursive hierarchies and so on. He even explains how to handle all these exotic features when there are slowly changing attributes, but unfortunately he doesn’t back it up with detailed examples. It’s the only weak point of the book, but making the book too detailed would perhaps make it too long and harder to read. And hopefully we never have to implement a slowly changing many to many relationship.

There are also some chapters on performance enhancements, such as derived tables and aggregates. These solutions might become somewhat obsolete in the past years thanks to hardware advancements and the introduction of columnar and in-memory databases, but since not every company has Enterprise edition these tools might still be very useful.

The last three chapters are more about how to design the ETL, how to work with BI tools and how to document models. Personally I found those less interesting, but there’s still some value in it.

Conclusion

It’s a great book about designing data warehouses using star schemas. Buy it, read it and take it with you on every data warehouse project you encounter. You won’t regret it.


------------------------------------------------
Do you like this blog post? You can thank me by buying me a beer 🙂
Koen Verbeeck

Koen Verbeeck is a Microsoft Business Intelligence consultant at AE, helping clients to get insight in their data. Koen has a comprehensive knowledge of the SQL Server BI stack, with a particular love for Integration Services. He's also a speaker at various conferences.

View Comments

Recent Posts

Free webinar – Tackling the Gaps and Islands Problem with T-SQL Window Functions

I'm hosting a free webinar at MSSQLTips.com at the 19th of December 2024, 6PM UTC.…

6 days ago

dataMinds Connect 2024 – Session Materials

The slides and scripts for my session "Tackling the Gaps & Islands Problem with T-SQL…

4 weeks ago

Connect to Power BI as a Guest User in another Tenant

Sometimes your Microsoft Entra ID account (formerly known as Azure Active Directory) is added as…

2 months ago

How to use a Script Activity in ADF as a Lookup

In Azure Data Factory (ADF, but also Synapse Pipelines and Fabric Pipelines), you have a…

4 months ago

Database Build Error – Incorrect syntax near DISTINCT

I wrote a piece of SQL that had some new T-SQL syntax in it: IS…

4 months ago

Speaking at dataMinds Connect 2024

I'm very excited to announce I've been selected as a speaker for dataMinds Connect 2024,…

5 months ago