Technical Deep Dive

Star Schema or Snowflake? The Dimensional Modelling Decision

Kimball's classic debate is still meaningful today: star versus snowflake schema and what actually changes in the cloud era.

BIART Team · 2 min read
[Figure: data model diagram]

Since Ralph Kimball's "The Data Warehouse Toolkit" was published in 1996, the star schema has been the gold standard for analytical data models. With the rise of modern cloud platforms, however, the snowflake schema — and even more radical shapes — are back on the agenda. What should the decision actually rest on?

Star Schema

A central fact table (e.g. sales_fact) surrounded by denormalised dimension tables (dim_customer, dim_product, dim_date). Each dimension lives in a single table; JOINs are kept to a minimum. The upside: simplicity, performance, and the natural shape BI tools expect.

              dim_customer
                   │
dim_product ── sales_fact ── dim_date
                   │
               dim_store
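The star shape above can be sketched in a few lines of SQL. This is a minimal, hypothetical illustration (table and column names are invented for the example, not prescribed by the article), using SQLite so it runs anywhere:

```python
import sqlite3

# Minimal star-schema sketch: one fact table, each dimension a single
# denormalised table, one JOIN away from the fact.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE sales_fact   (customer_key INTEGER, product_key INTEGER, amount REAL);
INSERT INTO dim_customer VALUES (1, 'Acme', 'Istanbul');
INSERT INTO dim_product  VALUES (10, 'Widget', 'Hardware');
INSERT INTO sales_fact   VALUES (1, 10, 100.0), (1, 10, 50.0);
""")

# The defining star property: every dimension is reachable with exactly one JOIN.
row = con.execute("""
    SELECT c.city, p.category, SUM(f.amount)
    FROM sales_fact f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    JOIN dim_product  p ON p.product_key  = f.product_key
    GROUP BY c.city, p.category
""").fetchone()
print(row)  # ('Istanbul', 'Hardware', 150.0)
```

This flat JOIN pattern is exactly what BI tools generate from a star model.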

Snowflake Schema

Dimensions are normalised into chains such as dim_product → dim_category → dim_department. Upside: less storage, easier maintenance. Downside: more JOINs, more complex queries, heavier BI semantic modelling.
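The JOIN cost is easy to see in a sketch. Under the same hypothetical naming as above (invented for illustration), reaching the department level now requires a chain of JOINs through the normalised hierarchy:

```python
import sqlite3

# Snowflake sketch: the product dimension is split across a normalised chain.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_department (department_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_category   (category_key INTEGER PRIMARY KEY, name TEXT,
                             department_key INTEGER REFERENCES dim_department);
CREATE TABLE dim_product    (product_key INTEGER PRIMARY KEY, name TEXT,
                             category_key INTEGER REFERENCES dim_category);
CREATE TABLE sales_fact     (product_key INTEGER, amount REAL);
INSERT INTO dim_department VALUES (1, 'Electronics');
INSERT INTO dim_category   VALUES (5, 'Phones', 1);
INSERT INTO dim_product    VALUES (10, 'Model X', 5);
INSERT INTO sales_fact     VALUES (10, 200.0);
""")

# Reporting by department now takes three JOINs instead of one.
row = con.execute("""
    SELECT d.name, SUM(f.amount)
    FROM sales_fact f
    JOIN dim_product    p ON p.product_key    = f.product_key
    JOIN dim_category   c ON c.category_key   = p.category_key
    JOIN dim_department d ON d.department_key = c.department_key
    GROUP BY d.name
""").fetchone()
print(row)  # ('Electronics', 200.0)
```

Each extra hop is one more relationship the BI semantic layer has to model.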

Modern Cloud Perspective

In the era of columnar storage (Snowflake, BigQuery, Synapse), the size of a denormalised table is no longer as important as it once was. A 1 TB dim_customer can compress down to 50 GB with columnar compression, which seriously weakens the "snowflake saves space" argument.

On the other hand, platforms that bill by data scanned, such as GCP BigQuery and AWS Athena, make every extra snowflake JOIN read additional tables; on those platforms the star schema tends to be the more economical choice.

A Practical Guide

  • BI-first DWH (Power BI, Tableau, Qlik): star schema — native integration, consistent performance.
  • Data-science + analytics hybrid: star, but with small linked tables for the hierarchies inside dimensions (a light snowflake).
  • Highly dynamic product catalogue (retail, e-commerce): a denormalised dim_product incurs a performance penalty on daily rebuilds; either go incremental or snowflake.
  • Regulatory reports: snapshot facts plus time-dated SCD Type 2 dimensions (on top of either model).
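The SCD Type 2 pattern mentioned for regulatory reports can be sketched in a few lines. Column names such as `valid_from`/`valid_to` are illustrative conventions, not prescribed by the article:

```python
from datetime import date

# SCD Type 2 sketch: each change to a dimension attribute creates a new row
# with a validity window instead of overwriting the old value.
dim_customer = [
    {"customer_id": 1, "city": "Ankara",
     "valid_from": date(2020, 1, 1), "valid_to": date(2022, 6, 30)},
    {"customer_id": 1, "city": "Istanbul",
     "valid_from": date(2022, 7, 1), "valid_to": date(9999, 12, 31)},
]

def version_as_of(rows, customer_id, as_of):
    """Point-in-time lookup: the row that was valid on the given date."""
    for r in rows:
        if r["customer_id"] == customer_id and r["valid_from"] <= as_of <= r["valid_to"]:
            return r
    return None

# A report dated 2021 sees the historical city, not the current one.
print(version_as_of(dim_customer, 1, date(2021, 3, 15))["city"])  # Ankara
print(version_as_of(dim_customer, 1, date(2023, 1, 1))["city"])   # Istanbul
```

Joining a snapshot fact to the dimension version valid on the snapshot date is what makes regulatory reports reproducible after the fact.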

Fields That Should Be Denormalised

  • Date hierarchies (year/quarter/month/day): always denormalised in dim_date.
  • Customer demographics: usually flattened into dim_customer.
  • Product category paths: if there are four or more levels, snowflake may win.
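A denormalised dim_date, the first bullet above, is the simplest case to show: the whole year/quarter/month/day hierarchy is computed once and stored flat on every row. A minimal sketch (the generator function is hypothetical):

```python
from datetime import date, timedelta

def build_dim_date(start, days):
    """Generate denormalised dim_date rows: the full calendar hierarchy
    lives on every row, so queries never JOIN to a separate hierarchy table."""
    rows = []
    for i in range(days):
        d = start + timedelta(days=i)
        rows.append({
            "date_key": int(d.strftime("%Y%m%d")),  # surrogate key, e.g. 20240330
            "year": d.year,
            "quarter": (d.month - 1) // 3 + 1,
            "month": d.month,
            "day": d.day,
        })
    return rows

# Three days spanning a quarter boundary:
dim_date = build_dim_date(date(2024, 3, 30), 3)
print(dim_date[0])  # {'date_key': 20240330, 'year': 2024, 'quarter': 1, 'month': 3, 'day': 30}
print(dim_date[2]["quarter"])  # 2
```

Because the calendar never changes retroactively, the usual downside of denormalisation (update ripple) does not apply here.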

The One Big Table (OBT) Approach

Modern ML pipelines and some self-service BI scenarios prefer a single wide table to a dimensional model. No JOINs, full context on a single row. The downside is steep maintenance cost; any change ripples through the whole table. Our recommendation: use OBT only for ML feature stores — for general BI, dimensional modelling still wins.
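Conceptually, building an OBT is just pre-joining the fact with all of its dimensions into one wide row. A toy sketch with invented names (in practice this is typically a CREATE TABLE AS SELECT over the star model):

```python
# OBT sketch: flatten a star model into one wide, denormalised table.
# Dimension lookups are plain dicts keyed by surrogate key.
dim_customer = {1: {"customer_name": "Acme", "city": "Istanbul"}}
dim_product = {10: {"product_name": "Widget", "category": "Hardware"}}
sales_fact = [{"customer_key": 1, "product_key": 10, "amount": 150.0}]

# Every fact row is merged with its dimension attributes: no JOINs at query
# time, full context on a single row.
obt = [
    {**f, **dim_customer[f["customer_key"]], **dim_product[f["product_key"]]}
    for f in sales_fact
]
print(obt[0]["city"], obt[0]["category"], obt[0]["amount"])  # Istanbul Hardware 150.0
```

The maintenance cost is visible even here: rename a dimension attribute and every OBT row must be rebuilt, which is why the article recommends restricting OBT to ML feature stores.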

Takeaway

Default to the star schema: on modern cloud platforms it remains the safest choice for BI workloads. Normalise into a snowflake only where deep hierarchies or volatile dimensions genuinely justify the extra JOINs, and reserve One Big Table for ML feature stores.
