Data Modeling Best Practices for Power BI¶

Overview¶

Design star schemas around a central fact table with surrounding dimension tables.
Use robust relationships (single-direction where possible), prefer whole-number surrogate keys, and plan for scale with incremental refresh/partitioning in enterprise models.

Do’s & Don’ts¶

Do

Model as a star schema for clarity and performance.
Ensure one-to-many relationships with unique dimension keys.
Use integer surrogate keys (Primary Keys, Foreign Keys); avoid text/GUID keys for relationships. Read more here: Surrogate, Primary & Foreign Keys
Use Date dimension(s); mark it as a Date table.
Declare the grain of each fact table (e.g., "one row per order line") to avoid confusion.
Add an Unknown / Not Applicabl key (e.g., -1) in dimensions to handle orphaned fact rows.
Build dimensions wide enough with relevant attributes, but avoid unnecessary snowflaking.
Use bridge tables for many-to-many instead of direct relationships.
Enable Incremental Refresh on large tables with stable "last modified" columns for efficiency.
Optimize column storage:
Remove unused columns.
Use whole numbers where possible.
Avoid high-cardinality text fields in facts.
Periodically review models with VertiPaq Analyzer / Performance Analyzer to identify bottlenecks.
Apply Row-Level Security (RLS) on dimensions (not facts) for better performance.
Enforce schema/type checks in Power Query or upstream to ensure data quality.
Surface a last refresh timestamp and data dictionary page to improve trust.
Keep reports thin: heavy logic belongs in the dataset, not in visuals.

Grain declaration

F_Sales: One row per order line
- OrderID
- OrderLineID
- ProductKey
- CustomerKey
- SalesAmount

This makes it clear that "OrderID" alone is not unique in the fact table.

Incremental Refresh policy

Policy:
- Keep data for last 5 years
- Refresh last 7 days
- Detect changes on [ModifiedDate]

This balances query performance with manageable refresh times.

Don’t

Don’t build one giant "flat" table for everything.
Don’t snowflake dimensions unless reuse or maintenance clearly requires it.
Don’t enable bi-directional relationships by default. Bi-directional relationships can lead to ambiguity. If bi-directional relationships are required, override the filtering behaviour in your measure by using CROSSFILTER¹.
Don’t apply RLS with complex filters directly on large fact tables.
Don’t leave large descriptive text or high-cardinality columns in facts - split them into separate detail tables if needed.
Don’t hard-code data sources; use parameters for environment portability.
Don’t deploy straight to production; always validate in test first.

CROSSFILTER and USERELATIONSHIP are not supported in models using Row-Level Security (RLS). ↩