Skip to content
plainsight.pro

Analytical Data Store (ADS)

Core Concept

The Analytical Data Store (ADS) is our Medallion Silver layer. It bridges Landing/Staging (Bronze) and the Gold layer (business products) by turning raw, source-aligned data into clean, integrated, and history-aware entities that dimensional models, feature stores, and downstream products can rely on.

Why "Analytical Data Store"?

We keep the semantic name ADS to make responsibilities explicit while aligning it to the Medallion Silver layer.

ADS vs Operational Data Store (ODS)

Aspect Operational Data Store (ODS) Analytical Data Store (ADS)
Purpose Near real-time operational reporting and monitoring Integrated, cleansed data for analytics and history
Update frequency High-frequency (near real-time) Scheduled batch or micro-batch
Data structure Often normalized, optimized for transactions Denormalized, optimized for analysis and integration
Usage pattern Operational queries (current state) Analytical queries (trends, history, cross-source integration)
Historical tracking Limited or none Snapshot-friendly; SCD-ready
Primary consumers Operational systems, dashboards Dimensional models, feature stores, data scientists

ADS vs Dimensional Model (Dimensions & Facts)

Aspect Analytical Data Store (ADS) Dimensional Model (Dims & Facts)
Focus Integration and data quality Business-optimized consumption
Structure Denormalized entities Star schema (Facts + Dimensions)
SCD approach Snapshot tables can track all attributes (SCD2) Dimensions selectively track attributes (Type 0/½ per need)
Granularity Entity-level (one table per entity) Business-process level (facts + supporting dimensions)
Consumers Data engineers, analytics engineers Business users, BI tools, ML models

Separation of concerns

ADS carries the heavy lifting for conformance, history, and integration. Gold-layer models focus on business meaning and performance rather than rebuilding history logic.

Purpose and Core Transformations

ADS sits between the Staging/Intermediate steps and the Dimensional Model. It feeds Gold-layer outputs (Dims/Facts, OBTs, Feature Stores) and Master Data.

  • Data quality validation: Enforce rules, quarantine errors.
  • Denormalization: Flatten normalized structures for usability.
  • Multi-source integration: Merge overlapping entities from multiple systems.
  • History readiness: Provide SCD-friendly snapshots so downstream models do not reimplement change tracking.

Table Types: Base and Snapshot

Table type Purpose Rows per entity History approach Naming convention
Base tables Current, cleansed state One current row Overwrite changes (Type 1) ADS_<Entity> (for example ADS_Customer)
Snapshot tables Historical versions Multiple rows per entity Track all attribute changes (Type 2) ADS_<Entity>_Snapshot (for example ADS_Customer_Snapshot)

Snapshot vs Dimension

ADS snapshots capture every attribute change so downstream consumers can choose how to model history. Dimension tables in Gold selectively track only the attributes that matter to the business.

Base Tables (current state)

Characteristics - One row per entity (latest state) - Denormalized structure with data quality enforced - Multi-source integration applied

Example

CREATE TABLE ADS_Customer
(
    CustomerID INT NOT NULL PRIMARY KEY,
    CustomerNumber VARCHAR(20) NOT NULL,
    CustomerName VARCHAR(100) NOT NULL,
    AddressLine1 VARCHAR(100),
    City VARCHAR(50),
    StateProvince VARCHAR(50),
    Country VARCHAR(50),
    CustomerSegment VARCHAR(20),
    CreditRating VARCHAR(10),
    T_CreatedRunId UNIQUEIDENTIFIER NOT NULL,
    T_ModifiedRunId UNIQUEIDENTIFIER NOT NULL,
    T_CreatedDateTime DATETIME NOT NULL,
    T_ModifiedDateTime DATETIME NOT NULL
);

Snapshot Tables (historical tracking)

Characteristics - Multiple rows per entity (per version) - T_ValidFromDateTime, T_ValidToDateTime, T_IsCurrent columns - Default approach: track all attribute changes with SCD Type 2 (adjust if justified)

Example

CREATE TABLE ADS_Customer_Snapshot
(
    CustomerSnapshotID INT NOT NULL PRIMARY KEY,
    CustomerID INT NOT NULL,
    CustomerNumber VARCHAR(20) NOT NULL,
    CustomerName VARCHAR(100) NOT NULL,
    AddressLine1 VARCHAR(100),
    City VARCHAR(50),
    StateProvince VARCHAR(50),
    Country VARCHAR(50),
    CustomerSegment VARCHAR(20),
    CreditRating VARCHAR(10),
    T_ValidFromDateTime DATETIME NOT NULL,
    T_ValidToDateTime DATETIME NULL,
    T_IsCurrent BIT NOT NULL,
    T_CreatedRunId UNIQUEIDENTIFIER NOT NULL,
    T_ModifiedRunId UNIQUEIDENTIFIER NOT NULL,
    T_CreatedDateTime DATETIME NOT NULL,
    T_ModifiedDateTime DATETIME NOT NULL
);

When to create snapshots

Create a snapshot only when downstream analysis needs attribute change history, compliance requires it, or dimensions will consume the history. Transaction-style tables (for example invoices) are already point-in-time and rarely need snapshots.

Key Transformations in Silver (ADS)

Data quality validation

  • Schema enforcement and required field checks
  • Referential integrity where applicable
  • Business rule validation (for example OrderDate <= ShipDate)

Progressive denormalization

Before (normalized) After (denormalized)
Customer + Address tables Customer with embedded address columns
Product + Category + Subcategory Product with category/subcategory columns
Monthly columns (Jan, Feb, Mar...) Month + Value rows (unpivoted)

Source system integration

  • Deduplicate across systems.
  • Standardize codes and naming.
  • Resolve authority (which source wins on conflicts).
  • Assign surrogate keys for downstream joins.

History tracking with snapshots

CustomerSnapshotID CustomerID Region ValidFrom ValidTo IsCurrent
1001 1 East 2024-01-01 00:00:00 2025-03-15 00:00:00 0
1002 1 West 2025-03-15 00:00:00 NULL 1

Common Use Cases

  • Multi-source customer, product, or supplier integration.
  • Feeding Master Data with clean entities.
  • Providing SCD-ready sources for dimension builds.
  • Enabling point-in-time analysis for analysts and data scientists.

Best Practices

Practice Why it matters
Document transformation rules Future maintainers need to understand ADS logic
Consistent naming ADS_<Entity> for base, ADS_<Entity>_Snapshot for history
Quality gates before entry Stop bad data before it pollutes Silver/Gold
Balance denormalization Flatten for usability without losing modeling flexibility
Selective snapshotting Only create snapshots when history is needed
Sync base and snapshot Ensure base updates create matching snapshot versions
Rebuildable from Staging Preserve reload paths from Landing/Staging for recovery