
A Data Quality Framework: Six Dimensions, Measurable Metrics and Operational Practice

Data quality is not an abstract goal. It is governed across six measurable dimensions: accuracy, completeness, consistency, timeliness, uniqueness, and validity, each with a concrete metric formula and the tooling to enforce it.

BIART Team

“Our data is dirty” is heard in every CDO meeting. The problem is that “dirty” is an abstract verdict. To actually manage data quality you need a framework of six measurable dimensions: accuracy, completeness, consistency, timeliness, uniqueness, and validity. Each has a concrete formula and an operational form you can audit every day.

Six dimensions, six metrics

  • Accuracy: records match the real world. Metric: percentage of records that match a verifiable source (e.g. national ID service, IBAN validator).
  • Completeness: required fields are populated. Metric: not-null rate plus business-conditional completeness (“corporate customer must have a tax number”).
  • Consistency: the same entity looks the same across systems. Metric: percentage of mismatched rows across source systems (e.g. address differences between CRM and core banking).
  • Timeliness: data is available within the expected freshness. Metric: source-to-analytics lag in minutes (p95) against the SLA.
  • Uniqueness: an entity is not duplicated. Metric: deterministic (key) and probabilistic (entity resolution) duplication rate.
  • Validity: values conform to type/format/range rules. Metric: count of schema/regex/range violations.
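The list above can be sketched as plain metric functions. A minimal example, with illustrative field names and toy records (not from any real system), computing completeness, validity, and deterministic duplication:

```python
import re

# Toy customer records; field names are illustrative assumptions.
records = [
    {"id": 1, "email": "a@example.com", "tax_no": "1234567890"},
    {"id": 2, "email": None,            "tax_no": "0987654321"},
    {"id": 2, "email": "c@example",     "tax_no": "1111111111"},
]

def completeness(rows, field):
    """Not-null rate for a required field."""
    return sum(r[field] is not None for r in rows) / len(rows)

def validity(rows, field, pattern):
    """Share of non-null values matching a regex rule."""
    vals = [r[field] for r in rows if r[field] is not None]
    return sum(bool(re.fullmatch(pattern, v)) for v in vals) / len(vals)

def duplication_rate(rows, key):
    """Deterministic duplication: 1 - distinct keys / total rows."""
    return 1 - len({r[key] for r in rows}) / len(rows)

print(completeness(records, "email"))                     # 2/3 populated
print(validity(records, "email", r"[^@]+@[^@]+\.[^@]+"))  # 1 of 2 valid
print(duplication_rate(records, "id"))                    # duplicate id 2
```

Accuracy and timeliness need an external reference (a verifiable source, a freshness timestamp) and so cannot be computed from the rows alone.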

Automation tooling

dbt’s native tests (unique, not_null, accepted_values, relationships) cover the basics; complex business rules go into custom singular tests. Great Expectations or Soda Core are ideal for flows independent of dbt (for example, before raw data lands in Snowflake). dbt + Soda lets you place checks at every point of the transformation pipeline.
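What these tools automate is, at its core, a rule runner. A hedged sketch in plain Python, not the API of dbt, Great Expectations, or Soda (the check names and rows are illustrative):

```python
# Toy rows to check; field names are illustrative assumptions.
rows = [
    {"order_id": 10, "status": "shipped", "amount": 120.0},
    {"order_id": 11, "status": "unknown", "amount": -5.0},
]

checks = {
    # accepted_values-style check
    "status_accepted": lambda r: r["status"] in {"new", "shipped", "cancelled"},
    # range/validity-style check
    "amount_non_negative": lambda r: r["amount"] >= 0,
}

def run_checks(rows, checks):
    """Return {check_name: number_of_failing_rows}, like a test summary."""
    return {name: sum(not rule(r) for r in rows) for name, rule in checks.items()}

print(run_checks(rows, checks))  # {'status_accepted': 1, 'amount_non_negative': 1}
```

The real tools add what this sketch lacks: SQL pushdown to the warehouse, scheduling, and alert routing.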

The Data Contract approach

A paradigm that matured in 2026: a signed agreement between the data producer and consumer. The producer commits via a testable contract not to break consumers when the schema changes. Open-source implementations have matured; Schemata and the Datacontract.com templates lead in practical adoption.
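The essence of a testable contract is a machine-checkable schema the producer validates before publishing. A minimal sketch, assuming a dict-based contract of field names to types (real implementations such as Schemata and the Datacontract.com templates are far richer, covering semantics, SLAs, and ownership):

```python
# Hypothetical contract: field name -> expected Python type.
contract = {"customer_id": int, "email": str, "signup_ts": str}

def breaks_contract(contract, record):
    """Return the list of violations a schema change would introduce."""
    issues = []
    for field, typ in contract.items():
        if field not in record:
            issues.append(f"missing field: {field}")
        elif not isinstance(record[field], typ):
            issues.append(f"type drift: {field}")
    return issues

# A producer-side change that would silently break consumers:
print(breaks_contract(contract, {"customer_id": "42", "email": "a@b.com"}))
# ['type drift: customer_id', 'missing field: signup_ts']
```

Run in CI on the producer side, a non-empty result blocks the deploy instead of breaking the consumer in production.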

Production SLA

Putting numbers on a dashboard is not enough. Each dimension needs the triplet of threshold, alarm, and owner: if accuracy drops below 95%, which team responds within what time window, and who escalates. The SLO mindset has reached data teams too; reliability engineering is now a real discipline on the data side.
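The triplet can be encoded directly in configuration and evaluated on every metric run. A sketch with illustrative thresholds, team names, and escalation windows (all assumptions, not prescriptions):

```python
# Hypothetical SLOs: dimension -> threshold + alarm owner + escalation window.
slos = {
    "accuracy":   {"threshold": 0.95, "owner": "crm-data-team", "escalate_min": 30},
    "timeliness": {"threshold": 0.99, "owner": "platform-team", "escalate_min": 15},
}

def evaluate(slos, measurements):
    """Return one alert per dimension that fell below its threshold."""
    return [
        {"dimension": d, "owner": s["owner"], "escalate_min": s["escalate_min"]}
        for d, s in slos.items()
        if measurements.get(d, 0.0) < s["threshold"]
    ]

alerts = evaluate(slos, {"accuracy": 0.93, "timeliness": 0.995})
print(alerts)  # one alert: accuracy, owned by crm-data-team, 30-min escalation
```

Routing each alert to the named owner (rather than a shared channel) is what turns the dashboard number into an operational practice.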
