Technical Deep Dive

From Natural Language to Verified SQL: The Role of the Semantic Layer

BI Copilots succeed not because of the LLM brand but because of a mature semantic layer. How to build a real natural-language analytics architecture.

BIART Team · 3 min read
[Image: semantic layer and BI Copilot architecture]

Most BI Copilots that shipped between 2024 and 2025 hit the same wall: a question like "top-5 most profitable branches last month" gets translated by the LLM into syntactically correct but semantically wrong SQL. "Branch profitability" is interpreted by the LLM rather than by the organisation’s own definitions. The fix is not a different LLM brand — it is a semantic layer sitting in between.

What is a semantic layer?

The semantic layer translates between the raw shape of tables and columns and the language used by the business. Three core blocks:

  • EntityMap: which table represents which business entity (e.g. dbo.customers → "Customer").
  • MetricDefinition: pins one formula to one KPI (e.g. "Branch profitability = SUM(profit_amount) WHERE branch_id = X").
  • BusinessTerm: the corporate glossary (e.g. "PEP", "AML threshold", "VIP segment").

Around these sit ColumnPolicy (visible / masked columns) and certified few-shot examples.
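
To make this concrete, here is a minimal sketch of how these blocks might be modelled in code. The class and field names are illustrative assumptions, not a fixed schema:

    from dataclasses import dataclass, field

    @dataclass
    class EntityMap:
        # Maps a physical table to the business entity it represents.
        table: str                 # e.g. "dbo.customers"
        entity: str                # e.g. "Customer"

    @dataclass
    class MetricDefinition:
        # Pins exactly one formula to one KPI.
        name: str                  # e.g. "Branch profitability"
        expression: str            # e.g. "SUM(profit_amount)"
        grain: str                 # e.g. "branch_id"

    @dataclass
    class BusinessTerm:
        # One corporate glossary entry, e.g. "PEP" or "AML threshold".
        term: str
        definition: str

    @dataclass
    class ColumnPolicy:
        # Which columns a role may see, and which must be masked.
        role: str
        visible_columns: list[str] = field(default_factory=list)
        masked_columns: list[str] = field(default_factory=list)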

The NL → SQL flow

A BI Copilot built on a semantic layer typically runs this flow:

  1. Intent: data query, KPI explanation, glossary lookup?
  2. Retrieve: vector search pulls the relevant EntityMap, MetricDefinition and prior examples (few-shots).
  3. Plan: the LLM emits a QuerySpec JSON — tables, columns, filters, groupings, ordering. Not SQL.
  4. Validate: QuerySpec is checked against the semantic layer and the catalog — allow-list of columns, correct join paths, PII control.
  5. Synthesize: QuerySpec compiles into parameterised, read-only SQL.
  6. Execute: runs in a sandbox under a separate read-only DB account.
  7. Mask: ColumnPolicy applies role-aware masking.
  8. Narrate: a narrator LLM summarises the result, sanity-checking numbers, currencies and totals.
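
To make steps 3 to 5 concrete, here is a hedged sketch of a QuerySpec for "top-5 most profitable branches last month" and the parameterised, read-only SQL a synthesizer might compile it into. The table, column and metric names are illustrative assumptions, not a real schema:

    # Step 3 (Plan): the LLM emits structure, not SQL. All names are illustrative.
    query_spec = {
        "entity": "Branch",
        "metrics": ["branch_profitability"],      # resolved via MetricDefinition
        "dimensions": ["branch_name"],
        "filters": [{"column": "booking_date", "op": "between",
                     "value": ["2025-05-01", "2025-05-31"]}],
        "order_by": [{"metric": "branch_profitability", "direction": "desc"}],
        "limit": 5,
    }

    # Step 5 (Synthesize): the synthesizer, not the LLM, writes the SQL.
    sql = """
    SELECT b.branch_name, SUM(f.profit_amount) AS branch_profitability
    FROM dbo.branch_transactions f
    JOIN dbo.branches b ON b.branch_id = f.branch_id
    WHERE f.booking_date BETWEEN :date_from AND :date_to
    GROUP BY b.branch_name
    ORDER BY branch_profitability DESC
    LIMIT 5;
    """
    params = {"date_from": "2025-05-01", "date_to": "2025-05-31"}
    # Step 6 (Execute): run read-only in the sandbox; masking and narration follow.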

Why direct SQL generation fails

LLMs produce syntactically correct SQL easily — but they do not know which table holds "branch", how "profitability" is computed, or which column is PII. Direct SQL emission produces three classes of errors:

  • Wrong JOIN: two similarly-named tables, the wrong foreign key.
  • Wrong metric: using net_amount instead of profit_amount for "profitability".
  • PII leak: including the customer national-ID column in the result.

The QuerySpec intermediate layer acts as a structural gate against these errors. The LLM says what it wants; the synthesizer decides how the SQL is written.
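
A minimal sketch of such a validation gate, assuming the QuerySpec dictionary shape used above; the check list and function name are illustrative:

    def validate_query_spec(spec: dict, allowed_columns: set[str],
                            pii_columns: set[str], known_metrics: set[str]) -> list[str]:
        """Return a list of violations; an empty list means the spec may be synthesized."""
        violations = []

        # Every requested metric must exist in the MetricDefinition catalog.
        for metric in spec.get("metrics", []):
            if metric not in known_metrics:
                violations.append(f"unknown metric: {metric}")

        # Dimensions and filter columns must be on the allow-list...
        requested = set(spec.get("dimensions", []))
        requested |= {f["column"] for f in spec.get("filters", [])}
        for col in requested - allowed_columns:
            violations.append(f"column not on allow-list: {col}")

        # ...and must never include PII columns.
        for col in requested & pii_columns:
            violations.append(f"PII column requested: {col}")

        return violations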

Certified few-shots

Few-shot examples behave as the semantic layer’s live memory. A good banking starter set:

  • "Top-5 most profitable branches last month" → QuerySpec → SQL.
  • "Personal-loan approval rate last week" → QuerySpec → SQL.
  • "PEP-segment portfolio volume YTD" → QuerySpec → SQL.

These are kept per language (TR / EN / AZ / RU); the LLM receives few-shots in the language of the question, so language drift (TR question, EN answer) is prevented.
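
A sketch of how language-aware selection might work, assuming each certified example is stored with its language tag; the store layout and the similarity callable are illustrative:

    def select_few_shots(question: str, language: str, examples: list[dict],
                         similarity, k: int = 3) -> list[dict]:
        # Only certified examples in the language of the question are eligible,
        # which is what prevents TR-question / EN-answer drift.
        candidates = [ex for ex in examples
                      if ex["language"] == language and ex["certified"]]
        # Rank by semantic similarity to the incoming question
        # (a vector search over embeddings in practice).
        ranked = sorted(candidates,
                        key=lambda ex: similarity(question, ex["question"]),
                        reverse=True)
        return ranked[:k]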

Banking domain pack

To answer questions like "in which branches are VIP customers concentrated" correctly, the institution needs its own term set. A domain pack contains the BusinessTerm glossary, the MetricDefinition set and the certified few-shots. The banking-tr and banking-en packs share a common base; each bank enriches it with its own terminology.
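
One way such a pack could be laid out, sketched as plain data; the pack name, keys and entries are illustrative assumptions:

    banking_tr_pack = {
        "extends": "banking-core",        # shared base pack
        "language": "tr",
        "business_terms": [
            {"term": "PEP", "definition": "politically exposed person"},
            {"term": "VIP segment", "definition": "customers above the private-banking threshold"},
        ],
        "metric_definitions": ["branch_profitability", "loan_approval_rate"],
        "certified_few_shots": ["top5_profitable_branches", "weekly_loan_approval_rate"],
    }
    # Each bank extends the shared base with its own terminology and certified examples.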

Operational discipline

The NL → SQL pipeline becomes operational only with three control points:

  • Suggestion inbox: when the LLM proposes a new QuerySpec or metric, admin approves.
  • PromptAuditLog: every stage is written to the audit trail; retention is 90 days under KVKK or 7 years under RegulatedFinance.
  • Confidence threshold: low-confidence questions are auto-routed to a human analyst.
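
A hedged sketch of the confidence-threshold control point, assuming a confidence score is returned with each generated QuerySpec; the threshold value and function names are illustrative:

    CONFIDENCE_THRESHOLD = 0.75  # illustrative value, tuned per deployment

    def route(question: str, spec: dict, confidence: float,
              audit_log: list, analyst_queue: list, execute):
        # Every decision is written to the audit trail (PromptAuditLog), whatever the outcome.
        audit_log.append({"question": question, "spec": spec, "confidence": confidence})

        if confidence < CONFIDENCE_THRESHOLD:
            # Low-confidence questions are handed to a human analyst instead of auto-running.
            analyst_queue.append({"question": question, "spec": spec})
            return {"status": "routed_to_analyst"}

        return execute(spec)  # parameterised, read-only execution in the sandbox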

Closing

A BI Copilot succeeds because of a good semantic layer, not because of a fancy LLM. Spend 70% of the investment on table and metric definitions, 20% on few-shots, 10% on model choice. With that ratio in place, natural-language questioning becomes a durable analytics surface that finally removes the IT bottleneck.
