CentraQL KPI Anomaly Detection: A Threshold + Z-Score Hybrid

Watching 200 banking KPIs with rules alone, or with ML alone, does not hold up. This is the CentraQL hybrid approach.

BIART Team

Figure: KPI anomaly chart and threshold bands

A typical mid-to-large bank has 150-250 KPIs to watch: call-centre response time, fraud notification rate, card activation speed, credit approval latency, mobile-app error rate, ATM cash-fill ratio, and so on. Every one of those KPIs needs a "down", "spike" or "drift" alert delivered to the right person at the right time. Wrong alerts exhaust the team; late alerts cost money.

Why the two pure approaches fail

Pure threshold: alert when KPI < X or > Y. Fast, explainable; but blind to seasonality and trend. The pre-holiday card-usage spike fires "anomaly" every year at the same hour.

Pure ML: time-series anomaly detection (Prophet, isolation forest, LSTM). Captures seasonality, but it is hard to explain, suffers from cold start on new KPIs, and the model itself needs maintenance.

CentraQL combines both.

The hybrid model

    for kpi in active_kpis:
        val = current_value(kpi)

        # Layer 1: contractual limit, always critical
        if hard_threshold_violated(kpi, val):
            emit(severity="critical", reason="hard threshold")
            continue

        # Layer 2: statistical deviation over a 28-day rolling window
        z = rolling_z_score(kpi, window="28d")
        if abs(z) > kpi.z_threshold:
            # Layer 3: confirm the deviation against the seasonal baseline
            baseline_pred = seasonal_baseline(kpi)
            if outside_band(val, baseline_pred, kpi.band):
                emit(severity="warning", reason="z-score+seasonal")

Three layers: hard threshold, z-score, seasonal baseline.

1. Hard threshold (rule)

Contractual limits live here. Examples: credit-approval p95 latency > 5 s; hourly fraud-notification rate > 0.8%. The analyst writes the rule, the owner approves, an SLA is attached.
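A hard-threshold rule can be sketched as a small record plus a bounds check. This is a minimal illustration, not CentraQL's rule format: the `HardThreshold` class, its field names, the KPI identifiers and `check_hard_thresholds` are all hypothetical. Only the two limits (5 s and 0.8%) come from the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HardThreshold:
    """A contractual limit on one KPI (hypothetical structure)."""
    kpi: str
    lower: Optional[float] = None  # violated when value < lower
    upper: Optional[float] = None  # violated when value > upper
    owner: str = "unassigned"      # who approved the rule

    def violated(self, value: float) -> bool:
        if self.lower is not None and value < self.lower:
            return True
        if self.upper is not None and value > self.upper:
            return True
        return False


# The two example limits from the text; the KPI names are made up.
RULES = [
    HardThreshold("credit_approval_p95_latency_s", upper=5.0),
    HardThreshold("fraud_notification_rate_hourly_pct", upper=0.8),
]


def check_hard_thresholds(values: dict) -> list:
    """Return the KPIs whose current value breaks a hard limit."""
    return [
        rule.kpi
        for rule in RULES
        if rule.kpi in values and rule.violated(values[rule.kpi])
    ]
```

Because the check is a plain comparison, the reason for a critical alert can be stated in one line: the value and the limit it crossed.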

2. Z-score (statistical)

Mean and std are computed over a 28-day rolling window. If |z| of the current value exceeds the threshold (typically 3.0), a signal fires. Z-score is explainable: "3.4 sigma away from the 28-day mean" is a sentence every CFO understands.
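The z-score step can be sketched with the standard library alone. The sketch assumes one observation per day (so the window is the last 28 values), uses the sample standard deviation, and returns 0.0 when there is too little history or no variance. Those choices and the function signature are assumptions, not documented CentraQL behaviour.

```python
from statistics import mean, stdev


def rolling_z_score(history, current, window=28):
    """z-score of `current` against the last `window` observations.

    `history` is an ordered list of past values, oldest first.
    """
    recent = history[-window:]
    if len(recent) < 2:
        return 0.0  # cold start: not enough data to judge (assumed policy)
    mu = mean(recent)
    sigma = stdev(recent)  # sample standard deviation
    if sigma == 0:
        return 0.0  # flat series: deviation is undefined, stay silent
    return (current - mu) / sigma


# 28 days alternating between 48 and 52, then a reading of 57.
history = [48, 52] * 14
z = rolling_z_score(history, 57)
fires = abs(z) > 3.0
```

With this history the mean is 50 and the reading of 57 lands about 3.4 sigma above it, so the default threshold of 3.0 is exceeded.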

3. Seasonal baseline (CentraQL specific)

Z-score alone misses holidays, weekends and the intraday rhythm. The seasonal baseline builds a band around the 8-week mean for the same hour-of-week. A signal fires only when both the z-score and the seasonal check are off; a value that fails just one of the two does not raise an alarm. This typically cuts false positives by 3-5×.
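The AND gate between the z-score and the seasonal band can be sketched as follows. The text only says the band is built from the 8-week mean for the same hour-of-week; the band width used here (mean ± k sample standard deviations, with k = 2) and both function names are assumptions made for illustration.

```python
from statistics import mean, stdev


def seasonal_band(same_slot_values, k=2.0):
    """Band around the mean of the same hour-of-week in previous weeks.

    `same_slot_values` holds one value per week (8 in the text).
    The width of k standard deviations is an assumed choice.
    """
    mu = mean(same_slot_values)
    sigma = stdev(same_slot_values)
    return mu - k * sigma, mu + k * sigma


def is_anomaly(z, value, same_slot_values, z_threshold=3.0, k=2.0):
    """Fire only when the z-score AND the seasonal check are both off."""
    if abs(z) <= z_threshold:
        return False  # statistically unremarkable
    low, high = seasonal_band(same_slot_values, k)
    return value < low or value > high  # also outside the usual rhythm


# Eight previous weeks at the same hour-of-week, averaging 52.
last_8_weeks = [52, 48, 56, 50, 54, 52, 49, 55]

drop = is_anomaly(z=-3.6, value=38, same_slot_values=last_8_weeks)
usual_peak = is_anomaly(z=3.2, value=55, same_slot_values=last_8_weeks)
```

Here `drop` is flagged because 38 is both far from the rolling mean and below the band, while `usual_peak` is suppressed: the z-score is high, but 55 is normal for that hour-of-week.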

Threshold configuration

Thresholds are layered, with KPI-specific, profile-specific and domain-pack-specific overrides on top of a system default:

  • System default: |z| > 3.0
  • KPI fraud_rate_hourly override: |z| > 2.5 (more sensitive)
  • ComplianceProfile RegulatedFinance tightens: |z| > 2.0, AND-ed with the seasonal check

This turns threshold debates into policy decisions and avoids arguing over each KPI individually.
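One way to resolve the layers is to collect every threshold that applies and keep the strictest. The text does not state the precedence rule, so "tightest wins" is an assumption here, as are the dictionary layout and the function name; only the three numeric values come from the list above.

```python
SYSTEM_DEFAULT_Z = 3.0

# Values from the text; the lookup structure is hypothetical.
KPI_OVERRIDES = {"fraud_rate_hourly": 2.5}
PROFILE_OVERRIDES = {"RegulatedFinance": 2.0}


def resolve_z_threshold(kpi, profile=None):
    """Return the effective |z| threshold for a KPI under a profile.

    Assumed rule: the smallest (most sensitive) applicable value wins.
    """
    candidates = [SYSTEM_DEFAULT_Z]
    if kpi in KPI_OVERRIDES:
        candidates.append(KPI_OVERRIDES[kpi])
    if profile in PROFILE_OVERRIDES:
        candidates.append(PROFILE_OVERRIDES[profile])
    return min(candidates)
```

Under this rule a compliance profile can only make detection more sensitive, never less, which matches the word "tightens" in the list.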

Anomaly explanation

When CentraQL fires, the Copilot pipeline produces the explanation via the narrator LLM: "At 14:00 the ATM cash-fill ratio was 38%; the last 8 weeks at this hour averaged 52% ± 4. Signal z=-3.6, outside the seasonal band. Last Tuesday at 14:00 the value was 50%." The explanation is written to PromptAuditLog; the team does not have to chase the cause of the alert.
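The numbers in that sentence are all available at detection time, so they can be gathered into one record before any narration happens. The sketch below is not the Copilot pipeline or the PromptAuditLog schema: the function names and field names are hypothetical. It shows only how the facts from the example could be collected and rendered with a fixed template.

```python
def build_alert_facts(kpi_label, hour, value, slot_mean, slot_spread, z,
                      last_week_value):
    """Collect the numbers a narrator would need (hypothetical fields)."""
    return {
        "kpi": kpi_label,
        "hour": hour,
        "value": value,
        "slot_mean": slot_mean,
        "slot_spread": slot_spread,
        "z": z,
        "last_week_value": last_week_value,
    }


def render_plain(facts):
    """Deterministic template built only from the collected facts."""
    return (
        f"At {facts['hour']} the {facts['kpi']} was {facts['value']}%; "
        f"the last 8 weeks at this hour averaged {facts['slot_mean']}% "
        f"± {facts['slot_spread']}. Signal z={facts['z']:.1f}."
    )


# The figures from the example explanation in the text.
facts = build_alert_facts("ATM cash-fill ratio", "14:00", 38, 52, 4, -3.6, 50)
message = render_plain(facts)
```

Keeping the facts in a structured record means the narrated sentence can be checked against the values that triggered the alert.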

Operational result

In a bank pilot covering 180 KPIs over a month:

  • Pure threshold: 240 alerts, 72% false positives.
  • Pure z-score: 380 alerts, 58% false positives.
  • CentraQL hybrid: 110 alerts, 18% false positives.
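The percentages are easier to compare as absolute counts. The short calculation below assumes each false-positive percentage is the share of that method's fired alerts; the variable and function names are illustrative.

```python
# (alerts fired, false-positive share) per method, from the pilot figures.
PILOT = {
    "pure threshold": (240, 0.72),
    "pure z-score": (380, 0.58),
    "hybrid": (110, 0.18),
}


def false_alert_counts(results):
    """Approximate number of false alerts per method, rounded."""
    return {
        name: round(alerts * fp_share)
        for name, (alerts, fp_share) in results.items()
    }


counts = false_alert_counts(PILOT)
```

Under that assumption the team handled roughly 173 false alerts with pure thresholds, 220 with pure z-score and 20 with the hybrid over the month.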

Fewer false positives = less alert fatigue = ~4× faster reaction time on real incidents.

Conclusion

KPI anomaly detection is neither a pure-rules problem nor a pure-ML problem. Hard thresholds enforce the contract, z-score catches deviation, the seasonal baseline catches rhythm. CentraQL fuses the three, has the Copilot narrate every alert, and writes the result to audit. To start with the 200-KPI watchlist of a bank: ~1 day to add the domain pack, ~3 days of owner tuning, then automatic.