CentraQL

CentraQL LoRA Fine-Tune: Adapting to Banking Language in 4 Weeks

A generic 7B model often confuses 'tahsis' with 'allocation' instead of 'underwriting'. The practical plan to teach a model the bank's vocabulary with LoRA.

BIART Ekibi3 min read2 views
LoRA fine-tune banka veri merkezi görseli

When a bank analyst asks the CentraQL Copilot for "the underwriting cycle of the last 30 days", the planner LLM must map the Turkish word *tahsis* to the cluster *underwriting + approval + disbursement* before producing the QuerySpec. A generic 7B Qwen or Llama gets it right roughly half the time; the other half it interprets *tahsis* as *allocation* and drifts off-domain.

The fix is a small LoRA fine-tune over a banking domain pack. Not a full fine-tune — a parameter-efficient adapter. Cost and time controlled.

Why LoRA?

A full 7B fine-tune needs 14 GB+ VRAM and days of training. LoRA (Low-Rank Adaptation) attaches low-rank updates to selected matrix pairs only. Result:

  • Adapter file size: ~50-200 MB (vs. 4-14 GB base model).
  • Training time: ~2-4 hours on a single RTX 4090 against Qwen2.5-7B.
  • Reversible: removing the adapter restores the original model.
  • Multi-adapter: banking-tr and banking-en can sit side by side over the same base.

Dataset structure (banking-tr example)

Three layers:

  1. Lexicon (1500 rows): domain terms and synonyms. *tahsis* → underwriting/approval/disbursement, *mevduat* → deposit, *KKB* → Credit Bureau Turkey.
  2. Question → QuerySpec (800 examples): real bank scenarios prepared as certified few-shots. Each example pairs a question with the expected QuerySpec JSON and the expected narrative.
  3. Bad examples to reject (200): prompt-injection, off-domain, unauthorised-field requests. The model is trained to respond "reject + reason".

4-week plan

Week 1 — data collection:

  • Sweep the last 6 months of IT tickets; extract 200-300 recurring analyst questions.
  • 8-10 senior analysts convert questions to clean QuerySpec.
  • Dataset version is committed and signed by the owner.

Week 2 — domain pack:

  • Build the 1500-row lexicon and the 200 rejection examples.
  • Stamp every few-shot as "certified" in PromptAuditLog format.
  • Hold out 20% as eval set.

Week 3 — training:

  • Qwen2.5-7B-Instruct + LoRA r=16, alpha=32, targets q_proj, v_proj, k_proj, o_proj.
  • 3 epochs, lr 2e-4, cosine scheduler.
  • 2-3 h on a single RTX 4090. (q4 quantization for vLLM serving adds ~30 min.)
  • Eval target: QuerySpec exact-match > 85%, narrative semantic-match > 90%.

Week 4 — pilot and calibration:

  • 2-week live pilot with 5-10 analysts.
  • Wrong QuerySpecs are folded back into the dataset; a second LoRA pass runs.
  • Production decision: end-of-pilot accuracy + analyst NPS.

Typical outcome

In a real CentraQL deployment, before/after LoRA:

  • QuerySpec exact-match: 62% → 89%.
  • Narrative semantic-match: 78% → 94%.
  • Out-of-domain rejection: 43% → 88%.
  • p95 latency unchanged (LoRA inference overhead < 5%).

Important detail: model version in PromptAuditLog

When the LoRA adapter is updated, model_version in the audit record must change. This is mandatory for re-evaluating older answers. CentraQL writes both base and adapter into every audit row, so a regulator asking "which model produced this query in October?" gets a provable answer.

On-prem and leak control

LoRA training stays inside the organisation. No training data goes to cloud services — analyst questions tend to contain real customer segments. The CentraQL training scripts never reach Anthropic Cloud; EgressGuard is hard-on.

Conclusion

LoRA is the practical form of "train your own LLM" for a bank. Four weeks, one RTX 4090 and a curated 1000-example dataset bring a generic 7B model up to bank-specific accuracy. CentraQL packages the process with dataset templates, training scripts and audit-aware adapter management.

Share