What · How · Why · Where · Importance
What
Banking compliance is adherence to laws, regulations, and standards set by regulators and standard setters (RBI, the Federal Reserve, ECB, BIS) to protect customers, ensure financial stability, and prevent financial crime.
How
Through policies, controls, automated monitoring, audit trails, data lineage, and regular reporting to regulators. Technology layers (RBAC, masking, retention) enforce controls programmatically.
Why
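Non-compliance results in large fines (GDPR: up to 4% of global annual turnover or €20 million, whichever is higher), license revocation, reputational damage, and systemic risk. The 2008 crisis accelerated regulation because banks could not aggregate their risk exposures quickly or accurately.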
Where
Every layer: customer onboarding (KYC/AML), transaction processing (fraud detection), risk reporting (Basel), data storage (GDPR/DPDP), and AI use (model risk management).
Importance
Compliance is not optional — it is the license to operate. Banks spend 10–15% of revenue on compliance. Good data architecture makes compliance cheaper and more accurate.
The Regulatory Framework
Major regulations that directly drive data architecture requirements in banking.
🏴 GDPR / DPDP Act
What: EU (GDPR) and India (DPDP Act) data privacy laws. Requires: Purpose limitation, data minimisation, right to erasure, consent tracking. Data Impact: Every PII field must be tagged, masked for non-authorised users, and deleted on request. Lineage proves erasure was complete.
🏰 Basel III / IV
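What: Capital adequacy framework from the Basel Committee on Banking Supervision (hosted by the BIS). Requires: Accurate credit/market/operational risk data aggregated group-wide. Data Impact: Unified data model for exposure calculations. BCBS 239 (principles for risk data aggregation and reporting) expects figures to be traceable from source to risk report, which in practice means data lineage.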
🕵 AML / CFT
What: Anti-Money Laundering / Counter Financing of Terrorism. Requires: Transaction monitoring, suspicious activity reporting (SAR), correspondent banking due diligence. Data Impact: Real-time transaction graph analysis; entity resolution across accounts.
👥 KYC — Know Your Customer
What: Customer identity verification at onboarding and ongoing. Requires: Document verification, PEP/sanctions screening, risk classification. Data Impact: Customer master data must be linked across all accounts; golden record management.
📈 IFRS 9
What: International accounting standard for financial instruments. Requires: Expected Credit Loss (ECL) provisioning based on forward-looking models. Data Impact: Historical loan performance data, PD/LGD/EAD models, audit trail of model assumptions.
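The PD/LGD/EAD inputs combine into a loss figure. A minimal sketch, assuming the common simplified single-period form ECL = PD × LGD × EAD with no discounting or staging; the function name and the sample loan are illustrative and not taken from the demo.

```python
def expected_credit_loss(pd: float, lgd: float, ead: float) -> float:
    """Simplified single-period ECL.

    pd  -- probability of default (0..1)
    lgd -- loss given default, as a fraction of exposure (0..1)
    ead -- exposure at default, in currency units
    """
    if not 0.0 <= pd <= 1.0 or not 0.0 <= lgd <= 1.0:
        raise ValueError("pd and lgd must be between 0 and 1")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd * lgd * ead


# Hypothetical loan: 2% default probability, 45% loss severity, 100,000 exposure.
ecl = expected_credit_loss(pd=0.02, lgd=0.45, ead=100_000)
print(round(ecl, 2))
```

A production IFRS 9 calculation would also handle staging, lifetime horizons, discounting, and forward-looking scenarios; the sketch only shows how the three inputs relate.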
🆕 SR 11-7 / Model Risk Management
What: US Federal Reserve and OCC supervisory guidance on model risk management (issued 2011), which banks also apply to AI/ML models. Requires: Model inventory, validation, performance monitoring, explainability. Data Impact: Training data lineage, model output documentation, drift detection pipelines.
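One way a drift detection pipeline can compare training data with live data is the Population Stability Index (PSI). This is a sketch of a commonly used technique, not something SR 11-7 prescribes or the demo is known to use; the bucket proportions below are made up for illustration.

```python
import math


def population_stability_index(expected, actual, eps=1e-6):
    """PSI over pre-bucketed proportions: sum((a - e) * ln(a / e)).

    expected -- bucket proportions in the training (reference) data
    actual   -- bucket proportions in the current scoring data
    eps      -- floor applied to empty buckets to avoid log(0)
    """
    if len(expected) != len(actual):
        raise ValueError("both distributions need the same number of buckets")
    psi = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)
        a = max(a, eps)
        psi += (a - e) * math.log(a / e)
    return psi


# Hypothetical score distribution across four buckets.
training = [0.25, 0.25, 0.25, 0.25]
current = [0.10, 0.20, 0.30, 0.40]

print(population_stability_index(training, training))  # identical inputs give 0.0
print(population_stability_index(training, current))   # a shift gives a positive value
```

A larger PSI means the current population has moved further from the training population; the alert threshold is a policy choice for the model risk team.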
🔓 How Data Architecture Enables Compliance
- Data Lineage: Proves to auditors exactly where every number in a regulatory report came from, end-to-end.
- Data Masking: Dynamically masks PII (Aadhaar, PAN, account number) for users without need-to-know, enforcing GDPR/DPDP at the data layer.
- Retention Policies: Automated deletion or archival when data reaches its retention limit; supports right to erasure and audit log pruning.
- Audit Trails: Immutable log of who accessed what data, when, and why. Required for GDPR breach investigation and AML SARs.
- Data Contracts: Schema and quality guarantees that ensure regulatory reports are built on validated, fresh data.
- Ontologies: Formally defined relationships (Customer → Account → Transaction) that power AML graph analytics and KYC entity resolution.
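As an illustration of the data masking control in the list above, here is a minimal sketch of role-based masking. The field tags, the role name, and the "show last four characters" rule are assumptions for the example, not the demo's actual configuration.

```python
# Assumed PII tags and authorised roles; a real system would read these
# from a catalogue or policy store rather than hard-coding them.
PII_FIELDS = {"aadhaar", "pan", "account_number"}
AUTHORISED_ROLES = {"compliance_officer"}


def mask_value(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with 'X'."""
    if len(value) <= visible:
        return "X" * len(value)
    return "X" * (len(value) - visible) + value[-visible:]


def mask_record(record: dict, role: str) -> dict:
    """Return a copy of the record with PII masked unless the role is authorised."""
    if role in AUTHORISED_ROLES:
        return dict(record)
    return {
        key: mask_value(str(value)) if key in PII_FIELDS else value
        for key, value in record.items()
    }


customer = {
    "customer_id": "C001",
    "pan": "ABCDE1234F",
    "account_number": "001234567890",
    "branch_code": "BLR01",
}

masked = mask_record(customer, role="analyst")
unmasked = mask_record(customer, role="compliance_officer")
print(masked)
```

The analyst sees only the last four characters of each tagged field, while the authorised role receives the record unchanged. Applying the rule at read time keeps the stored data intact for users who do have need-to-know.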
Non-Performing Assets & Portfolio at Risk
Two metrics that drive regulatory capital requirements and provisioning decisions.
📉 Non-Performing Asset (NPA)
A loan is classified as NPA when principal or interest is overdue for more than 90 days. RBI mandates NPA classification, provisioning ratios, and disclosure. In the demo, loans with overdue_days > 90 are flagged; the dbt overdue_flag column feeds into regulatory reports.
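The 90-day rule can be sketched in a few lines. The demo implements the flag in dbt; the Python below only mirrors the overdue_days > 90 logic, and the loan records, the outstanding_balance field, and the gross NPA ratio calculation are illustrative assumptions.

```python
NPA_THRESHOLD_DAYS = 90  # overdue for more than 90 days, per the rule above


def flag_npa(loans):
    """Return a copy of each loan with overdue_flag set when overdue_days > 90."""
    return [
        {**loan, "overdue_flag": loan["overdue_days"] > NPA_THRESHOLD_DAYS}
        for loan in loans
    ]


# Hypothetical loan book; note that exactly 90 days is not yet an NPA.
loans = [
    {"loan_id": "L001", "overdue_days": 0, "outstanding_balance": 50_000},
    {"loan_id": "L002", "overdue_days": 90, "outstanding_balance": 30_000},
    {"loan_id": "L003", "overdue_days": 91, "outstanding_balance": 20_000},
]

flagged = flag_npa(loans)
npa_ids = [loan["loan_id"] for loan in flagged if loan["overdue_flag"]]

# Gross NPA ratio: NPA balance as a share of the total outstanding balance.
npa_balance = sum(l["outstanding_balance"] for l in flagged if l["overdue_flag"])
total_balance = sum(l["outstanding_balance"] for l in flagged)
gross_npa_ratio = npa_balance / total_balance

print(npa_ids, gross_npa_ratio)
```

Only L003 is flagged here, which makes the strict "more than 90" boundary explicit.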
📊 Portfolio at Risk (PAR)
PAR = outstanding balance of loans with any overdue payment / total outstanding loan portfolio. It is often reported with a days-overdue threshold, for example PAR30 for loans more than 30 days overdue. A leading indicator of credit stress used by microfinance regulators and retail banks. The demo computes PAR in the REPORT_SUMMARY table and surfaces it via the AI agent.
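A worked example of the PAR ratio, as a sketch only. The field names and the overdue_more_than parameter are assumptions for illustration; how the demo's REPORT_SUMMARY table computes the figure is not shown in this section.

```python
def portfolio_at_risk(loans, overdue_more_than=0):
    """Share of outstanding balance on loans overdue by more than N days.

    overdue_more_than=0 counts any overdue loan; 30 gives the PAR30 variant.
    """
    total = sum(loan["outstanding_balance"] for loan in loans)
    if total == 0:
        return 0.0  # empty portfolio: nothing is at risk
    at_risk = sum(
        loan["outstanding_balance"]
        for loan in loans
        if loan["overdue_days"] > overdue_more_than
    )
    return at_risk / total


# Hypothetical portfolio of three loans.
loans = [
    {"loan_id": "L001", "overdue_days": 0, "outstanding_balance": 50_000},
    {"loan_id": "L002", "overdue_days": 15, "outstanding_balance": 30_000},
    {"loan_id": "L003", "overdue_days": 120, "outstanding_balance": 20_000},
]

par_any = portfolio_at_risk(loans)                        # L002 and L003 count
par_30 = portfolio_at_risk(loans, overdue_more_than=30)   # only L003 counts
print(par_any, par_30)
```

PAR uses the full outstanding balance of an overdue loan, not just the missed instalment, which is why it moves earlier than NPA figures.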
Compliance Controls Embedded in the Stack
The demo's RDFLib ontology annotates PII fields with GDPR sensitivity tags.
The YAML data contracts enforce freshness SLAs — a stale loan book would fail the contract and block the risk report pipeline.
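A freshness check of this kind might look like the sketch below. The contract keys, the 24-hour SLA, and the ContractViolation exception are hypothetical; the dictionary stands in for a contract already parsed from YAML.

```python
from datetime import datetime, timedelta, timezone

# Stand-in for a parsed YAML contract; keys and values are assumptions.
contract = {"dataset": "loan_book", "freshness_sla_hours": 24}


class ContractViolation(Exception):
    """Raised when a dataset breaks a guarantee in its data contract."""


def check_freshness(contract, last_loaded_at, now=None):
    """Raise ContractViolation if the dataset is older than its freshness SLA."""
    now = now or datetime.now(timezone.utc)
    age = now - last_loaded_at
    limit = timedelta(hours=contract["freshness_sla_hours"])
    if age > limit:
        raise ContractViolation(
            f"{contract['dataset']} is {age} old, limit is {limit}"
        )


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)

# Loaded 12 hours ago: within the SLA, so the call returns without raising.
check_freshness(contract, datetime(2024, 1, 2, 0, 0, tzinfo=timezone.utc), now=now)

# Loaded 36 hours ago: the check raises and the pipeline step is blocked.
try:
    check_freshness(contract, datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc), now=now)
    report_blocked = False
except ContractViolation as exc:
    report_blocked = True
    print(f"Risk report blocked: {exc}")
```

Raising an exception, instead of logging a warning, is what lets an orchestrator stop the downstream risk report from running on stale data.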
The dbt customer_360 model computes overdue_flag and total_exposure — the inputs regulators need for Basel capital calculations.
Great Expectations catches the intentionally injected data quality issues (40 null branch_code values, 3 negative balances) that would corrupt a regulatory submission.
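The two checks can be expressed in plain Python to show what is being tested. This is deliberately not the Great Expectations API; the row layout, the account_id and balance field names, and the sample values are assumptions for illustration.

```python
def find_quality_issues(rows):
    """Return the account ids that fail each check, keyed by check name."""
    null_branch = [
        row["account_id"] for row in rows if row.get("branch_code") in (None, "")
    ]
    negative_balance = [row["account_id"] for row in rows if row["balance"] < 0]
    return {"null_branch_code": null_branch, "negative_balance": negative_balance}


# Hypothetical sample with one issue of each kind injected.
rows = [
    {"account_id": "A1", "branch_code": "BLR01", "balance": 1200.0},
    {"account_id": "A2", "branch_code": None, "balance": 300.0},
    {"account_id": "A3", "branch_code": "DEL02", "balance": -50.0},
]

issues = find_quality_issues(rows)

# Block the submission if any check reports failing rows.
submission_allowed = not any(issues.values())
print(issues, submission_allowed)
```

Returning the failing ids instead of a bare pass/fail gives the data steward something to act on before the report is rerun.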