Every Concept Has a Running Implementation
This page is the bridge between theory and practice. Click any concept to see its detailed explanation.
| Concept | Tool in Demo | Files | Learn More |
|---|---|---|---|
| Legacy Data Sources | PostgreSQL, MySQL, SQLite (Docker) | docker-compose.yml | Architecture Governance |
| Mock Banking Data | Python Faker — 1 000 customers, 10 000 txns | mock_data/generate_data.py | Compliance |
| Data Catalog & Lineage | OpenMetadata (Docker) | openmetadata/connectors/ | Architecture Governance |
| Data Quality Profiling | Great Expectations | great_expectations/ | Data Contracts |
| Data Contracts | YAML + validate_contracts.py | contracts/*.yml | Data Contracts |
| Semantic Model | dbt — staging + customer_360 mart | dbt_project/ | Semantic Layer |
| Knowledge Graph | Neo4j (Docker) + Python loader | knowledge_graph/load_graph.py | Knowledge Graph |
| Ontology | RDFLib (Python) — OWL class hierarchy | ontology/banking_ontology.py | Semantic Layer |
| AI Agent | Ollama (Llama3/Mistral) + Neo4j Cypher | agent/agent.py | AI Agents |
From Zero to Working Demo
Run these commands in order. Each maps to a concept layer in the architecture.
① Start Infrastructure
Spin up all databases + OpenMetadata + Neo4j:
cd banking-demo
docker compose up -d
# Wait ~60s for all services to be healthy
docker compose ps
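The fixed 60-second wait can be replaced by polling until a service port accepts connections. The sketch below is one way to do that with the standard library. `wait_for_port` is a hypothetical helper, not part of the demo repo, and an open TCP port only shows that the container is listening, not that the service has finished initialising.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 60.0,
                  interval_s: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError while nothing is listening yet
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Example call (7474 is the Neo4j browser port used later on this page;
# take the other services' ports from your own docker-compose.yml):
#     wait_for_port("localhost", 7474, timeout_s=60)
```

`docker compose ps` remains the authoritative check, since it reports the health status each container declares.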
② Generate Mock Banking Data
Creates 1 000 customers, 1 000 accounts, 500 loans, 10 000 transactions across all three databases. Idempotent — safe to run twice.
pip install -r requirements.txt
python mock_data/generate_data.py
# Intentional issues injected: 40 null branch_codes, 3 negative balances
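To illustrate what "idempotent" means for a generator like this, here is a minimal sketch that uses only the standard library instead of Faker. It combines a fixed random seed with primary-key upserts, so a second run rewrites the same rows rather than adding new ones. The table layout, the `seed_accounts` name, and the `BR_` branch code format are assumptions for illustration; the real logic lives in `mock_data/generate_data.py`.

```python
import random
import sqlite3


def seed_accounts(conn: sqlite3.Connection, n: int = 1000, seed: int = 42) -> None:
    """Insert n savings accounts; running it again leaves the table unchanged."""
    rng = random.Random(seed)  # fixed seed: identical rows on every run
    conn.execute(
        """CREATE TABLE IF NOT EXISTS acct_master (
               acct_id     TEXT PRIMARY KEY,
               cust_id     TEXT NOT NULL,
               branch_code TEXT,
               bal_amt     REAL
           )"""
    )
    # Pick which rows carry the intentional defects (40 nulls, 3 negatives).
    null_branch = set(rng.sample(range(1, n + 1), 40))
    remaining = [i for i in range(1, n + 1) if i not in null_branch]
    negative = set(rng.sample(remaining, 3))

    rows = []
    for i in range(1, n + 1):
        branch = None if i in null_branch else f"BR_{rng.randint(1, 20):03d}"
        bal = round(rng.uniform(100, 500_000), 2)
        if i in negative:
            bal = -bal
        rows.append((f"SAV_{i:04d}", f"CUST_{i:04d}", branch, bal))

    # The primary key plus INSERT OR REPLACE makes a second run an overwrite.
    conn.executemany("INSERT OR REPLACE INTO acct_master VALUES (?, ?, ?, ?)", rows)
    conn.commit()
```

Calling `seed_accounts(conn)` twice on the same connection leaves the row count at `n`. That is the property the "safe to run twice" claim relies on.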
③ Run Data Contracts — WOW MOMENT 1
Watch the validator catch the injected quality issues before they reach downstream consumers.
python contracts/validate_contracts.py
# Expected output:
# [FAIL] savings_account — quality_rule: no_nulls on branch_code (40 violations)
# [FAIL] savings_account — quality_rule: value_range on bal_amt (3 negative values)
# [PASS] loan_position — all rules passed
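The two failing rules can be pictured with a small validator sketch. It assumes a contract has already been parsed from YAML into a dictionary with a `quality_rules` list. That schema, the `RuleResult` class, and the output format are hypothetical and may differ from what `validate_contracts.py` actually does.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class RuleResult:
    contract: str
    rule: str
    column: str
    violations: int

    def line(self) -> str:
        status = "PASS" if self.violations == 0 else "FAIL"
        return (f"[{status}] {self.contract}: {self.rule} on "
                f"{self.column} ({self.violations} violations)")


def validate(contract: dict[str, Any], rows: list[dict[str, Any]]) -> list[RuleResult]:
    """Count violations of each quality rule over the given rows."""
    results = []
    for rule in contract["quality_rules"]:
        column = rule["column"]
        values = [row.get(column) for row in rows]
        if rule["type"] == "no_nulls":
            bad = sum(v is None for v in values)
        elif rule["type"] == "value_range":
            lo, hi = rule.get("min"), rule.get("max")
            # Nulls are left to the no_nulls rule and not counted here.
            bad = sum(
                v is not None
                and ((lo is not None and v < lo) or (hi is not None and v > hi))
                for v in values
            )
        else:
            raise ValueError(f"unknown rule type: {rule['type']}")
        results.append(RuleResult(contract["name"], rule["type"], column, bad))
    return results


# Hypothetical in-memory form of contracts/savings_account.yml
SAVINGS_CONTRACT = {
    "name": "savings_account",
    "quality_rules": [
        {"type": "no_nulls", "column": "branch_code"},
        {"type": "value_range", "column": "bal_amt", "min": 0},
    ],
}
```

Run against the generated data, a validator of this shape would report 40 violations for `branch_code` and 3 for `bal_amt`, matching the injected issues.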
④ Run Great Expectations Profiling — WOW MOMENT 2
Automated profiling discovers distributions and anomalies, and exposes the inconsistency between the live PostgreSQL balances and the T-1 (previous-day) MySQL balances.
great_expectations checkpoint run banking_checkpoint
# Opens HTML report: distributions, null counts, anomaly flags
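The cross-system comparison that profiling exposes can be sketched in plain Python. This is not the Great Expectations API. It only illustrates the measurement: summary statistics per source, plus the accounts whose balance differs between the live and T-1 copies. The function names and the tolerance value are assumptions.

```python
from statistics import mean
from typing import Optional

Balances = dict[str, Optional[float]]  # acct_id -> balance (None if missing)


def profile(balances: Balances) -> dict:
    """Basic profile of one source: row count, nulls, negatives, mean."""
    present = [b for b in balances.values() if b is not None]
    return {
        "rows": len(balances),
        "nulls": len(balances) - len(present),
        "negatives": sum(b < 0 for b in present),
        "mean": mean(present) if present else None,
    }


def balance_gaps(live: Balances, t_minus_1: Balances,
                 tolerance: float = 0.01) -> dict[str, tuple[float, float]]:
    """Accounts present in both sources whose balances differ by more than tolerance."""
    gaps = {}
    for acct_id in sorted(live.keys() & t_minus_1.keys()):
        a, b = live[acct_id], t_minus_1[acct_id]
        if a is not None and b is not None and abs(a - b) > tolerance:
            gaps[acct_id] = (a, b)
    return gaps
```

The size of `balance_gaps(...)` is the kind of number that can be tracked over time and alerted on.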
⑤ Build Semantic Layer with dbt
Staging models + customer_360 mart with total_exposure, overdue_flag, and balance_category.
cd dbt_project
dbt deps
dbt run
dbt test
cd ..
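To make the three mart columns concrete, here is a hedged sketch that runs comparable SQL against an in-memory SQLite database. It assumes `total_exposure` is balances plus outstanding loan amounts (consistent with the agent's Cypher further down the page). The `days_past_due` column and the balance thresholds are invented for illustration, so the real `customer_360.sql` may define these columns differently.

```python
import sqlite3

# Assumes every customer has at least one row in stg_acct_master.
CUSTOMER_360_SQL = """
WITH bal AS (
    SELECT cust_id, SUM(COALESCE(bal_amt, 0)) AS total_balance
    FROM stg_acct_master
    GROUP BY cust_id
),
loans AS (
    SELECT cust_id,
           SUM(COALESCE(outstanding_amt, 0)) AS total_outstanding,
           MAX(days_past_due) AS max_dpd          -- hypothetical column
    FROM stg_loan_hdr
    GROUP BY cust_id
)
SELECT b.cust_id,
       b.total_balance + COALESCE(l.total_outstanding, 0) AS total_exposure,
       CASE WHEN COALESCE(l.max_dpd, 0) > 0 THEN 1 ELSE 0 END AS overdue_flag,
       CASE WHEN b.total_balance < 10000  THEN 'low'      -- hypothetical thresholds
            WHEN b.total_balance < 100000 THEN 'medium'
            ELSE 'high' END AS balance_category
FROM bal b
LEFT JOIN loans l ON l.cust_id = b.cust_id
ORDER BY b.cust_id
"""


def demo_connection() -> sqlite3.Connection:
    """In-memory stand-in for the staging models, with two sample customers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stg_acct_master (cust_id TEXT, bal_amt REAL)")
    conn.execute(
        "CREATE TABLE stg_loan_hdr "
        "(cust_id TEXT, outstanding_amt REAL, days_past_due INTEGER)"
    )
    conn.executemany(
        "INSERT INTO stg_acct_master VALUES (?, ?)",
        [("CUST_0001", 5000.0), ("CUST_0002", 250000.0)],
    )
    conn.execute("INSERT INTO stg_loan_hdr VALUES ('CUST_0001', 20000.0, 15)")
    return conn


def build_customer_360(conn: sqlite3.Connection) -> list[tuple]:
    return conn.execute(CUSTOMER_360_SQL).fetchall()
```

In the dbt project the equivalent query would reference the staging models through `ref()` instead of naming tables directly.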
⑥ Load Knowledge Graph
Loads all customers, accounts, loans, and branches into Neo4j. Creates OWNS, HELD_AT, DISBURSED_AT relationships.
python knowledge_graph/load_graph.py
# Output: 1000 customers, 1500 accounts, 500 loans, N branches loaded
# Visit http://localhost:7474 to browse the graph visually
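The loader is described as using idempotent `MERGE`. The sketch below shows what that buys: a Cypher statement of the kind a loader might send, and a toy in-memory graph that mimics MERGE semantics so the effect of loading twice can be checked without a running Neo4j. The `acct_id` property, the `ToyGraph` class, and the `load` function are hypothetical, and the actual queries in `load_graph.py` may differ.

```python
# Cypher a loader of this kind might send in batches (parameter: $rows).
MERGE_OWNS = """
UNWIND $rows AS row
MERGE (c:Customer {cust_id: row.cust_id})
MERGE (a:SavingsAccount {acct_id: row.acct_id})
SET a.bal_amt = row.bal_amt
MERGE (c)-[:OWNS]->(a)
"""


class ToyGraph:
    """In-memory stand-in that mimics MERGE: match on key, create only if absent."""

    def __init__(self) -> None:
        self.nodes: dict[tuple[str, str], dict] = {}   # (label, key) -> properties
        self.rels: set[tuple] = set()                  # (start, type, end)

    def merge_node(self, label: str, key: str, **props) -> tuple[str, str]:
        node_id = (label, key)
        self.nodes.setdefault(node_id, {}).update(props)
        return node_id

    def merge_rel(self, start: tuple, rel_type: str, end: tuple) -> None:
        self.rels.add((start, rel_type, end))


def load(graph: ToyGraph, rows: list[dict]) -> None:
    """Apply the same three merges as MERGE_OWNS to the toy graph."""
    for row in rows:
        c = graph.merge_node("Customer", row["cust_id"])
        a = graph.merge_node("SavingsAccount", row["acct_id"], bal_amt=row["bal_amt"])
        graph.merge_rel(c, "OWNS", a)
```

Loading the same rows twice leaves the node and relationship counts unchanged, whereas a `CREATE`-based loader would duplicate them.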
⑦ Run the AI Agent — WOW MOMENT 3
The agent answers natural-language questions by having an Ollama-hosted model generate Cypher, running it against Neo4j, and returning the answer with node-level provenance.
# Make sure Ollama is running: ollama serve
python agent/agent.py
You: What is customer CUST_0042's total financial exposure?
Agent: Generating Cypher...
MATCH (c:Customer {cust_id: 'CUST_0042'})-[:OWNS]->(a)
RETURN c.name, sum(coalesce(a.bal_amt,0) + coalesce(a.outstanding_amt,0)) AS total
Answer: Rajesh Kumar has total exposure of ₹4,82,310
Nodes traversed: Customer[CUST_0042] → SavingsAccount[SAV_0042] → LoanAccount[LON_0021]
Source: Neo4j graph (loaded from PostgreSQL ACCT_MASTER + LOAN_HDR, 2026-05-15)
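The provenance lines in the transcript can be thought of as a small record attached to every answer. The sketch below shows one possible shape for that record and how the "Nodes traversed" line could be rendered from it. `AgentAnswer` and its fields are hypothetical and not taken from `agent/agent.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentAnswer:
    question: str
    cypher: str                          # the generated query, kept for audit
    answer: str
    nodes: tuple[tuple[str, str], ...]   # (label, id) pairs, in display order
    source: str

    def provenance(self) -> str:
        """Render the visited nodes as Label[id] → Label[id] → ..."""
        return " → ".join(f"{label}[{node_id}]" for label, node_id in self.nodes)

    def render(self) -> str:
        return "\n".join([
            f"Answer: {self.answer}",
            f"Nodes traversed: {self.provenance()}",
            f"Source: {self.source}",
        ])


example = AgentAnswer(
    question="What is customer CUST_0042's total financial exposure?",
    cypher="MATCH (c:Customer {cust_id: 'CUST_0042'})-[:OWNS]->(a) RETURN ...",
    answer="Rajesh Kumar has total exposure of ₹4,82,310",
    nodes=(("Customer", "CUST_0042"),
           ("SavingsAccount", "SAV_0042"),
           ("LoanAccount", "LON_0021")),
    source="Neo4j graph (loaded from PostgreSQL ACCT_MASTER + LOAN_HDR, 2026-05-15)",
)
```

Keeping the generated Cypher alongside the answer lets a reviewer rerun the exact query that produced it.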
🌟 The Three "Wow Moments" for a Client Walkthrough
- Wow 1: Contract Catches It. "We intentionally injected 40 null branch codes and 3 negative balances. The contract validator catches them in 2 seconds, before they reach the risk report. Without this, they'd silently corrupt the regulatory submission."
- Wow 2: Profiling Finds the Inconsistency. "Great Expectations shows the balance distribution gap between PostgreSQL (live) and MySQL (T-1). This 3-system inconsistency is invisible to the naked eye; profiling makes it measurable and alertable."
- Wow 3: Agent Answers with Provenance. "A non-technical user asked a natural language question. The agent generated Cypher, ran it against the knowledge graph, and returned the answer with a node-level audit trail. No SQL training required. Fully governed."
Project File Structure
banking-demo/
  docker-compose.yml          # PostgreSQL + MySQL + Neo4j + OpenMetadata
  .env                        # All secrets (never committed)
  requirements.txt            # Pinned Python dependencies
  mock_data/
    generate_data.py          # Idempotent data generator
  contracts/
    savings_account.yml       # Schema + freshness + quality rules
    loan_position.yml
    validate_contracts.py     # PASS/FAIL validator
  dbt_project/
    models/staging/           # stg_acct_master, stg_loan_hdr, stg_txn_log
    models/mart/
      customer_360.sql        # total_exposure, overdue_flag, balance_category
    metrics/
      portfolio_at_risk.yml   # Semantic metric definition
  great_expectations/
    expectations/             # Per-table expectation suites
    checkpoints/              # banking_checkpoint.yml
  ontology/
    banking_ontology.py       # RDFLib OWL: BankAccount hierarchy + GDPR annotations
  knowledge_graph/
    load_graph.py             # Loads Neo4j from PostgreSQL (idempotent MERGE)
  agent/
    agent.py                  # Ollama CLI agent: NL → Cypher → answer + provenance
  openmetadata/
    connectors/               # Ingestion configs for all 3 databases
  docs/                       # This GitHub Pages site