Context Engineering

Overview

What · How · Why · Where · Importance

❓

What

Context engineering is the systematic process of deciding what information an AI agent receives about your data, business rules, and schema — and how that information is structured to maximise reasoning accuracy.

⚙

How

Through: (1) Schema-aware prompts with column definitions. (2) Glossary injection — precise business term definitions. (3) Graph context — relevant subgraph retrieved for the query. (4) Provenance — showing the agent where data came from.

✔

Why

LLMs have vast general knowledge but no specific knowledge of your bank's data model. Without context, they guess column names, invent relationships, and produce plausible-but-wrong Cypher or SQL.

🏠

Where

At every AI–data boundary: Text-to-SQL agents, Text-to-Cypher agents, RAG pipelines, AI-assisted data quality rules, regulatory report drafting assistants.

⭐

Importance

In banking, a wrong AI answer about a customer's exposure or NPA status has regulatory and financial consequences. Context engineering is the difference between an AI that is useful and one that is dangerous.

The Context Stack

Layers of Context an AI Agent Needs

From most fundamental to most specific — each layer builds on the last.

Layer 1 — Schema Context

Table names, column names, data types, nullability, and primary/foreign key relationships. The agent cannot write correct SQL/Cypher without knowing the schema. Automatically generated from the data catalog.

Layer 2 — Semantic Context

Business glossary definitions injected into the system prompt. "When the user asks for 'exposure', use bal_amt + outstanding_amt. When they ask for 'NPA', filter on overdue_days > 90." Glossary-to-schema mapping eliminates ambiguity.

Layer 3 — Graph Context (Retrieval)

For graph-based queries, retrieve the relevant subgraph neighbourhood before asking the LLM to generate Cypher. If user asks about CUST_0042, pre-fetch that node's properties and relationship types — grounding the agent in real data.

Layer 4 — Provenance Context

Tell the agent which systems are authoritative for which data. "Use PostgreSQL ACCT_MASTER for live balances, MySQL ACCT_BALANCE_EOD for T-1 reporting, never use SQLite MIS for individual account data." Prevents hallucinated cross-system joins.

Layer 5 — Policy Context

Compliance constraints the agent must respect. "Do not return raw cust_id in responses — use masked form. Do not answer questions about specific customers without role check. Flag any response involving overdue_days as requiring compliance review."

🧠 Context Engineering vs Prompt Engineering

Prompt EngineeringWriting better instructions inside a single prompt. "Think step by step." "You are an expert SQL analyst." Useful but limited — can't compensate for missing factual context.
Context EngineeringSystematically curating WHAT information the agent receives — schema, glossary, retrieved facts, policy constraints. Scales. Evolves with the data model. Does not require per-query tuning.
RAG (Retrieval-Augmented Generation)Dynamically retrieving relevant context for each query from a vector store or graph. Part of context engineering, not a replacement for it — retrieval needs a high-quality knowledge base to retrieve from.
Fine-TuningEncoding domain knowledge into model weights. Expensive, hard to update, and still requires context engineering on top for dynamic data. Not the right tool for "know my bank's schema."
The Winning StackHigh-quality semantic layer (glossary + schema) + graph context retrieval + policy constraints + clear provenance → AI that answers banking questions accurately and auditably.

Architecture

Context Fiber Serving

How context is packaged and delivered to AI consumers at runtime.

📄 MCP — Model Context Protocol

Anthropic's open protocol for delivering structured context to AI agents. The knowledge catalog exposes an MCP server; agents connect and request context packages (schema, glossary, lineage) by topic or entity. Standardised, tool-agnostic.

🔁 GraphRAG

Retrieval from a knowledge graph rather than a vector store. Given "What is CUST_0042's exposure?", retrieve the CUST_0042 subgraph — Customer node, all OWNS edges, connected Account and Loan nodes with properties. Inject as structured context into the prompt.

📊 Semantic Query

Agent asks "What tables contain customer data?" — the semantic layer answers with entity-level understanding, not just keyword matching. Returns the customer_360 mart with its lineage and glossary links.

🔗 API Context Packaging

Context served as a structured JSON payload: {schema, glossary_terms, example_queries, policy_constraints}. Agents consume it before generating queries. Versioned alongside the data contracts — context changes when schema changes.

🔧 In the Demo

Context Engineering in `agent/agent.py`

The demo's AI agent implements context engineering in its system prompt: it injects the Neo4j schema (node labels, property names, relationship types), the business glossary definitions for total_exposure, portfolio_at_risk, and overdue_flag, and the provenance context ("use Neo4j graph for relationship queries, not the raw PostgreSQL tables"). When a user asks a natural language question, Ollama receives this rich context and generates accurate Cypher — then the agent returns the answer with node-level provenance.

← Semantic Layer AI Agents →