Context Engineering Prover MCP Connector for Claude
An AI dumped 80,000 tokens into a prompt, 64,000 of them unreferenced noise. It said 'best practice' to justify the structure and 'looks good' to measure quality. That is not context engineering; it is a copy-paste pipeline. This tool enforces five context axes: relevance auditing, priority structuring, token budgeting, evidence grounding, and quality measurement.
The Problem
Ask an LLM to construct a prompt with context. It will include the entire codebase because 'more context is better.' It will paste blocks in random order because 'the model will figure it out.' It will ignore token limits because 'the context window is large enough.' And it will measure quality with 'the output looks better.'
Every LLM commits five context engineering failures:
- Context Dumping — includes everything available without justifying each block. Attention decay means middle-position content gets 15-20% less recall.
- Unstructured Pasting — no priority ordering, no delimiters, no role labels. The model guesses what matters.
- Unbounded Allocation — no token budget, no waste analysis. Half the tokens may be unreferenced noise.
- Vibes-Based Instructions — 'I think this helps' and 'best practice' are not evidence. No A/B test, no documented pattern.
- Unmeasured Quality — 'The output seems better' is not a metric. No test cases, no baseline, no target.
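To make the waste claim concrete, here is a minimal sketch of how a waste ratio could be computed from per-block token counts. The `Block` type, its field names, and the block sizes are hypothetical and chosen only to reproduce the 80,000 / 64,000 figures from the opening example; they are not part of the connector's API.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One context block. All field names here are hypothetical."""
    name: str
    tokens: int
    referenced: bool  # did the output actually draw on this block?


def waste_ratio(blocks: list[Block]) -> float:
    """Fraction of prompt tokens sitting in blocks that were never referenced."""
    total = sum(b.tokens for b in blocks)
    if total == 0:
        return 0.0
    wasted = sum(b.tokens for b in blocks if not b.referenced)
    return wasted / total


# Illustrative numbers only: 80,000 tokens total, 64,000 unreferenced.
blocks = [
    Block("task_instructions", 4_000, True),
    Block("relevant_module", 12_000, True),
    Block("rest_of_codebase", 64_000, False),
]
print(f"waste ratio: {waste_ratio(blocks):.0%}")
```

How a block gets marked as referenced is left open here; in practice that judgment would have to come from the relevance audit.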
How It Works
The Context Engineering Prover forces the LLM to fill 5 reflection fields and commit to 5 Decision Pivots before concluding any context is well-engineered.
The 5 Context Axes
| Axis | Pivot | Rule |
|---|---|---|
| Relevance | Audited | Every block has a purpose and passes a removal test. |
| Structure | Ordered | Priority-ordered with semantic delimiters and role labels. |
| Bounds | Budgeted | Per-block token allocation with waste ratio quantified. |
| Grounding | Evidence-based | Each instruction cites a test result or documented pattern. |
| Measurement | Quantified | Named metric with baseline, target, and review cadence. |
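One way to picture the five reflection fields and five Decision Pivots is as a record per axis. The sketch below is an assumption about shape only: `AxisReport`, `missing_axes`, and the lowercase axis keys are hypothetical names, not the connector's actual schema.

```python
from dataclasses import dataclass

# Axis names taken from the table above; the lowercase keys are an assumption.
AXES = ("relevance", "structure", "bounds", "grounding", "measurement")


@dataclass
class AxisReport:
    reflection: str  # free-text reflection field for this axis
    pivot: bool      # Decision Pivot: True only if the axis rule is met


def missing_axes(report: dict[str, AxisReport]) -> list[str]:
    """Return axes that are absent or whose reflection field is blank."""
    return [
        axis for axis in AXES
        if axis not in report or not report[axis].reflection.strip()
    ]


# Example: four axes filled in, measurement skipped.
report = {
    "relevance": AxisReport("Each block passed a removal test.", True),
    "structure": AxisReport("Ordered by priority, delimited, labeled.", True),
    "bounds": AxisReport("Per-block budget set; waste ratio computed.", True),
    "grounding": AxisReport("Each instruction cites a test result.", True),
}
print(missing_axes(report))
```

A check like this only confirms that every field was filled; whether the reflection is true is a separate question.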
The Verdict Matrix
Axis 1 fails → CONTEXT_IRRELEVANT
Axis 2 fails → CONTEXT_UNSTRUCTURED
Axis 3 fails → CONTEXT_UNBOUNDED
Axis 4 fails → CONTEXT_UNGROUNDED
Axis 5 fails → CONTEXT_UNMEASURED
All pass → CONTEXT_PROVEN
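The matrix can be read as a simple lookup. The sketch below assumes two things the matrix does not state: that axes are checked in order and the first failure determines the verdict, and that a missing pivot counts as a failure. The function name and the lowercase axis keys are hypothetical.

```python
# Axis order and verdict codes follow the matrix above.
VERDICTS = [
    ("relevance", "CONTEXT_IRRELEVANT"),
    ("structure", "CONTEXT_UNSTRUCTURED"),
    ("bounds", "CONTEXT_UNBOUNDED"),
    ("grounding", "CONTEXT_UNGROUNDED"),
    ("measurement", "CONTEXT_UNMEASURED"),
]


def verdict(pivots: dict[str, bool]) -> str:
    """Return the code for the first failing axis, or CONTEXT_PROVEN.

    Assumption: first failure wins, and an absent pivot is a failure.
    """
    for axis, code in VERDICTS:
        if not pivots.get(axis, False):
            return code
    return "CONTEXT_PROVEN"


all_pass = {axis: True for axis, _ in VERDICTS}
print(verdict(all_pass))                       # CONTEXT_PROVEN
print(verdict({**all_pass, "bounds": False}))  # CONTEXT_UNBOUNDED
```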
Why It Works
Tool calls are obligations. The LLM cannot skip the relevance audit or ignore the token budget. It must justify each block with a removal test, priority-order with delimiters, allocate tokens per block, cite evidence for instruction choices, and define a measurable quality metric. Every rejection names the exact context axis that failed.