QA Arbiter MCP Connector for Claude
A test fails. Is the assertion wrong or is the code broken? Most agents guess, retry blindly, and deadlock the pipeline. QA Arbiter resolves this in one call: structured fault diagnosis with two boolean pivots that yield a deterministic verdict of TEST_ERROR, ENGINE_DEFECT, or BOTH_WRONG.
When a QA agent sees a failing test, it faces a critical question: is the test wrong, or is the code broken? Most agents skip this question entirely — they guess, retry blindly, or blame the wrong component. QA Arbiter eliminates this failure mode.
The Problem It Solves
In multi-agent pipelines, QA agents write tests and run them. When tests fail, two things can be true:
- The test's expected value is wrong: the agent miscalculated (e.g., wrote '05:25' instead of '04:45').
- The engine has a real bug: the code produces incorrect output (e.g., -02:-15 from a midnight crossover bug).
Without structured diagnosis, agents waste cycles: QA blames the developer → developer can't fix test code → pipeline deadlocks.
How It Works
QA Arbiter uses the Decision Pivot pattern from reasoning research. For each failing test, the agent must call this tool and fill in structured fields:
- Trace the engine function step-by-step with the test's inputs.
- Compare the vitest Received value against the trace.
- Compare the test's Expected value against the trace.
- Commit to two boolean pivots: receivedMatchesTrace and expectedMatchesTrace.
- The verdict follows deterministically from the pivots.
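The last step can be sketched in TypeScript. This is a minimal illustration, not the connector's actual code: only the two pivot names and the three verdict strings come from the description above. The mapping from pivots to verdict, the `deriveVerdict` name, and the `null` result for the case where both values match the trace are assumptions.

```typescript
// Hypothetical sketch: only the pivot names and verdict strings are from the text.
type Verdict = "TEST_ERROR" | "ENGINE_DEFECT" | "BOTH_WRONG";

interface Pivots {
  receivedMatchesTrace: boolean; // does vitest's Received value agree with the trace?
  expectedMatchesTrace: boolean; // does the test's Expected value agree with the trace?
}

// Assumed mapping: whichever side disagrees with the trace is at fault.
// Returns null when both sides match the trace, because an equality
// assertion could not have failed if Received and Expected were both
// equal to the same traced value.
function deriveVerdict(p: Pivots): Verdict | null {
  if (!p.receivedMatchesTrace && !p.expectedMatchesTrace) return "BOTH_WRONG";
  if (!p.receivedMatchesTrace) return "ENGINE_DEFECT";
  if (!p.expectedMatchesTrace) return "TEST_ERROR";
  return null;
}

// Example: the engine output agrees with the trace, the test's expectation does not.
console.log(deriveVerdict({ receivedMatchesTrace: true, expectedMatchesTrace: false }));
```

Under this assumed mapping, the verdict is a pure function of the two booleans, so the agent's only real decisions are the two comparisons against the trace.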
The tool validates logical consistency — if the agent says ENGINE_DEFECT but marked receivedMatchesTrace: true, the tool rejects the diagnosis with a clear explanation of the contradiction. The agent must re-analyze.
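A consistency check of this kind might look like the following sketch. The `Diagnosis` shape, the `validateDiagnosis` function, the requirement table, and the wording of the rejection message are all hypothetical. The source confirms only that ENGINE_DEFECT combined with receivedMatchesTrace: true is rejected; the other rows of the table are assumed by symmetry.

```typescript
type Verdict = "TEST_ERROR" | "ENGINE_DEFECT" | "BOTH_WRONG";

// Hypothetical input shape: the two pivots plus the verdict the agent claims.
interface Diagnosis {
  receivedMatchesTrace: boolean;
  expectedMatchesTrace: boolean;
  verdict: Verdict;
}

type Result = { ok: true } | { ok: false; reason: string };

// Assumed table of the pivot values each verdict implies.
const REQUIRED: Record<Verdict, { received: boolean; expected: boolean }> = {
  TEST_ERROR: { received: true, expected: false },
  ENGINE_DEFECT: { received: false, expected: true },
  BOTH_WRONG: { received: false, expected: false },
};

// Rejects a diagnosis whose claimed verdict contradicts its own pivots,
// and names each contradicting field in the reason.
function validateDiagnosis(d: Diagnosis): Result {
  const need = REQUIRED[d.verdict];
  const problems: string[] = [];
  if (d.receivedMatchesTrace !== need.received) {
    problems.push(
      `${d.verdict} requires receivedMatchesTrace: ${need.received}, got ${d.receivedMatchesTrace}`
    );
  }
  if (d.expectedMatchesTrace !== need.expected) {
    problems.push(
      `${d.verdict} requires expectedMatchesTrace: ${need.expected}, got ${d.expectedMatchesTrace}`
    );
  }
  return problems.length === 0
    ? { ok: true }
    : { ok: false, reason: problems.join("; ") };
}

// The contradiction described above: ENGINE_DEFECT with receivedMatchesTrace: true.
console.log(
  validateDiagnosis({
    verdict: "ENGINE_DEFECT",
    receivedMatchesTrace: true,
    expectedMatchesTrace: true,
  })
);
```

Returning the reason as data rather than throwing gives the agent something concrete to re-analyze against, which fits the "clear explanation of the contradiction" behavior described.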
Why It Works
- Tool calls are obligations; instructions are suggestions. Agents routinely ignore prompt instructions to "check your work." A tool call cannot be skipped: the agent must fill every field.
- Decision Pivots eliminate guesswork. Two booleans plus consistency validation yield a deterministic verdict, so the agent cannot submit a conclusion that contradicts its own pivots.
- Zero computation. The tool doesn't calculate anything. It forces the agent to show its reasoning, then validates that the reasoning is internally consistent.
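To illustrate the "every field must be filled, nothing is computed" idea, here is a small sketch of a completeness check. The field names other than receivedMatchesTrace and expectedMatchesTrace (`trace`, `received`, `expected`, `verdict`) and the `missingFields` helper are hypothetical. The connector's real input schema is not given in this description.

```typescript
// Hypothetical field list; only the two boolean pivots are named in the text.
const REQUIRED_FIELDS = [
  "trace",
  "received",
  "expected",
  "receivedMatchesTrace",
  "expectedMatchesTrace",
  "verdict",
] as const;

// Returns the names of required fields that are absent or empty.
// Nothing is calculated here: the check looks only at whether the agent
// supplied each piece of its reasoning.
function missingFields(input: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS.filter(
    (field) => input[field] === undefined || input[field] === ""
  );
}

// An incomplete call: the agent skipped the trace and both pivots.
console.log(missingFields({ received: "05:25", expected: "04:45", verdict: "TEST_ERROR" }));
```

A call that leaves fields out can be rejected before any consistency validation runs, so the agent cannot reach a verdict without writing down the trace and both comparisons.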