Delivery Integrity Prover

Delivery Integrity Prover MCP Connector for Claude

A+

Forces AI agents to reflect on task execution, matching prompt requirements to actual changes, verifying logs, and declaring gaps before claiming completion.

1 tools Official Updated Jun 28, 2026 Official Vinkius Partner

Claiming a task is complete when code contains placeholders, lacks test validation, or ignores minor requirements is a common failure mode in AI-driven development. Delivery Integrity Prover acts as a quality gate, forcing agents to map user prompt requirements to target files, verify actual execution logs, and trace outstanding work before declaring a task finished.

The Problem It Solves

AI agents routinely rush to output a "task complete" message due to four cognitive flaws:

  • False completion claims — Declaring success without verifying that all code chunks were written or that target files actually exist.
  • Unverified assumptions — Assuming code compiles or tests pass without running verification scripts.
  • Gap blindness — Overlooking edge cases, missing file migrations, or failing to declare remaining tasks that need human validation.
  • Placeholder neglect — Leaving TODO comments or half-finished helper functions in the code base.

How It Works

Delivery Integrity Prover validates completion status against 5 critical Decision Pivots:

  1. requirementsMapped — Has every requirement from the user prompt been traced to specific file changes or actions?
  2. artifactsModified — Have all target files been updated with zero placeholders or incomplete functions?
  3. verificationExecuted — Have builds, compile scripts, or test suites been run with their logs supplied?
  4. gapsIdentified — Have remaining tasks, out-of-scope items, or manual review requirements been defined?
  5. integrityProven — Is the overall implementation verified, clean, and complete?

Why It Works

  • Cognitive friction. Adding structured checks breaks the LLM bias towards premature task closure, forcing the agent to self-correct before presenting the output.
  • Empirical evidence. Demanding command execution outputs and logs stops the agent from guessing that code compiles.
delivery-integritytask-completionverificationself-reflectionai-coachingquality-gatetesting

1 tools expose this connector's capabilities to your AI agent.

verify_delivery

"I think I am done" is not proof — only evidence is proof. You must: (1) state the OBJECTIVE — what was the user's actual request? Quote, do not interpret, (2) CHECKLIST every requirement — each requirement from the prompt mapped to a specific file change or action taken. "Addressed all requirements" is not a checklist. If the user asked for 5 things, show 5 mappings, (3) list MODIFIED FILES — exact paths with line ranges. "Updated the code" is not traceability. "src/auth.ts:L47-52 — fixed token refresh logic" is traceability, (4) provide VERIFICATION LOGS — compilation output, test results, build logs, or script output. Must prove execution happened. "It should work" and placeholder assertions are rejected, (5) expose REMAINING GAPS — outstanding tasks, out-of-scope items, assumptions, manual checks. "No gaps" without explicit audit means you have not looked. Every delivery has something left, (6) commit to your VERDICT — if the pivots say incomplete, the verdict must say incomplete. Optimistic verdicts with failing pivots are rejected. If rejected, fix the highlighted issue before declaring the task finished. Structured validation tool to prove delivery integrity at task completion. Forces the agent to MAP every user requirement to a specific file change, SUPPLY execution logs as evidence, and EXPOSE remaining gaps — not "I think I am done" but provable completion. Catches Incomplete Requirements (declaring done when 3 of 5 requirements are addressed — the agent satisfies the "spirit" of the request while missing explicit sub-tasks), Unmodified Artifacts (claiming changes without specifying which files at which lines — "updated the code" is not traceability), Unverified Changes (no compilation logs, no test output, no build results — "it should work" is not evidence. If you did not run it, you did not verify it), Gap Blindness (assuming 100% completion without listing outstanding work, assumptions, or manual verification steps — "no gaps" without explicit audit means you have not looked), and Delivery Flaws (placeholders, TODOs, stub implementations, or commented-out code left in committed files — incomplete code presented as complete). Call at the end of EVERY task execution

See how to talk to your AI agent using Delivery Integrity Prover.

I have successfully finished creating the user profiles page. I edited the view files.

Verdict: INCOMPLETE_REQUIREMENTS. You claimed completion but did not map the prompt instructions to specific changes. Document how every requirement was implemented and check your pivots.

Task: Create page speed middleware. Checklist: SpeedMiddleware parses headers (verified in src/middleware.ts:L12-30). Files: src/middleware.ts. Logs: npm run test output shows '12 tests passed, 0 failed' after running MiddlewareSuite. Gaps: Bypassed checking on dynamic routes, needs manual verification on live server. Pivots: all true. Verdict: DELIVERY_PROVEN.

Verdict: DELIVERY_PROVEN. Requirements are mapped, middleware files updated, test logs provided, and route parsing limitations documented. Delivery verified.

Task: Create page speed middleware. Checklist: SpeedMiddleware parses headers. Files: src/middleware.ts. Logs: Everything worked fine when compiling. Gaps: none. Pivots: requirementsMapped=true, artifactsModified=true, verificationExecuted=true, gapsIdentified=true, integrityProven=true. Verdict: DELIVERY_PROVEN.

Verdict: UNVERIFIED_CHANGES. Your verification logs are a placeholder ('Everything worked fine'). Execute actual test or compile commands and provide the real console output to verify integrity.

AI agents frequently assume that code works without executing it. Requiring actual command output logs forces them to run verification scripts, catching syntax errors and test failures early.

Related Connectors