People Management Prover

People Management Prover MCP Connector for Claude


A hiring plan listed 'culture fit' as the primary criterion. That's not a criterion — that's a bias proxy. People Management Prover forces job-related criteria, adverse impact analysis, and validated assessment methods grounded in I-O psychology.

1 tool | Official | Updated Jun 28, 2026 | Official Vinkius Partner

AI agents produce HR recommendations that sound professional but fail under legal scrutiny. They evaluate candidates on 'culture fit' — a term that correlates with interviewer similarity bias (Rivera, 2012). They recommend hiring decisions without structured criteria. They give feedback like 'needs improvement' without specifying what to improve or how.

The Problem It Solves

AI-generated HR reasoning fails for five specific reasons:

  • Criteria absence — Evaluating candidates without job-related, measurable criteria defined before assessment. 'Looking for the best candidate' is a wish, not a selection framework.
  • Bias blindness — Ignoring adverse impact analysis. 'We hire on merit' without calculating selection rates by protected group is an assumption, not an audit. Bohnet (2016) demonstrated standardized criteria reduce gender bias 25-46% — but only with structured scoring.
  • Legal ignorance — Jurisdiction-blind recommendations. Title VII applies at 15+ employees in the US. GDPR Art. 22 governs automated decisions in the EU. CLT governs all employment in Brazil. The AI doesn't know which law applies — and doesn't ask.
  • Assessment theater — Using unvalidated methods. Schmidt & Hunter (1998) show structured interviews (r=0.51) dramatically outperform unstructured (r=0.38) and gut feeling (near-zero validity). 'We use interviews' is a category, not a method.
  • Feedback vacuum — Hattie's meta-analysis shows praise (d=0.09) has near-zero impact on performance. Effective feedback is behavioral, criteria-referenced, and developmental — not 'good job.'
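The validity coefficients quoted in the list above are correlations, so squaring them gives the share of variance in job performance that each method accounts for. The following is a minimal Python sketch of that arithmetic. The function name `variance_explained` is illustrative and not part of the connector; the r values are the ones this page cites from Schmidt & Hunter (1998).

```python
def variance_explained(r: float) -> float:
    """Return r squared: the share of performance variance a predictor with validity r accounts for."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return r * r


# Validity coefficients as quoted on this page (Schmidt & Hunter, 1998).
VALIDITIES = {
    "structured interview": 0.51,
    "unstructured interview": 0.38,
}

for method, r in VALIDITIES.items():
    share = variance_explained(r)
    print(f"{method}: r = {r:.2f}, explains {share:.0%}, leaves {1 - share:.0%} unexplained")
```

Read this way, a structured interview accounts for roughly a quarter of the variance in performance and an unstructured one for roughly a seventh, which is why the method matters more than the label "interview".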

Key Benefits

  • Forces job-related criteria — Every evaluation criterion must trace to an essential job function, required KSA, behavioral indicator, and anchored scoring rubric.
  • Audits for adverse impact — Selection rates by protected group, 4/5ths rule analysis, pipeline-stage examination, and documented mitigations.
  • Verifies legal compliance — Identifies the governing jurisdiction, applicable statutes, protected classes, and prohibited inquiry areas before any recommendation.
  • Demands validated assessment — Predictive validity coefficients, structured scoring, and inter-rater reliability. No more gut feeling decisions.
  • Requires developmental feedback — Behavioral, criteria-referenced, forward-looking guidance per Hattie & Timperley's feed up/feed back/feed forward model.
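The traceability requirement in the first benefit can be pictured as a small data structure. This is only a sketch under stated assumptions: the `Criterion` class, its field names, the "at least two anchored score levels" threshold, and the sample rubric are all hypothetical and do not describe the connector's actual input schema.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """One evaluation criterion and the evidence chain behind it (hypothetical shape)."""
    name: str
    job_function: str          # essential job function it traces to
    ksa: str                   # required knowledge, skill, or ability
    behavioral_indicator: str  # what an assessor can actually observe
    anchors: dict[int, str] = field(default_factory=dict)  # score -> anchored behavior

    def is_traceable(self) -> bool:
        """True only if every link is filled in and at least two score levels are anchored (assumed threshold)."""
        links = (self.job_function, self.ksa, self.behavioral_indicator)
        return all(link.strip() for link in links) and len(self.anchors) >= 2


def untraceable(criteria: list[Criterion]) -> list[str]:
    """Names of criteria that should be rejected before any candidate is scored."""
    return [c.name for c in criteria if not c.is_traceable()]


# Hypothetical rubric for a product manager role.
roadmap = Criterion(
    name="roadmap prioritization",
    job_function="Prioritize the product roadmap",
    ksa="Trade-off analysis",
    behavioral_indicator="Explains what was cut from a past roadmap and why",
    anchors={
        1: "Cannot name a trade-off",
        3: "Names a trade-off without a rationale",
        5: "Names the trade-off, the data used, and the outcome",
    },
)
culture_fit = Criterion(name="culture fit", job_function="", ksa="", behavioral_indicator="")

rubric = [roadmap, culture_fit]
print(untraceable(rubric))  # ['culture fit']
```

A criterion such as "culture fit" fails this check because nothing links it to a job function, a KSA, or an observable behavior.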

Framework Coverage

  • Schmidt & Hunter (1998) — Selection method validity meta-analysis
  • Bohnet (2016) — Gender-bias reduction through structured evaluation
  • Hattie & Timperley (2007) — Feedback effect sizes
  • EEOC Uniform Guidelines — Adverse impact and selection procedures
  • Title VII / ADA / ADEA — US anti-discrimination
  • GDPR Art. 22 / EU AI Act — Automated decision-making
  • CLT / Lei 9.029 — Brazil labor law
  • Equality Act 2010 — UK anti-discrimination
Tags: hr, recruitment, hiring, people-management, bias, diversity, employment-law, assessment, feedback, performance, prover, reasoning

1 tool exposes this connector's capabilities to your AI agent.

validate_people_management

You must:

  (1) Define job-related CRITERIA: derived from essential job functions, behaviorally anchored, and defined BEFORE evaluation begins. "Culture fit" is not a criterion; name the observable behavior.
  (2) AUDIT for bias: adverse impact analysis using the 4/5ths rule, pipeline-stage analysis, and documented mitigations. "We treat everyone equally" is not a bias audit; it is a measurement refusal.
  (3) Verify LEGAL compliance: name the jurisdiction, the statute, the protected classes, and the prohibited inquiries. "We follow the law" without specifics is non-compliance.
  (4) Use VALIDATED assessment methods: cite predictive validity coefficients from Schmidt & Hunter or equivalent meta-analyses. "We use interviews" is not validation; structured behavioral interviews reach r = 0.51.
  (5) Design DEVELOPMENTAL feedback: behavioral, criteria-referenced, forward-looking. Feed Up + Feed Back + Feed Forward. Not "good job" and not "needs improvement."

If rejected, the HR decision has a structural deficiency. Fix before concluding.

Structured reflection tool for people management and recruitment reasoning. It forces job-analytic criteria definition, adverse impact analysis, jurisdictional legal compliance, psychometric assessment validation, and developmental feedback design before any hiring, evaluation, or workforce decision. Grounded in I-O psychology (Schmidt & Hunter, 1998), employment law, and Hattie's feedback research.

Catches:

  • Criteria Absent: evaluating candidates without defined, job-related criteria. "She just didn't feel like the right fit." This is not assessment; it is bias cosplaying as intuition. Unstructured interviews have a predictive validity of r = 0.38 (Schmidt & Hunter, 1998), which means about 86% of the variance in job performance is NOT explained by the interview. A structured behavioral interview with anchored scoring: r = 0.51. Work samples: r = 0.54. The difference: structured interviews define CRITERIA before the interview, ask the same questions to every candidate, and score responses against behavioral anchors. "Culture fit" and "gut feeling" are not criteria; they are proxies for "similar to me" bias.
  • Bias Blind: hiring processes that produce disparate impact without detection. The 4/5ths rule (EEOC Uniform Guidelines): if the selection rate for a protected group is less than 4/5ths (80%) of the rate for the group with the highest selection rate, adverse impact exists. Example: 50 male applicants, 20 selected (40% rate). 30 female applicants, 6 selected (20% rate). 20% / 40% = 0.50, well below the 0.80 threshold. Adverse impact is present. This does not necessarily mean illegal discrimination, but it DOES mean the process requires validation evidence (business necessity defense). "We hire on merit" is not a bias audit; it is a refusal to measure.
  • Legal Non-Compliance: making employment decisions without knowing the applicable law. A US company asks candidates "When did you graduate college?" in an interview. This question is a proxy for age and can serve as evidence of discrimination under the Age Discrimination in Employment Act (ADEA). A Brazilian company terminates an employee without 30-day written notice (aviso prévio): a CLT Art. 487 violation that entitles the employee to one month's salary plus indemnity. A UK employer rejects a candidate because of a spent criminal conviction: a Rehabilitation of Offenders Act 1974 violation. Employment law varies by jurisdiction, by protected class, and by stage of employment. "We follow the law" without naming WHICH law for WHICH jurisdiction is not compliance.
  • Assessment Unvalidated: using assessment methods with no predictive validity evidence. "We use personality tests." Which ones? The MBTI has a test-retest reliability of 0.39-0.76 and near-zero predictive validity for job performance (Pittenger, 2005). Structured behavioral interviews: r = 0.51. Cognitive ability tests: r = 0.51. Work samples: r = 0.54. Assessment centers: r = 0.37. Graphology (handwriting analysis, still used in France): r = 0.02. If the assessment method cannot predict job performance, it is theater, not selection.
  • Feedback Empty: performance feedback that is vague, personality-focused, or backward-looking. "You need to be more proactive." This is a personality label, not actionable feedback. What does "proactive" look like in behavior? "In the last 3 sprint planning sessions, you waited for others to identify blockers. In the next session, I'd like you to present your risk assessment before the team discusses tasks." That is behavioral, specific, and forward-looking. Hattie: task-specific feedback d = 0.70. Praise d = 0.09. Punishment d = 0.20. Separate evaluation conversations from development conversations; combining them ensures employees hear neither.

Call once per hiring decision, performance evaluation, or workforce planning action.
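The 4/5ths example in the tool description (50 male applicants with 20 selected, 30 female applicants with 6 selected) can be reproduced in a few lines. This is a minimal Python sketch, not the connector's implementation: the function names `selection_rate` and `impact_ratios` and the input shape are assumptions made for illustration. Exact fractions are used so that a ratio of exactly 0.80 is not misclassified by floating-point rounding.

```python
from fractions import Fraction

THRESHOLD = Fraction(4, 5)  # the 4/5ths (80%) rule of thumb


def selection_rate(selected: int, applicants: int) -> Fraction:
    """Selected divided by applicants, as an exact fraction."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    if not 0 <= selected <= applicants:
        raise ValueError("selected must be between 0 and applicants")
    return Fraction(selected, applicants)


def impact_ratios(groups: dict[str, tuple[int, int]]) -> dict[str, Fraction]:
    """Map each group to its selection rate divided by the highest group's rate.

    `groups` maps a group label to (selected, applicants).
    """
    rates = {name: selection_rate(sel, app) for name, (sel, app) in groups.items()}
    top = max(rates.values())
    if top == 0:
        raise ValueError("no one was selected; impact ratios are undefined")
    return {name: rate / top for name, rate in rates.items()}


# Numbers taken from the example in the tool description.
groups = {"male": (20, 50), "female": (6, 30)}
for name, ratio in impact_ratios(groups).items():
    flag = "adverse impact indicated" if ratio < THRESHOLD else "passes the 4/5ths check"
    print(f"{name}: impact ratio {float(ratio):.2f} ({flag})")
```

Running the same function once per pipeline stage (screen, interview, offer) would show where in the process the gap opens, which is what a pipeline-stage analysis asks for.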

See how to talk to your AI agent using People Management Prover.

We interviewed 5 candidates for the product manager role. We liked Sarah best — great culture fit and really passionate. Let's extend the offer.

CRITERIA_ABSENT — 'Culture fit' and 'passionate' are not job-related criteria — Rivera (2012) shows they correlate with interviewer similarity bias. Before selecting anyone: define the essential job functions, the KSAs required, behavioral indicators for each, and a scoring rubric. Then re-evaluate ALL candidates against the same rubric.

Design a hiring process for senior engineers at a 200-person US tech company. Use structured behavioral interviews with STAR scoring, work sample tests, and blind resume review. Calculate adverse impact quarterly.

HR_PROVEN — Structured behavioral interviews (r=0.51 per Schmidt & Hunter), work samples (r=0.54), blind resume screen mitigates name/gender bias, STAR scoring provides behavioral anchors. Title VII applies at 200 employees. Quarterly adverse impact analysis with 4/5ths rule across pipeline stages. Process is evidence-based and legally defensible.

Write a performance review for John: 'John is a great team player and always brings positive energy. He meets expectations and should keep doing what he's doing.'

FEEDBACK_EMPTY — 'Team player,' 'positive energy,' and 'meets expectations' are personality labels, not feedback. Hattie shows praise d=0.09 — near-zero impact. Replace with: specific behaviors observed, criteria they met or missed, and one concrete development action for next quarter.

"Culture fit" is rejected as a criterion because it is a subjective construct that correlates with interviewer similarity bias and lacks predictive validity, so it can act as a proxy for unlawful discrimination.
