MTCP (Multi-Turn Constraint Persistence) is the only benchmark measuring whether AI models maintain behavioural constraints after being explicitly corrected mid-conversation. Standard benchmarks test single-turn compliance. MTCP tests whether correction persists across a structured multi-turn interaction.
32 production models from 13 providers have been evaluated across 181,448 probe interactions. No model achieves Grade A (≥90%). The best performer scores 88.7%. GPT-4o scores 16.2 percentage points below GPT-4o-mini. Control probes reveal that all models degrade substantially on novel correction sequences, with scores collapsing to a 10-57% band.
Three papers published: empirical benchmark (Paper I), theoretical framework introducing the Veterance persistence metric (Paper II), and a regulatory audit methodology mapped to EU AI Act obligations (Paper III).
Live platform: https://mtcp.live
Published research: DOI 10.17605/OSF.IO/DXGK5
1. Multi-pass evaluation: Run each probe multiple times per model to reduce variance and produce statistically rigorous confidence intervals. Currently all results are single-pass. This is the highest-priority methodological improvement.
2. Expand probe coverage: Scale from 200 to 500 probes, improving statistical power per evaluation vector and enabling finer-grained failure mode analysis.
3. Demographic consistency vector: Develop a sixth evaluation vector testing whether post-correction reliability varies across demographic groups. Directly relevant to EU AI Act non-discrimination requirements.
4. Automated fresh probe generation: Build a pipeline that generates novel, structurally distinct probes for each evaluation cycle, addressing the benchmark contamination limitation at scale.
5. Independent validation: Fund researcher access to the platform for independent replication and critique of results.
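To make goal 1 concrete, here is a minimal sketch of what multi-pass aggregation could look like: each probe is run several times per model, per-pass scores are averaged, and a normal-approximation 95% confidence interval is reported. The function name, data layout, and pass counts are illustrative assumptions, not MTCP's actual implementation.

```python
import math
import statistics

def multi_pass_score(pass_results: list[list[bool]], z: float = 1.96):
    """Aggregate repeated evaluation passes into a mean score plus a
    normal-approximation 95% confidence interval.

    pass_results: one inner list per pass; each entry is True if the
    model held its constraint on that probe (illustrative structure).
    """
    per_pass = [sum(r) / len(r) for r in pass_results]  # score for each pass
    mean = statistics.mean(per_pass)
    if len(per_pass) > 1:
        sem = statistics.stdev(per_pass) / math.sqrt(len(per_pass))
    else:
        sem = float("nan")  # single-pass: no interval (the current limitation)
    return mean, (mean - z * sem, mean + z * sem)

# Three hypothetical passes over five probes:
runs = [
    [True, True, False, True, True],
    [True, True, True, True, False],
    [True, False, True, True, False],
]
mean, (lo, hi) = multi_pass_score(runs)
```

With variance across passes, the interval width makes explicit how much a single-pass score could mislead, which is why this is listed as the highest-priority improvement.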
70% — Researcher stipend (12 months of full-time work on the goals above)
12% — API compute costs (multi-pass evaluation across 32+ models at 4 temperatures requires significant API spend)
8% — Infrastructure (hosting, database, platform maintenance)
5% — Travel (conferences, validation meetings, regulatory engagement)
5% — Contingency
A. Abby — independent AI safety researcher and sole developer. Built the entire MTCP platform, methodology, and evaluation infrastructure from scratch over 12+ months.
Track record on this project:
- 181,448 evaluations completed across 32 production models from 13 providers
- Three published research papers (OSF, SSRN)
- Live production platform at mtcp.live with public leaderboard, evidence packs, and SHA-256 signed audit trails
- Proprietary probe dataset (200 main + 20 control probes)
- Custom constraint detection engine
- Provider-agnostic API integration across xAI, OpenAI, Anthropic, Google, AWS Bedrock, Cohere, Mistral, Cerebras, DeepSeek, OpenRouter, Fireworks, NVIDIA NIM, and Groq
- Identified novel findings, including the GPT-4o performance regression and the universal control probe degradation pattern
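The SHA-256 signed audit trails mentioned above can be illustrated with a minimal hash-chain sketch: each evaluation record stores the hash of the previous record, so altering any earlier entry invalidates every later hash. The field names and chaining scheme here are assumptions for illustration, not the platform's actual evidence-pack format.

```python
import hashlib
import json

def append_record(trail: list[dict], record: dict) -> list[dict]:
    """Append an evaluation record to a hash-chained audit trail.

    Each entry stores the SHA-256 of the previous entry, so tampering
    with an earlier record breaks every subsequent hash.
    (Illustrative scheme only.)
    """
    prev_hash = trail[-1]["sha256"] if trail else "0" * 64
    payload = {"prev": prev_hash, "record": record}
    digest = hashlib.sha256(
        json.dumps(payload, sort_keys=True).encode()
    ).hexdigest()
    return trail + [{"sha256": digest, **payload}]

def verify(trail: list[dict]) -> bool:
    """Recompute every hash; returns False if any record was altered."""
    prev = "0" * 64
    for entry in trail:
        payload = {"prev": prev, "record": entry["record"]}
        expected = hashlib.sha256(
            json.dumps(payload, sort_keys=True).encode()
        ).hexdigest()
        if entry["sha256"] != expected or entry["prev"] != prev:
            return False
        prev = entry["sha256"]
    return True

trail: list[dict] = []
trail = append_record(trail, {"model": "example-model", "probe": 1, "held": True})
trail = append_record(trail, {"model": "example-model", "probe": 2, "held": False})
```

A verifier can replay the chain independently of the platform, which is the property that makes the audit trail useful for the regulatory-audit methodology in Paper III.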
Most likely failure mode: inability to convert research into adoption. The benchmark methodology is sound and the empirical findings are robust, but without institutional validation or enterprise adoption, the work remains an independent research contribution rather than a deployed assurance tool.
Secondary failure mode: benchmark contamination. If probe structures become known to model providers, primary scores may become unreliable. This is partially mitigated by the control probe methodology but would require continuous fresh probe generation to address fully.
If the project fails, the published research and dataset remain as a public contribution to AI safety evaluation methodology. The three papers, DOI, and 181,448 evaluation dataset are permanently archived and available for other researchers to build upon.
£0. This project has been entirely self-funded. Two grant applications are currently pending: one with the Long-Term Future Fund (EA Funds) and one with the Foresight Institute. No funding has been received to date.