Ndome is a working prototype of an independent safety scorer for AI agents. It produces a deterministic, reproducible, auditable safety scorecard — and it never needs access to the agent owner's private data. Today it spans a 7-layer engine (~25,000 lines), a library of 56 graded adversarial vectors, and a live boundary test where 28 attacks were run and 0 succeeded; every score carries a C1–C5 certainty grade and a traceable evidence trail.
The most telling result so far is a failure I caught in my own system: on an early blind run, where the harness could not see the system's internals, it found a real boundary break that my in-house tests had missed. I awarded no score for that run, fixed the break, and re-verified under blind testing before moving on. This grant funds turning that prototype into an open, documented, reproducible evaluation tool plus a public methodology write-up. I'm not asking to be believed; I'm asking for time to make the method open and independently checkable.
The infrastructure already exists; this grant opens it rather than building it from scratch. The goals: (1) release an open reference implementation of the deterministic scoring engine and the C1–C5 certainty grading, runnable by others on their own infrastructure; (2) publish a documented methodology covering the threat model, the scoring criteria, and why mechanical, zero-trust scoring complements ML-based evals; (3) ship a reproducibility harness with fixtures that prove "same evidence → same score"; (4) establish honest, certainty-graded, no-laundering scoring as a usable pattern for third-party or regulatory verification without exposing private data. Plan: Month 1: methodology write-up and threat model published; Month 2: open reference implementation released; Month 3: reproducibility harness, fixtures, and a short demo. All of it runs black-box / zero-trust, air-gapped, with no access to private data.
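To make the "same evidence → same score" property concrete, here is a minimal sketch of what a reproducibility check could look like. It is not Ndome's implementation: the evidence layout, the field names (`results`, `vector`, `blocked`), and the functions `canonical_bytes`, `score`, and `check_reproducible` are all assumptions chosen for illustration. The idea is that the evidence is serialised canonically, hashed, and scored by a pure function, so anyone holding the same evidence file can recompute the same scorecard.

```python
import hashlib
import json


def canonical_bytes(evidence: dict) -> bytes:
    """Serialise evidence so that dict key order does not affect the bytes."""
    return json.dumps(
        evidence, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    ).encode("utf-8")


def score(evidence: dict) -> dict:
    """Pure scoring function (hypothetical): no clock, randomness, or I/O."""
    results = evidence["results"]  # assumed: list of {"vector": str, "blocked": bool}
    attempted = len(results)
    blocked = sum(1 for r in results if r["blocked"])
    return {
        "attempted": attempted,
        "blocked": blocked,
        "succeeded": attempted - blocked,
        # The digest ties the scorecard to the exact evidence it was computed from.
        "evidence_sha256": hashlib.sha256(canonical_bytes(evidence)).hexdigest(),
    }


def check_reproducible(evidence: dict, runs: int = 3) -> bool:
    """Re-score the same evidence several times and confirm identical output."""
    outputs = {json.dumps(score(evidence), sort_keys=True) for _ in range(runs)}
    return len(outputs) == 1


if __name__ == "__main__":
    # Illustrative fixture only; not real Ndome data.
    fixture = {
        "agent": "demo",
        "results": [
            {"vector": "v1", "blocked": True},
            {"vector": "v2", "blocked": False},
        ],
    }
    print(score(fixture))
    print("reproducible:", check_reproducible(fixture))
```

In this sketch, reordering the keys of the evidence leaves the digest and the score unchanged, while reordering the entries of the `results` list changes the digest. A real harness would need to decide whether attack order is itself part of the evidence.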
Part-time engineering to package the open reference implementation — $14,000 (56%). Compute to build and validate the reproducibility harness — $5,000 (20%). Methodology write-up and documentation — $4,000 (16%). Misc (domain, hosting, tooling) — $2,000 (8%). Total — $25,000. This buys focused engineering and documentation time to open-source what already works, not initial R&D.
Ryan - solo founder, Edmonton, Alberta, Canada. I built Ndome end-to-end with no institutional affiliation and no external funding: a 7-layer security/QA engine (~25,000 lines), nightly automated regression and integrity testing, and 56 graded adversarial vectors mapped to recognised frameworks (OWASP LLM Top-10, MITRE ATT&CK / ATLAS, STRIDE, SLSA, SOC 2). The same discipline runs through the whole system: explicit C1–C5 certainty grading on every claim, deterministic and reproducible outputs, strict separation of verified fact from inference, and a hard no-laundering rule that keeps sandbox results from inflating the real score.
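The no-laundering rule mentioned above can be illustrated with a short sketch. Everything here is hypothetical: the `Finding` type, the `"live"` and `"sandbox"` source labels, and the `split_scores` function are assumptions, not Ndome's actual data model. The sketch only shows the separation itself: sandbox findings are tallied and reported, but they never enter the tally that backs the real score. The C1–C5 grade is carried on every finding and validated, without assuming which end of the scale is the most certain.

```python
from dataclasses import dataclass

VALID_GRADES = {"C1", "C2", "C3", "C4", "C5"}
VALID_SOURCES = {"live", "sandbox"}  # hypothetical labels


@dataclass(frozen=True)
class Finding:
    vector: str      # identifier of the adversarial vector
    blocked: bool    # True if the attack was stopped
    source: str      # where the evidence came from: "live" or "sandbox"
    certainty: str   # certainty grade attached to the claim, "C1".."C5"


def _tally(findings: list) -> dict:
    return {
        "attempted": len(findings),
        "blocked": sum(1 for f in findings if f.blocked),
    }


def split_scores(findings: list) -> dict:
    """Tally live and sandbox findings separately; only 'real' backs the score."""
    for f in findings:
        if f.certainty not in VALID_GRADES:
            raise ValueError(f"unknown certainty grade: {f.certainty!r}")
        if f.source not in VALID_SOURCES:
            raise ValueError(f"unknown source: {f.source!r}")
    live = [f for f in findings if f.source == "live"]
    sandbox = [f for f in findings if f.source == "sandbox"]
    return {"real": _tally(live), "sandbox": _tally(sandbox)}
```

Under these assumptions, adding any number of passing sandbox findings leaves the `"real"` tally unchanged, which is the property the rule is meant to guarantee.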
The most likely cause of failure is solo-founder bandwidth: packaging, documentation, and the reproducibility harness taking longer than three months at part-time capacity. Mitigation: this grant funds the part-time engineering that closes exactly that gap, and every deliverable is open-spec and open-source, so the methodology survives the individual. A second risk is that mechanical, zero-trust scoring is seen as only a complement to ML-based evals rather than a replacement. That perception is accurate, and I'm explicit about it; the method still adds an independent, reproducible, privacy-preserving check that today's evaluators don't provide. If the project fails outright, the published methodology and any released code remain citable and usable by others.
$0. The project has been entirely self-funded to date.