Francesca Gomez

@wiserhuman

Founder of Wiser Human with a background in risk management, focused on developing technical human control mechanisms for agentic AI systems that are robust to agent subversion

https://www.linkedin.com/in/francescagomez/
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

I’m the founder of Wiser Human, an AI safety organisation focused on developing advanced monitoring and control systems for agentic AI. Our work interrogates the limits of existing safeguards, identifying where they fail with agentic AI so that we can build verifiable, empirical defences and understand where the limits of control lie. By exposing these limits, we aim to inform safer decisions for long-term AI safety.

Previously, I was the CEO & Co-Founder of Smarter Human (2018–2024), where I helped startups assess the ethics and compliance of their AI and machine learning developments, designing scalable governance frameworks. Prior to that, I worked in digital risk management at HSBC, Deloitte, American Express, and Tandem, helping organisations embed responsible AI and technology risk management into their products and operations.

I hold an MSc in Human-Centred Computing and a BSc in Artificial Intelligence from the University of Sussex, and published a paper on Dynamic Safety Cases for Frontier AI with the Arcadia Impact Taskforce in December 2024.

Projects

Develop technical framework for human control mechanisms for agentic AI systems

Comments

Develop technical framework for human control mechanisms for agentic AI systems

Francesca Gomez

17 days ago

Progress update

What progress have you made since your last update?

We began by drafting a methodology for agent-centred threat assessments that considered an agent’s autonomy, action surfaces, tool access, affordances, capabilities, propensities, and deployment scale. The goal was to recommend safeguards based on a theoretical analysis.
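
As a rough sketch of what one such assessment record might capture (the structure, field names, and screening rule below are hypothetical illustrations, not our published methodology):

```python
from dataclasses import dataclass

@dataclass
class AgentThreatAssessment:
    """Illustrative record of the dimensions the draft methodology
    considered; the scoring scheme here is a hypothetical example."""
    agent_name: str
    autonomy_level: int          # e.g. 0 = human-in-the-loop ... 3 = fully autonomous
    action_surfaces: list[str]   # e.g. ["email", "shell", "web"]
    tool_access: list[str]       # tools the agent can invoke
    affordances: list[str]       # what those tools let the agent do in the world
    capabilities: list[str]      # relevant model capabilities
    propensities: list[str]      # observed behavioural tendencies
    deployment_scale: int        # number of concurrent instances

    def risk_flags(self) -> list[str]:
        """Naive screening rule for illustration: flag high autonomy
        combined with a broad action surface, or very wide deployment."""
        flags = []
        if self.autonomy_level >= 2 and len(self.action_surfaces) > 2:
            flags.append("high autonomy with broad action surface")
        if self.deployment_scale > 100:
            flags.append("large-scale deployment")
        return flags
```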

However, the release of Anthropic’s Agentic Misalignment study (Lynch et al., 2025) highlighted two key points:

  1. Misaligned behaviour only emerged when agents experienced goal conflicts or autonomy threats; our draft methodology did not incorporate this key consideration of 'triggers'.

  2. Without empirical testing, recommended safeguards might not actually prevent such behaviours in practice.

In response, we updated our approach. Rather than only theorising about safeguards, we first designed a set of preventative mitigations for agentic misalignment, inspired by insider risk controls, and tested them on the original Agentic Misalignment scenario to understand their real-world effectiveness.
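
To illustrate the shape of such a preventative mitigation (a minimal sketch assuming a prompt-level control; the reminder text and function are hypothetical stand-ins, not the controls we actually tested):

```python
# Hypothetical prompt-level control inspired by insider risk management:
# remind the agent of policy and escalation routes before it acts.
POLICY_REMINDER = (
    "Before taking any action, check whether it conflicts with company "
    "policy or repurposes information obtained for another task. If it "
    "does, stop and escalate to a human reviewer instead of acting."
)

def apply_mitigation(system_prompt: str, mitigation: str = POLICY_REMINDER) -> str:
    """Prepend the control to the agent's instructions for a trial run."""
    return f"{mitigation}\n\n{system_prompt}"
```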

Part 1 – Empirical Study
We conducted an empirical evaluation of five mitigation types across 10 models and 66,000 trials, using Anthropic’s original framework. We are now finalising the written report, which will be published in the next 2–3 weeks.
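
The aggregation step might look roughly like this (a self-contained sketch; run_trial, the model and mitigation lists, and the trial counts are stand-ins for the real harness and study design):

```python
import random
from itertools import product

MODELS = ["model_a", "model_b"]            # placeholders; the study covered 10 models
MITIGATIONS = ["none", "policy_reminder"]  # placeholders; 5 mitigation types were tested

def run_trial(model: str, mitigation: str) -> str:
    """Stand-in for one scenario run through the evaluation harness."""
    return random.choice(["aligned", "misaligned"])

def misalignment_rate(model: str, mitigation: str, n_trials: int) -> float:
    """Fraction of trials in which the agent took the misaligned action."""
    hits = sum(run_trial(model, mitigation) == "misaligned" for _ in range(n_trials))
    return hits / n_trials

# One rate per (model, mitigation) cell, comparable against the "none" baseline.
results = {(m, mit): misalignment_rate(m, mit, n_trials=100)
           for m, mit in product(MODELS, MITIGATIONS)}
```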

Part 2 – Practical Outputs
To make the results directly useful to AI safety researchers and developers, and to honour the original aims of this project, we will also release:

  • A repeatable methodology for identifying threat models and likely pathways to misalignment harm in agentic AI systems.

  • A mapping of existing safeguards, covering both preventative (empirically tested) and detective (monitoring) controls.

  • An analysis of safeguard gaps and potential harm areas given today’s available control portfolio.

What are your next steps?

Publish the empirical testing results (as a research paper) in the next 2–3 weeks. We will also release our code so others can replicate the study or test different mitigations.

The Part 2 outputs will follow afterwards, likely in blog post format.

Is there anything others could help you with?

Feedback once the report is published, and sharing it with others interested in this topic.

Develop technical framework for human control mechanisms for agentic AI systems

Francesca Gomez

4 months ago

I submitted the information to withdraw funds at the end of April, but I'm not sure how to track the progress of this. I imagine that being based in the UK means there are some extra checks; let me know if you need any additional information from me to process it.

Develop technical framework for human control mechanisms for agentic AI systems

Francesca Gomez

5 months ago

Thank you! Ryan was introduced to the project while he was advising on the Catalyze Impact AI Safety incubator. Looking forward to kicking off now!

Transactions

For | Date | Type | Amount
Manifund Bank | 4 months ago | withdraw | 10030
Develop technical framework for human control mechanisms for agentic AI systems | 5 months ago | project donation | +30
Develop technical framework for human control mechanisms for agentic AI systems | 5 months ago | project donation | +10000