Marius Hobbhahn

@mariushobbhahn

regrantor

I'm the CEO of Apollo Research, an evaluations research organization focused on frontier AI safety risks like scheming. Previously, I did a PhD in ML and worked as a Fellow for Epoch.

https://www.linkedin.com/in/marius-hobbhahn-128927175/

$82,000 total balance
$79,000 charity balance
$0 cash balance

$3,000 in pending offers

About Me

I aim to fund projects in two topics:
1. Evals in general, see e.g. here: https://x.com/MariusHobbhahn/status/1903467097260454330
2. Scheming related work. This can include evals, demos, model organisms, and more.

Broadly speaking, I intend to fund
1. Projects between $1k and $10k (higher in rare exceptions)
2. Small independent research projects, top-ups of existing projects, and small exploration grants.
3. Salary and compute if needed

Examples of projects I might fund:
- 3-month exploration grant for a Master's student to build evals using Inspect at 10h a week for $3k with the aim of open sourcing them.
- 2-month full-time independent research project for a model organism that extends any paper that Owain Evans published in the last 2 years.
- 5k compute top-up for an existing project. Quick turnaround needed because of NeurIPS deadline.

If you want to apply, you can either
- Fill out this form and wait for my response: https://forms.gle/DcTbjqdZNjQThyRPA
- Create a Manifund page first and ping me via the above form

I don't intend to spend a lot of time per grant (<5 mins; please make sure your application is very concise). It might also take ~1 week until I respond.

Outgoing donations

- Benchmarking and comparing different evaluation awareness metrics: $3000 (pending)
- 1-month full-time contributing software to Inspect: $10000 (16 days ago)
- Tooling + Model Orgs for CoT Faithfulness Research: $3000 (17 days ago)
- Demonstration of LLMs deceiving and getting out of a sandbox: $3000 (2 months ago)
- Split Personality Training: $2000 (3 months ago)

Comments

Benchmarking and comparing different evaluation awareness metrics

Marius Hobbhahn

13 days ago

Offering $3k to get this started. I think the project makes sense, but I have no evidence about the project or its authors beyond what is listed here.

This donation is not a commitment to follow-up funding, even if reasonable progress is made.

1-month full-time contributing software to Inspect

Marius Hobbhahn

18 days ago

Anthony has done great work on other OS libraries for AI safety (SAELens, neuronpedia, and SAEDashboard) and I'm excited that he wants to explore building tools for evals.

I've talked to him about his plans and think it's an obviously good funding opportunity.

Tooling + Model Orgs for CoT Faithfulness Research

Marius Hobbhahn

18 days ago

I think the topic is important. I have neither done a check on the person running it nor read a more detailed project description than is visible in the grant itself.

If there are interesting results from this funding round, I might be willing to donate a larger follow-up round.

Demonstration of LLMs deceiving and getting out of a sandbox

Marius Hobbhahn

2 months ago

I think realistic demos of autonomous worrying behavior are quite important and there aren't many good examples yet.

I'm not sure whether this particular project will meet my bar for realism, but I think it's worth giving it a try. I have not vetted the project in detail. Impossible tasks as a source of deception have been explored multiple times in the literature. I think the main addition from a project like this would come from making highly realistic agentic scenarios.

Split Personality Training

Marius Hobbhahn

3 months ago

Funding with $2000 to get the project off the ground.

I talked to Florian about this project during the last MATS cohort presentation day. I felt that his conceptual considerations were good, and the motivation makes sense.

I have no clear evidence for or against his ability to execute projects quickly, which is why I'm keeping it at $2k.

I might consider more funding if there are good early results or other strong evidence of progress that I can easily verify. I'd recommend trying to sprint to a 4-6 week MVP and publishing or at least writing up and privately sharing the results.

Transactions

For | Date | Type | Amount
1-month full-time contributing software to Inspect | 16 days ago | project donation | 10000
Tooling + Model Orgs for CoT Faithfulness Research | 17 days ago | project donation | 3000
Demonstration of LLMs deceiving and getting out of a sandbox | 2 months ago | project donation | 3000
Split Personality Training | 3 months ago | project donation | 2000
Manifund Bank | 4 months ago | deposit | +100000