$19,500 total balance

$19,500 charity balance

$0 cash balance

$0 in pending offers

About Me

I aim to fund projects in two topics:
1. Evals in general, see e.g. here: https://x.com/MariusHobbhahn/status/1903467097260454330
2. Scheming related work. This can include evals, demos, model organisms, and more.

Broadly speaking, I intend to fund
1. Projects between $1k and $20k USD (higher in rare exceptions)
2. Small independent research projects, top-ups of existing projects, and small exploration grants.
3. Salary and compute if needed

Examples of projects I might fund:
- 3-month exploration grant for a Master's student to build evals using Inspect at 10h a week for $3k with the aim of open sourcing them.
- 2-month full-time independent research project for a model organism that extends any paper that Owain Evans published in the last 2 years.
- $5k compute top-up for an existing project. Quick turnaround needed because of a NeurIPS deadline.

If you want to apply,
- Ping me at marius.hobbhahn@gmail.com before you make a Manifund page to check if I'm interested.
- Or make a Manifund page first and then ping me at the above email.

I don't intend to spend a lot of time per grant (<5 mins; please make sure your application is very concise). It might also take ~1 week until I respond.

Outgoing donations

Tarbell Center for AI Journalism

$10000

about 1 month ago

Awareness-conditioned emergent misalignment

$6500

2 months ago

Lightcone Infrastructure

$25000

7 months ago

Inspect Evals

$10000

9 months ago

3 months full-time contributing software to Inspect

$10000

10 months ago

Benchmarking and comparing different evaluation awareness metrics

$3000

11 months ago

3 months full-time contributing software to Inspect

$10000

12 months ago

Tooling + Model Orgs for CoT Faithfulness Research

$3000

12 months ago

Demonstration of LLMs deceiving and getting out of a sandbox

$3000

about 1 year ago

Split Personality Training

$2000

about 1 year ago

Comments

Awareness-conditioned emergent misalignment

Marius Hobbhahn

2 months ago

I think this is a reasonable project

Lightcone Infrastructure

Marius Hobbhahn

7 months ago

Donating ~half of my remaining regranting funding ($25k). While this is not in the category of things I wanted to fund with the regranting pot (i.e. individuals working on scheming research), I think Lighthaven is worth supporting.

My main evidence is that my MATS scholars really like it and when I've been at Lighthaven for conferences or other meetings I thought it was a great venue.

One caveat is that I think Lighthaven should be funded through institutional donors rather than many small grants, due to the obvious overhead considerations and large funding amounts. In the current situation, though, individual donations seem obviously better than nothing.

Inspect Evals

Marius Hobbhahn

9 months ago

I think it's great that more people work on Inspect, and I've seen a few evals either competently ported to Inspect or significantly improved.

I'm happy to throw in 10k of my regranting money as a sign of my support, but I think the value they provide is clearly big enough that a large funder should support them with hundreds of thousands or millions of dollars.

Tarbell Center for AI Journalism

Marius Hobbhahn

10 months ago

I'm recommending a $10k donation as a regrantor.

Multiple journalists in the Tarbell program have previously reached out to me for news articles and, while they are often less experienced, they often ask better questions and write better articles than their senior colleagues.

By better, I specifically don't mean "they represent my opinion". I mean:
1. They have read the paper, and the questions are well-prepared.
2. The resulting texts are typically more accurate and have far fewer factual errors than the texts from my average journalist experience.
3. The questions are often critical and really try to understand the limitations of the work rather than being softballs.

In general, all Tarbell journalists I have worked with so far seemed to have high integrity with a strong desire to write an accurate and interesting article for their readership rather than produce an attention-grabbing headline.

3 months full-time contributing software to Inspect

Marius Hobbhahn

10 months ago

Adding another 10k after the first month has gone well. Anthony is making fast progress on Inspect. We've discussed plans for longer-term funding, and I'll be bridging the next month until other funding is confirmed.

Benchmarking and comparing different evaluation awareness metrics

Marius Hobbhahn

11 months ago

Offering 3k to get this started. I think the project makes sense, but I have no further evidence about the project or authors beyond what is listed here.

This donation is not a commitment to follow-up funding, even if reasonable progress is made.

3 months full-time contributing software to Inspect

Marius Hobbhahn

12 months ago

Anthony has done great work on other open-source libraries for AI safety (SAELens, Neuronpedia, and SAEDashboard), and I'm excited that he wants to explore building tools for evals.

I've talked to him about his plans and think it's an obviously good funding opportunity.

Tooling + Model Orgs for CoT Faithfulness Research

Marius Hobbhahn

12 months ago

I think the topic is important. I have neither done a check on the person running it nor read a more detailed project description than is visible in the grant itself.

If there are interesting results from this funding round, I might be willing to fund a larger follow-up round.

Demonstration of LLMs deceiving and getting out of a sandbox

Marius Hobbhahn

about 1 year ago

I think realistic demos of autonomous worrying behavior are quite important and there aren't many good examples yet.

I'm not sure whether this particular project will meet my bar for realism, but I think it's worth giving it a try. I have not vetted the project in detail. Impossible tasks as a source of deception have been explored multiple times in the literature. I think the main addition from a project like this would come from making highly realistic agentic scenarios.

Split Personality Training

Marius Hobbhahn

about 1 year ago

Funding with $2000 to get the project off the ground.

I have talked to Florian about this project during the last MATS cohort presentation day. I felt like his conceptual considerations were good, and the motivation makes sense.

I have no clear evidence for or against his ability to execute projects quickly, which is why I'm keeping it at $2k.

I might consider more funding if there are good early results or other strong evidence of progress that I can easily verify. I'd recommend trying to sprint to a 4-6 week MVP and publishing or at least writing up and privately sharing the results.

Transactions

For	Date	Type	Amount
Tarbell Center for AI Journalism	about 1 month ago	project donation	10000
Awareness-conditioned emergent misalignment	2 months ago	project donation	6500
<9006d59f-b53b-44e6-bcbf-d0d277438990>	3 months ago	profile donation	+2000
Lightcone Infrastructure	7 months ago	project donation	25000
Inspect Evals	9 months ago	project donation	10000
3 months full-time contributing software to Inspect	10 months ago	project donation	10000
Benchmarking and comparing different evaluation awareness metrics	11 months ago	project donation	3000
3 months full-time contributing software to Inspect	12 months ago	project donation	10000
Tooling + Model Orgs for CoT Faithfulness Research	12 months ago	project donation	3000
Demonstration of LLMs deceiving and getting out of a sandbox	about 1 year ago	project donation	3000
Split Personality Training	about 1 year ago	project donation	2000
Manifund Bank	over 1 year ago	deposit	+100000