Dan Hendrycks

@hendrycks

regrantor

Executive Director of the Center for AI Safety

danhendrycks.com/

$50 total balance
$50 charity balance
$0 cash balance

$0 in pending offers

Projects

Research Staff for AI Safety Research Projects

Outgoing donations

AI Safety & Society: $250,000 (3 months ago)
Removing Hazardous Knowledge from AIs: $190,000 (over 1 year ago)

Comments

Research Staff for AI Safety Research Projects

Dan Hendrycks

3 months ago

Progress update

What progress have you made since your last update?

  • Published Tamper-Resistant Safeguards:
    https://arxiv.org/abs/2408.00761
    https://www.tamper-resistant-safeguards.com/

  • Published Circuit Breakers:
    https://arxiv.org/abs/2406.04313

  • Published Safetywashing, a meta-analysis of AI safety benchmarks:
    https://arxiv.org/pdf/2407.21792
    https://www.safetywashing.ai/

  • Other projects are ongoing. New projects include work on honesty and deception, on teaching AIs to follow the law, and more.

What are your next steps?

  • Publish the superintelligence evals and virology benchmarks.

AI Safety & Society

Dan Hendrycks

5 months ago

Main points in favor of this grant

  • The premise and theory of change are strong. There’s a need for such a platform.

  • The team has journalism and editorial experience.

  • CAIS has strong networks with academics and researchers; the project has already seen interest from a variety of external contributors.

Donor's main reservations

  • Sustaining a high rate of high-quality contributions may be challenging.

  • May require significant marketing to build readership and visibility.

Process for deciding amount

I reviewed a budget.

Conflicts of interest

I will be advising this project.

Funding a Consortium of Legal Scholars and Academics doing AI Safety

Dan Hendrycks

over 1 year ago

Main points in favor of this grant

Interest in AI and AI safety is quite high within legal academia. A symposium focused on AI safety at a top law review would greatly increase AI safety’s prestige in the field. The AI safety consortium includes many academics who understand the norms of legal academia and would be good at field-building.

Donor's main reservations

As mentioned above, it’s unclear how receptive the law reviews will be. Given that interest in AI is surging, now is the best time to pitch such a symposium. However, acceptance is not guaranteed.

Process for deciding amount

The consortium provided an estimate of the symposium costs, which I reviewed and approved; it was broadly in line with workshop costs in other fields.

Conflicts of interest

None.


Modeling Automated AI R&D

Dan Hendrycks

over 1 year ago

Main points in favor of this grant

Although AI capabilities are progressing quickly toward human or superhuman levels, the dynamics of an intelligence explosion or automated AI R&D have not been thoroughly explored. If an intelligence explosion were to happen, humans would by default quickly lose control of the process unless precautions had been set up beforehand.

Paul Salmon has published highly impactful work in safety engineering and is familiar with the kind of systems analysis this research requires. He is also interested in AI safety, having previously published on AGI risks.

Donor's main reservations

Whether agent-based modeling is the right approach remains to be seen.

Process for deciding amount

The amount regranted was comparable to other grants in the field.

Conflicts of interest

I will be helping with this project as well.

Removing Hazardous Knowledge from AIs

Dan Hendrycks

over 1 year ago

Main points in favor of this grant

Removing hazardous capabilities from models would greatly help reduce AI x-risk from malicious use and unilateral actors. Alex is a researcher with a strong track record who is interested in AI safety and has done previous AI safety research. The timing is right: NIST has been tasked by the recent Executive Order with helping develop standards and regulations for “dual-use foundation models.” Research done now has a much higher likelihood of helping shape regulation.

Donor's main reservations

This is a relatively complex project with many moving parts. It’s crucial that the project is executed well on a relatively short timeline.

Process for deciding amount

The researchers estimated that this was the total amount needed for the dataset. I reviewed and approved the budget.

Conflicts of interest

I will be helping advise this project.

Transactions

For | Date | Type | Amount
Research Staff for AI Safety Research Projects | 2 months ago | project donation | +$25
Research Staff for AI Safety Research Projects | 2 months ago | project donation | +$25
Manifund Bank | 3 months ago | withdraw | -$26,760
AI Safety & Society | 3 months ago | project donation | -$250,000
Research Staff for AI Safety Research Projects | 9 months ago | project donation | +$150
<70d11eb1-2a41-4bd2-9376-b5a7c969e541> | 10 months ago | profile donation | +$100
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$500
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$500
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$500
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$25,000
Manifund Bank | about 1 year ago | deposit | +$250,000
Manifund Bank | about 1 year ago | return bank funds | -$210,000
<d9f98ed4-417f-431f-ae33-81b538b1c3dd> | about 1 year ago | profile donation | +$10
Removing Hazardous Knowledge from AIs | over 1 year ago | project donation | -$190,000
Manifund Bank | almost 2 years ago | deposit | +$400,000