Dan Hendrycks

@hendrycks

regrantor

Executive Director of the Center for AI Safety

danhendrycks.com/

$50 total balance
$50 charity balance
$0 cash balance

$0 in pending offers

Projects

Research Staff for AI Safety Research Projects

Outgoing donations

AI Safety & Society: $250,000 (3 months ago)
Removing Hazardous Knowledge from AIs: $190,000 (over 1 year ago)

Comments

Research Staff for AI Safety Research Projects

Dan Hendrycks

3 months ago

Progress update

What progress have you made since your last update?

  • Published Tamper-Resistant Safeguards:
    https://arxiv.org/abs/2408.00761
    https://www.tamper-resistant-safeguards.com/

  • Published Circuit Breakers:
    https://arxiv.org/abs/2406.04313

  • Published Safetywashing, a meta-analysis of AI safety benchmarks:
    https://arxiv.org/pdf/2407.21792
    https://www.safetywashing.ai/

  • Other projects are ongoing. New projects include work on honesty and deception, on teaching AIs to follow the law, and more.

What are your next steps?

  • Publish the superintelligence evals and virology benchmarks.

AI Safety & Society

Dan Hendrycks

5 months ago

Main points in favor of this grant

  • The premise and theory of change are strong. There’s a need for such a platform.

  • The team has journalism and editorial experience.

  • CAIS has strong networks with academics and researchers; the project has already seen interest from a variety of external contributors.

Donor's main reservations

  • Sustaining a high rate of high-quality contributions may be challenging.

  • May require significant marketing to build readership and visibility.

Process for deciding amount

I reviewed a budget.

Conflicts of interest

I will be advising this project.

Funding a Consortium of Legal Scholars and Academics doing AI Safety

Dan Hendrycks

over 1 year ago

Main points in favor of this grant

Interest in AI and AI safety is quite high within legal academia. A symposium focused on AI safety at a top law review would greatly increase AI safety’s prestige in the field. The AI safety consortium includes many academics who understand the norms of legal academia and would be good at field-building.

Donor's main reservations

As mentioned above, it’s unclear how receptive the law reviews will be. Given that interest in AI is surging, now is the best time to pitch such a symposium. However, acceptance is not guaranteed.

Process for deciding amount

The consortium provided an estimate of the symposium costs, which I reviewed and approved; it was broadly in line with workshop costs in other fields.

Conflicts of interest

None.


Modeling Automated AI R&D

Dan Hendrycks

over 1 year ago

Main points in favor of this grant

Although AI capabilities are progressing quickly toward human or superhuman levels, the dynamics of an intelligence explosion or automated AI R&D have not been thoroughly explored. If an intelligence explosion were to happen, humans would by default quickly lose control of the process unless precautions had been set up beforehand.

Paul Salmon has published highly impactful work in safety engineering and is familiar with the kind of systems analysis this research requires. He is also interested in AI safety, having previously published on AGI risks.

Donor's main reservations

Whether agent-based modeling is the right approach remains to be seen.

Process for deciding amount

The amount regranted was comparable to other grants in the field.

Conflicts of interest

I will be helping with this project as well.

Removing Hazardous Knowledge from AIs

Dan Hendrycks

over 1 year ago

Main points in favor of this grant

Removing hazardous capabilities from models would greatly help reduce AI x-risk from malicious use and unilateral actors. Alex is a researcher with a strong track record who is interested in AI safety and has done previous AI safety research. The timing is right: NIST has been tasked by the recent Executive Order with helping develop standards and regulations for “dual-use foundation models.” Research done now has a much higher likelihood of helping shape regulation.

Donor's main reservations

This is a relatively complex project with many moving parts. It’s crucial that the project is executed well on a relatively short timeline.

Process for deciding amount

The researchers estimated that this was the total amount needed for the dataset. I reviewed and approved the budget.

Conflicts of interest

I will be helping advise this project.

Transactions

For | Date | Type | Amount
Research Staff for AI Safety Research Projects | 2 months ago | project donation | +$25
Research Staff for AI Safety Research Projects | 2 months ago | project donation | +$25
Manifund Bank | 3 months ago | withdraw | -$26,760
AI Safety & Society | 3 months ago | project donation | -$250,000
Research Staff for AI Safety Research Projects | 9 months ago | project donation | +$150
<70d11eb1-2a41-4bd2-9376-b5a7c969e541> | 10 months ago | profile donation | +$100
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$500
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$500
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$500
Research Staff for AI Safety Research Projects | 11 months ago | project donation | +$25,000
Manifund Bank | about 1 year ago | deposit | +$250,000
Manifund Bank | about 1 year ago | return bank funds | -$210,000
<d9f98ed4-417f-431f-ae33-81b538b1c3dd> | about 1 year ago | profile donation | +$10
Removing Hazardous Knowledge from AIs | over 1 year ago | project donation | -$190,000
Manifund Bank | almost 2 years ago | deposit | +$400,000