This project aims to develop Animal Charity Evaluators-style cost-effectiveness estimates for AI safety organizations. I want to gather and analyze data on key metrics such as people impacted, research output (e.g., papers, citations), and funding received.
As a side product, I will get a list of papers published by AI safety organizations and of grants distributed by SFF and EAIF. While EAIF already provides an easy-to-use interface for browsing its grants database, SFF doesn't, and there is no database of AIS papers AFAIK.
My plan:
Data Collection:
- Gathering publicly available data from organization websites and published impact analyses.
- Scraping websites of organizations listed on the AI Safety Map to compile a comprehensive list of research papers.
- Using the Semantic Scholar API to gather citation counts for these papers (a minimal sketch follows this list).
- Scraping grant databases (SFF, EAIF, ACX) to include grant information.
- Publishing the collected data as a separate, searchable website (hosted as a static website on Vercel, so free of charge).
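For illustration, here is a minimal sketch of the citation-count step against the public Semantic Scholar Graph API; the paper title below is a placeholder, and the real list would come from the scraped organization sites:

```python
import requests

# Semantic Scholar Graph API paper-search endpoint (public; an API key
# raises the rate limit but is not required for light use).
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"

def citation_count(title: str) -> int | None:
    """Return the citation count of the best title match, or None if no match."""
    resp = requests.get(
        SEARCH_URL,
        params={"query": title, "fields": "title,citationCount", "limit": 1},
        timeout=30,
    )
    resp.raise_for_status()
    data = resp.json().get("data", [])
    return data[0]["citationCount"] if data else None

# Placeholder title; the real list would come from the scraped org websites.
print(citation_count("Risks from Learned Optimization in Advanced Machine Learning Systems"))
```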
Engagement with Organizations:
- Emailing organizations to request data on people impacted (e.g., participants in their programs).
- Collecting additional relevant data where feasible, such as social media interactions.
Analysis:
- Comparing metrics across organizations operating under similar theories of change.
Dissemination:
- Publishing findings on LessWrong and the EA Forum, and engaging with the comments.
$500: Create a rough CSV file compiling publicly available data, including research papers, citations, and grants. A minimal post with analysis; no outreach to organizations for additional data.
$2,000: Publish the database in an accessible format and conduct outreach to organizations to gather data on participants and additional metrics.
$3,000: Go beyond the 80/20 principle, with heavy engagement in the comments, taking requests...
Re: data analysis: I helped Condor Camp analyze their pre- and post-camp surveys and EA Denmark analyze their yearly community survey.
Re: writing posts: I have one post on the EA Forum with 100 karma.
Jason
23 days ago
I like the idea. Like Austin, I am "unsure if Mikolaj appreciates the scope of what he's trying to take on," but I wanted to modestly signal support for the effort.
Austin Chen
24 days ago
Quick thoughts:
I agree with Ryan that public impact analysis for AI safety orgs would be good (or at least, I would want to skim more of them); I'm very unsure if Mikolaj is well-placed to do that, since it seems like the kind of analysis that would benefit a lot from technical AI safety expertise, grantmaking experience, and/or insider knowledge of the TAIS landscape. I'm also unsure if Mikolaj appreciates the scope of what he's trying to take on.
That said, I think the cost of Mikolaj trying to create one is pretty low, and generally encourage people trying to do things! I'm making a small donation to encourage this.
I would be more excited if somebody else with a known track record (like Nuno or Larks) was helping with this project; though on the flip side, the overhead costs of collaboration are real, so idk if this is actually a good idea.
I also don't know if cost-effectiveness is a particularly good metric for TAIS. Research fields don't typically use this kind of metric; research output is much trickier to measure than outcomes in something like global health.
Because of this, I'd encourage Mikolaj to do some of this work in public, eg outline his thoughts on how to do a cost-effectiveness analysis and to do an example one, and post it so that people can critique the methodology -- before going off and collecting all the data for the final report.
Also: it's possible that much of the value of this project would just be doing the cost effectiveness analysis for one or two orgs where people have a lot of uncertainty!
Ryan Kidd
25 days ago
I want to see more public impact analysis of AI safety organizations. To my knowledge, there has been very little of this to date.
Mikolaj has some experience with impact analysis, but I suspect that this project will be quite hard and might need more funding.
I decided to partly fund this project to give other donors room to contribute and mitigate my conflict of interest.
I am Co-Director at MATS, a Board Member at LISA, and advise several AI safety projects, including Catalyze Impact and AI Safety ANZ. I often work out of FAR Labs. As a Manifund Regrantor, I regularly fund AI safety projects, the success of which is probably a determining factor in my future status as a Regrantor. To mitigate these obvious conflicts of interest, I ask that Mikolaj share his key results in a public database and post, where any discrepancies can be identified.
Mikolaj Kniejski
24 days ago
@RyanKidd Thanks! I agree, I might not be the best person to do this given my limited experience. I searched the EA Forum and LW and only found these CE estimates:
- CAIS
- Arb's AISC estimate
- Nuño Sempere's work on longtermist orgs
- MATS' self-assessment
Which is indeed limited. Given this, I think it's worth doing anyway, even just to put something out and let other people build on and improve it. Worst-case scenario, this ends up being something along the lines of "How good are AIS orgs at communicating their findings (measured by citations)?"
To improve quality I will try to get feedback from some organizations and EAs.
Austin Chen
24 days ago
@Mikolaj-Kniejski One more assessment you might want to track is Gavin Leech's shallow review of TAIS; I expect this will overlap somewhat with yours. It seems like you're aiming to produce cost-effectiveness numbers whereas Gavin is doing more of an overview, but either way it might be worth reaching out and seeing if you can look at an advance version, to deduplicate work or share results.
Mikolaj Kniejski
26 days ago
My rough plan is to break the theories of change pursued in AIS down into:
- Evals
- Field-building
- Governance
- Research
And measure impact in each category separately.
Evals can be measured by the quality and number of evals and their relevance to x-risks. It seems pretty straightforward to differentiate a bad eval org from a good one: engagement with major labs, a large number of evals, and a clear relation to existential risks.
Field-building—having a lot of participants who do awesome things after the project.
Research—I argue that citation count is a good proxy for a paper's impact. It's easy to measure and tracks how much engagement a paper received, at least in the absence of targeted work to bring the paper to the attention of key decision-makers. (A toy sketch of the comparison I have in mind follows below.)
I'm not sure how to think about governance.
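To make the research comparison concrete, here is a toy sketch of the citations-per-dollar number this implies, assuming citation counts from the paper database and funding totals from the grant scrape; all names and figures below are invented placeholders:

```python
# Toy citations-per-dollar comparison. The org names and all figures are
# invented placeholders, not real data about any organization; real funding
# numbers would come from the scraped grant databases.
orgs = {
    "Org A": {"citations": 1200, "funding_usd": 2_000_000},
    "Org B": {"citations": 300, "funding_usd": 250_000},
}

# Rank organizations by citations per dollar of funding received.
for name, d in sorted(
    orgs.items(),
    key=lambda kv: kv[1]["citations"] / kv[1]["funding_usd"],
    reverse=True,
):
    per_million = d["citations"] / (d["funding_usd"] / 1_000_000)
    print(f"{name}: {per_million:,.0f} citations per $1M of funding")
```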