AI-Plans.com is a contributable compendium of alignment plans and the criticisms against them. We currently have over 100 alignment plans on the site and are in the process of adding more. Several alignment researchers, including Tom Everitt, Dan Hendrycks, and Stuart Russell, are interested in the site, and other researchers are already finding useful papers there and submitting plans.
We are hosting a critique-a-thon starting on the 1st of August, which will last for 10 days, with a prize fund of $500.
The goal of this project is to encourage high-quality critiques of alignment plans on AI-Plans.com. Critiques will be judged on accuracy, precision, communication, evidence, and novelty by me, members of the team, and a couple of alignment researchers. The top three critiques will receive awards from the prize fund, with two honorable mentions also receiving a prize.
The funding will be used to award the top critiques from the critique-a-thon. This will incentivize high-quality critiques and help improve the content on AI-Plans.com. The prize fund will be split as follows:
1st place: $200
2nd place: $125
3rd place: $75
Honorable mention 1: $50
Honorable mention 2: $50
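As a quick sanity check on the split above, here is a minimal Python sketch confirming that the five awards add up to the requested $500. The variable names are illustrative only and are not part of the site or the event tooling.

```python
# Prize split as listed in the proposal (amounts in US dollars).
# The names `prizes` and `PRIZE_FUND` are illustrative assumptions.
prizes = {
    "1st place": 200,
    "2nd place": 125,
    "3rd place": 75,
    "Honorable mention 1": 50,
    "Honorable mention 2": 50,
}

PRIZE_FUND = 500  # total requested in the proposal

total = sum(prizes.values())

# The split should use the whole fund, no more and no less.
assert total == PRIZE_FUND, f"split totals ${total}, expected ${PRIZE_FUND}"
print(f"Total awarded: ${total} of ${PRIZE_FUND}")
```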
Our team includes Jonathan Ng, an alignment researcher, who will be reviewing some of the critiques. Other members of the team include an expert in QA with many years of experience and Azai, who has a strong background in mathematics. We also have a consultant who is CompTIA certified, highly skilled in cybersecurity and red-teaming, and is also a professor.
Dr Peter S. Park, an MIT postdoc in the Tegmark lab, has agreed to be a judge.
We have successfully launched AI-Plans.com in beta and have already added over 100 alignment plans to the site.
I have helped get a start-up off the ground from nothing, going door to door, to the point of having 3 branches, thousands of customers, and schools requesting internships. During the process I saw a lot of other start-ups with much more qualified people fail completely, and learnt what it takes to fail (overconfidence in the product, lack of outreach and market research, laziness, among many other things) and what it takes to succeed: determination and a sharp, user-focused mind.
I have been assisting Stake Out AI with narrative-building and proofreading and helping out at VAISU as well. I'm confident in my ability to break down the reasons an idea can and will fail and then find ways to reach into it and extract something valuable.
If this project fails, it could be due to a lack of participation or low-quality submissions. This could result in less content being added to AI-Plans.com and slower progress towards our goal of creating a comprehensive compendium of alignment plans and criticisms. Although this is the most likely form of failure, it is not very likely, since we already had 10+ participants on the day we announced the critique-a-thon.
This project and AI-Plans.com are currently unfunded passion projects. The requested $500 for prizes would be the first and only external funding for efforts on the site thus far.
Austin Chen
over 1 year ago
Approving as this project is within our scope and doesn't seem likely to cause harm. I appreciate Kabir's energy and will be curious to see what the retrospective on the event shows!
Kabir Kumar
over 1 year ago
Awesome!! Thank you very much!! You might be interested to know that the event has not only produced many very well thought out critiques and got more people involved and interested in AI Safety, especially in what actually goes into making a plan robust, but in the first two days we also produced an extremely useful document: https://docs.google.com/document/d/1GQbAnRPvONF8TdQtQuga4WOLk58iNh3tTdsVyGpA4AE/edit?usp=sharing Multiple people have talked about how useful and easy to use this document is, often expressing confusion as to why no one has made something like it before!
Kabir Kumar
over 1 year ago
Great news! Dr Peter S. Park, an AI Safety postdoc at the Tegmark lab, has agreed to be a judge!
Kabir Kumar
over 1 year ago
Excited to say that we have 20 participants for the critique-a-thon so far!
Kabir Kumar
over 1 year ago
1st to 2nd: Making a list of all the ways alignment plans could go wrong.
We'll put together a master list of potential "vulnerabilities" based on existing research and our own ideas. This will give us a checklist to use when evaluating plans.
3rd to 4th: Matching vulnerabilities to plans.
Everyone will pick a few alignment plans to look at more closely. For each plan, you'll label up to 5 vulnerabilities you think could apply and point out evidence from the plan that supports them. Include your level of confidence in each label as a percentage.
5th to 8th: Arguing for and against the vulnerabilities.
You'll team up with another participant and take turns, with one defending and the other questioning the vulnerabilities suggested in the matching step (3rd to 4th). This debate format will help strengthen the critiques. We'll swap sides on the 6th and rotate team members on the 8th.
9th to 10th: Providing feedback on each other's arguments.
Review your partner's reasoning for and against the vulnerability labels. Point out any faulty logic, questionable assumptions, lack of evidence, etc. to improve the critiques.
After the 10th: One week of judging.
We'll evaluate submissions and award prizes!
The organizers and outside experts will judge all the critiques based on accuracy, evidence, insight, and communication. Cash prizes will go to the standout critiques that demonstrate top-notch critical analysis.
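To make the labeling step (3rd to 4th) concrete, here is a rough Python sketch of how one participant's labels for a plan might be recorded. This is only an illustration under stated assumptions: the class names, field names, and example values are hypothetical and do not describe the actual format used on AI-Plans.com. The only rules taken from the schedule are the cap of 5 labels per plan, the supporting evidence, and the confidence expressed as a percentage.

```python
from dataclasses import dataclass, field

MAX_LABELS_PER_PLAN = 5  # "up to 5 vulnerabilities" per plan


@dataclass
class VulnerabilityLabel:
    """One vulnerability a participant thinks applies to a plan (hypothetical schema)."""
    vulnerability: str     # entry from the master list of vulnerabilities
    evidence: str          # passage from the plan that supports the label
    confidence_pct: float  # participant's confidence, as a percentage

    def __post_init__(self):
        if not 0 <= self.confidence_pct <= 100:
            raise ValueError("confidence must be a percentage between 0 and 100")


@dataclass
class PlanReview:
    """All labels one participant assigns to one alignment plan (hypothetical schema)."""
    plan_title: str
    labels: list = field(default_factory=list)

    def add_label(self, label: VulnerabilityLabel) -> None:
        # Enforce the cap of 5 labels per plan.
        if len(self.labels) >= MAX_LABELS_PER_PLAN:
            raise ValueError(f"at most {MAX_LABELS_PER_PLAN} labels per plan")
        self.labels.append(label)


# Example usage with made-up values.
review = PlanReview("Example alignment plan")
review.add_label(
    VulnerabilityLabel(
        vulnerability="Example vulnerability from the master list",
        evidence="Quoted passage from the plan",
        confidence_pct=70,
    )
)
print(len(review.labels), "label(s) recorded for", review.plan_title)
```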
Anton Makiievskyi
over 1 year ago
I think it would make sense to have a higher goal for prizes actually
Kabir Kumar
over 1 year ago
We've already got 13+ attendees with no prize at all and I want to maximize the chances of there being a prize.
Kabir Kumar
over 1 year ago
But if folks want to add more, I'd be happy to increase the prize pool. Though, at some point, it might make more sense to pay the researchers who're being judges.
Anton Makiievskyi
over 1 year ago
@KabirKumar kudos for the initiative! I hope you'll get a lot of high-quality feedback and engagement from the AI Safety community. I think it should be an easy "approve" for the regrantors.
Kabir Kumar
over 1 year ago
Thank you!!
It helps that one of the consultants on our team is a highly experienced cybersecurity professional and professor.
Also, I kinda love breaking things and alignment plans are sooo vulnerable!
Kabir Kumar
over 1 year ago
Excited to say that within hours of announcement, we already have 10 people who've joined the critique-a-thon!