Project summary
At AI-Plans.com, we are building a compendium of alignment plans and the criticisms against them, which anyone can contribute to. Currently, newcomers to the field of AI alignment often struggle to understand what work is being done, who is doing it, and what the assumptions, strengths, and weaknesses of each plan are. We believe AI-Plans.com will be an easy, centralized way to discover and learn more about the most promising alignment plans.
Project goals
The site is currently in Stage 1, where it is purely a compendium. We are in the process of adding up to 1000 plans and the criticisms made against them so far. Further plans and criticisms can be added by users.
Projected benefits of Stage 1:
This makes it easy to see what plans exist and what their most common problems are.
Stage 2:
A scoring system for criticisms and a ranking system for plans will be added. Plans will be ranked by the total scores of their criticisms, and users who submit higher-scoring criticisms will receive weighted votes. Alignment researchers can link their AI-Plans account to research-relevant platforms for a slightly weighted vote. New plans start with 0 bounty, and lower-bounty plans give the most points, which incentivizes users to write good criticisms.
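As a rough illustration of how such a scoring system could fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration, not the site's actual design: the class and function names, the weight formula (a base weight of 1.0, up to +1.0 for points earned from past criticisms, +0.25 for a linked researcher account), the choice to ignore criticisms with a negative score, and the choice to list plans with the lowest total criticism score first. The bounty mechanism is left out because the description above does not say how bounty converts into points.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vote:
    value: int                       # +1 (upvote) or -1 (downvote)
    voter_points: float              # criticism points the voter has earned so far
    linked_researcher: bool = False  # voter linked a research-relevant account


@dataclass
class Criticism:
    votes: List[Vote] = field(default_factory=list)


@dataclass
class Plan:
    name: str
    criticisms: List[Criticism] = field(default_factory=list)


def vote_weight(vote: Vote) -> float:
    """Hypothetical weighting: more weight for users with higher-scoring criticisms."""
    weight = 1.0 + min(vote.voter_points / 1000.0, 1.0)  # bonus capped at +1.0
    if vote.linked_researcher:
        weight += 0.25  # the "slightly weighted vote" for linked researchers
    return weight


def criticism_score(criticism: Criticism) -> float:
    """Sum of weighted votes on one criticism."""
    return sum(v.value * vote_weight(v) for v in criticism.votes)


def plan_total(plan: Plan) -> float:
    """Total score of a plan's criticisms; negatively scored criticisms count as 0."""
    return sum(max(criticism_score(c), 0.0) for c in plan.criticisms)


def rank_plans(plans: List[Plan]) -> List[Plan]:
    """Assumed ordering: plans with the lowest total criticism score come first."""
    return sorted(plans, key=plan_total)
```

For example, a plan whose only criticism was voted down by an experienced user would total 0 and be listed ahead of a plan whose criticism was upvoted by a linked researcher.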
The scoring system makes it easier to identify the plans with the biggest holes and to gather evidence against problematic plans. It also lets scientists and engineers see which companies have the least problematic plans, making it easier to decide where to work. The creator of aisafety.careers intends to integrate with the site.
Stage 3:
In addition to everything from Stages 1 and 2, there will be monthly cash prizes for the highest-ranking plan and for the user or users with the most criticism points that month.
Projected benefits of Stage 3:
This supercharges everything from Stage 2 and attracts talented people who require a non-committal monetary incentive to start engaging with alignment research. This also provides a heuristic argument for the difficulty of the problem: "There is money on the table if anyone can come up with a plan with fewer problems, yet no one has done so!"
How will this funding be used?
20% - to the dev
10% - to the QA
20% - to moderators
20% - to the prize fund
15% - to a lawyer to set up the non-profit, create terms and conditions for the prize section at Stage 3, and help us comply with GDPR
15% - to the Red Team
How could this project be actively harmful?
If a plan at the top of the leaderboard seems promising, someone with more money than caution could see it and spend a lot of money trying to use it to build an ASI.
What other funding is this person or project getting?
None so far. We are applying to Lightspeed Grants.