Modeling Automated AI R&D

Technical AI safety

Paul Salmon

Not funded · Grant
$0 raised

Project summary

At some point, developers will be able to spin up fleets of thousands of world-class automated researchers, and the progress such a cohort could make might be immense. Poorly controlled future self-improvement is arguably the biggest issue in AI development, and it has received the least work on how it might be controlled. This research project will use agent-based models to model the process of automated AI R&D, or uncontrolled self-improvement (USI). The goal is to better understand potential runaway AI R&D processes and how to make these processes more controllable. Paul Salmon is a highly cited expert on complex systems, human factors, and sociotechnical systems, and has previously published work on how human factors can be used to manage the risks associated with AGI. This problem does not demand deep ML expertise so much as expertise in modeling systems, their dynamics, and the ways they can go awry.

What are this project's goals and how will they be achieved?

The goal is to better understand potential runaway AI R&D processes and how to make them more controllable. The project will build basic models of the dynamics of these processes and of the interactions between interventions intended to make them more controllable.
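To give a flavor of the approach, here is a minimal, hypothetical sketch of an agent-based model of a self-improvement feedback loop. Everything here is an assumption for illustration: the agent count, the productivity gain, and the single `oversight` parameter standing in for a control intervention are toy choices, not the project's actual model.

```python
import random

def simulate(steps=50, n_agents=100, gain=0.01, oversight=0.0, seed=0):
    """Toy agent-based model of automated AI R&D.

    Each automated-researcher agent adds a small, noisy increment to
    overall capability at every step; increments scale with current
    capability, creating a self-improvement feedback loop.  `oversight`
    in [0, 1] damps that feedback, standing in for a hypothetical
    control intervention (0 = uncontrolled, 1 = fully throttled).
    """
    rng = random.Random(seed)
    capability = 1.0
    for _ in range(steps):
        for _ in range(n_agents):
            # Each agent's contribution compounds on current capability,
            # damped by the oversight intervention.
            capability += gain * capability * (1 - oversight) * rng.uniform(0.5, 1.5)
    return capability

# Compare an uncontrolled run against one with a partial intervention.
uncontrolled = simulate(oversight=0.0)
controlled = simulate(oversight=0.5)
```

Even this crude sketch exhibits the qualitative behavior of interest: the uncontrolled trajectory grows explosively relative to the damped one, which is the kind of dynamic the project's richer models would probe.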

How will this funding be used?

This funding will be used to pay graduate students who will help with the modeling.

Who is on the team and what's their track record on similar projects?

Paul Salmon is a professor at the University of the Sunshine Coast. He is a highly cited expert on complex systems, human factors, and sociotechnical systems (see his Google Scholar profile). He has become interested in AGI risk and has previously published work on how human factors can be used to manage the risks associated with AGI.

What are the most likely causes and outcomes if this project fails? (premortem)

There is a chance that modeling runaway AI R&D processes requires a different skill set, or proves more complicated or intractable than initially expected. In that case, the models may produce less interesting results or explain less than one might hope.

What other funding is this person or project getting?

None that I am aware of.
