Thane Ruthenis

@Thane-Ruthenis

https://www.lesswrong.com/users/thane-ruthenis

Projects

Synthesizing Standalone World-Models

pending admin approval

Comments

Synthesizing Standalone World-Models

Thane Ruthenis

15 days ago

@RyanKidd:

What are the largest sources of uncertainty for this project, or reasons it might fail to have impact?

The main failure modes are:

  • Failure of the core idea: that we could recover an interpretable model of the world by some easy-to-specify training/search process.

    • The underlying assumptions seem fairly solid to me:

      • We have many sources of evidence, empirical and theoretical, that the universe has an "observer-independent" well-abstracting structure.

      • The property of being "easy-to-interpret" is not inherently "human-laden". The process doesn't have to locate the human ontology and translate the world-model into it; it just has to structure the world-model in a fairly natural way that'd make it easy for us to untangle/study.

      • Approaching the problem via compression likewise seems well-founded, with empirical work pointing in that direction (see e. g. this and that). A minimal toy sketch of the compression framing appears after this list.

    • That said, the project tackles some nontrivial theoretical problems, and it's possible that none of this works quite the right way.

  • Excessively long R&D time. Even if the core idea is correct, the time to flesh out the theory and cross the theory-practice gap may exceed the time it'd take others to achieve AGI by other means.

    • I aim to mitigate this by aggressively prioritizing R&D avenues that promise to connect to reality quickly, and by using existing well-developed methods to implement whichever parts of the project can be implemented that way, instead of reinventing the wheel in the name of theoretical elegance. (My current expectation is that all the tools are already there; theory is only needed to figure out how to arrange them the right way.)

    • In addition, as I'd mentioned, worlds in which AI progress is rapid are worlds in which we can probably use that progress' byproducts (e. g., superhuman-mathematician LLMs) to dramatically speed up research on this project as well. On the other hand, worlds in which AI progress isn't as rapid as the industry hopes are worlds in which we can afford the longer development time.

    • It's still possible it'd all take longer than expected.
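
To gesture at the compression framing for readers who haven't seen it: the standard two-part MDL objective is to prefer the world-model M that minimizes L(M) + L(D|M), i.e. the cost of describing the model plus the cost of describing the data under it. Below is a deliberately toy sketch of that objective; the names and numbers are made up for illustration and have nothing to do with the project's actual machinery.

```python
import math

def data_bits(probs, data):
    """Codelength of the data under a model: -sum over observations of log2 p(x)."""
    return -sum(math.log2(probs[x]) for x in data)

# Observations from a biased coin: mostly heads (1).
data = [1] * 90 + [0] * 10

# Candidate "world-models": (description cost L(M) in bits, predictive distribution).
candidates = {
    "uniform":  (1.0,   {0: 0.5,  1: 0.5}),   # cheap to state, fits the data poorly
    "biased":   (8.0,   {0: 0.1,  1: 0.9}),   # costlier to state, fits the data well
    "memorize": (100.0, {0: 0.01, 1: 0.99}),  # near-lookup-table: pays heavily in L(M)
}

for name, (model_bits, probs) in candidates.items():
    fit = data_bits(probs, data)
    print(f"{name:9s} L(M)={model_bits:6.1f}  L(D|M)={fit:5.1f}  total={model_bits + fit:6.1f}")

# "biased" minimizes the total: compression rewards models that capture the
# data's actual structure rather than memorizing it or ignoring it. That's the
# intuition behind using compression to locate natural, interpretable structure.
```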

What other sources of funding have you sought?

I've also applied to the LTFF. (Though I expect it'll take a couple of months to hear back, and my application there prioritized smaller-but-more-certain funding, so I don't expect to be fully funded by it.)