Manifund
Garrett Baker

@GarretteBaker

I'm an independent alignment researcher, self-taught in machine learning, convex optimization, and probability theory.

https://github.com/GarretteBaker/
$0 total balance
$0 charity balance
$0 cash balance

$0 in pending offers

About Me

For approximately the past year, I’ve been doing alignment research full-time, working on a variety of approaches and trying to understand the problem in enough depth to invent new ones. If funded, I plan to continue roughly the same work as before, which has historically spanned scalable mechanistic interpretability, formal and prosaic corrigibility, reflective stability, and a range of value theory, along with continued upskilling in convex optimization, machine learning, neuroscience, and economics.

My current project is an attempt to connect the tools & theory of singular learning theory with our knowledge of the inductive biases and loss landscapes of large language models.

Projects

Garrett Baker salary to study the development of values of RL agents over time

Outgoing donations

AI Safety Reading Group at metauni [Retrospective]
$10
8 months ago
Act I: Exploring emergent behavior from multi-AI, multi-human interaction
$96
8 months ago
Act I: Exploring emergent behavior from multi-AI, multi-human interaction
$50
8 months ago
Lightcone Infrastructure
$95
8 months ago
Next Steps in Developmental Interpretability
$200
9 months ago
Lightcone Infrastructure
$50
9 months ago

Comments

Act I: Exploring emergent behavior from multi-AI, multi-human interaction
Garrett Baker

9 months ago

I have seen some of amp's work; it is pretty interesting, and novel in the grand scheme of things.

Lightcone Infrastructure
Garrett Baker

9 months ago

Lightcone consistently does quality things.

Garrett Baker salary to study the development of values of RL agents over time
Garrett Baker

10 months ago

@Austin Here is the LW post: https://www.lesswrong.com/posts/Bczmi8vjiugDRec7C/what-and-why-developmental-interpretability-of-reinforcement

Transactions

For | Date | Type | Amount
AI Safety Reading Group at metauni [Retrospective] | 8 months ago | project donation | $10
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | 8 months ago | project donation | $96
Act I: Exploring emergent behavior from multi-AI, multi-human interaction | 8 months ago | project donation | $50
Lightcone Infrastructure | 8 months ago | project donation | $95
<176bd26d-9db4-4c7a-98c0-ba65570fb44c> | 9 months ago | tip | +$1
Next Steps in Developmental Interpretability | 9 months ago | project donation | $200
Lightcone Infrastructure | 9 months ago | project donation | $50
Manifund Bank | 9 months ago | deposit | +$500