Use of funding
We are extremely grateful for the support we have received through Manifund. Since creating this Manifund project, our funding and organizational situation has been in quite a lot of flux and is still not fully settled! In about 4-5 months, we should finally have some clarity about our funding situation, which will tell us whether we will need lots of additional support or none, so fingers crossed!
Securing the Manifund funding helped us budget, make the right trade-offs between money and time, and pursue moving the entire team to the Bay Area, which we otherwise might have felt more hesitant to do. For logistical reasons, we haven't actually accessed the money yet to pay Emery's salary, but we still intend to use it for this purpose.
Object-level progress
On the object level, here is what we have been up to since creating this project on Manifund:
LLM conceptual reasoning abilities
We've homed in on a particular project: Measuring and improving the conceptual reasoning abilities of LLMs through elicitation. Conceptual reasoning is, roughly, reasoning in domains and about questions where we don't have (access to) ground truth. The prototypical example of such a domain is philosophy.
Many of our most promising theories of change involve doing acausally relevant research with the help of AI assistants, and much of that research is primarily conceptual. We want to be able to make use of genuinely useful AI assistants as early as possible, which is why we've started this project. We also believe conceptual research is differentially useful for many other AI safety applications relative to capabilities research, and we expect the project to have many positive externalities outside of the acausal agenda.
We have since built what is, to our awareness, the best dataset for measuring conceptual reasoning ability, and it has already garnered much interest from Anthropic. You can read more about the dataset and how LLMs perform on it here. Our immediate goal is to develop an LLM critic, based on the dataset, that can write high-quality conceptual critiques when given an argument. We think we are pretty close to something that can write high-quality LessWrong comments.
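To make the critic idea a little more concrete, here is a minimal sketch of what such a setup could look like: prompt a model to critique an argument it is given. This is purely illustrative and not our actual pipeline; the choice of the Anthropic SDK, the model alias, the prompt, and the `critique_argument` helper are all assumptions made for the sake of the example.

```python
# Hypothetical sketch of an LLM critic: given an argument, ask a model
# for a conceptual critique. Illustrative only, not our actual pipeline.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

CRITIC_PROMPT = (
    "You are a careful conceptual critic. Identify hidden assumptions, "
    "ambiguous terms, and weak inferential steps in the argument below, "
    "then write a concise critique.\n\nArgument:\n{argument}"
)

def critique_argument(argument: str) -> str:
    """Return a conceptual critique of the given argument."""
    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # placeholder model alias
        max_tokens=1024,
        messages=[
            {"role": "user", "content": CRITIC_PROMPT.format(argument=argument)}
        ],
    )
    return response.content[0].text
```

In practice, the interesting work is in what replaces that placeholder prompt: using the dataset to figure out which elicitation strategies actually produce high-quality conceptual critiques.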
Making acausal research more accessible
We've been working on making acausal research more accessible. We've drafted 10-15 blog posts explaining the field, which you can look forward to reading as a series in the hopefully not-too-distant future.
Bringing the team together
Until now, our team has worked remotely, with Caspar and Emery at CMU in Pittsburgh and Chi at Polaris Research Institute in Berkeley. While we can't spill the details right now, we will hopefully be permanently in the same place going forward, and we are very excited about this.