Description of subprojects and results, including major changes from the original proposal
Our main project was a paper on improving unsupervised probing methods via clustering. The paper was accepted to the MechInterp workshop at ICML. We have also submitted it to a top conference, where it is currently under review.
SPAR mentorship in the spring 2024 iteration. In this project, we used probing methods to elicit the value of the state inside policy and value networks in reinforcement learning. In addition, three SPAR students contributed to the clustering paper described above.
In addition, our lead researcher has been involved in multiple projects that were also accepted to the MechInterp workshop (here and here).
Spending breakdown
Since we received only a relatively small part of the requested funding, we were only able to cover the costs of a stay at FAR Labs in Berkeley for two people (flights, accommodation, food, etc.). FAR Labs invited us to their team-in-residency program, where we worked mostly on the projects described above. Some of the funds were also spent on compute and on attending the ICML conference.