Enabling a Student to Present his Mechanistic Interpretability Work at NeurIPS

Project Summary
We are requesting support to enable Viacheslav, the first author of our recently accepted NeurIPS 2025 mechanistic interpretability paper “One-step is enough: Sparse autoencoders for text-to-image diffusion models”, to present this work at the conference. This research pioneers the use of sparse autoencoders to uncover interpretable features in text-to-image diffusion models, enabling new ways to understand and control these systems. This contribution represents one of the first systematic mechanistic interpretability studies on SDXL and FLUX.

Context
The project began as an internship and went through multiple rounds of review and improvement. Our first submissions to ICLR, CVPR, and ICML were rejected, but Viacheslav continued to develop the work far beyond his internship: refining the methods, strengthening the results, mentoring students to continue his work, and helping bring it to publication. After a challenging rebuttal phase, the paper was accepted at NeurIPS. This is a significant milestone for both our team and the broader vision interpretability community.

As a postdoctoral researcher, I do not have independent travel funding, and the affiliated labs have no remaining budget to cover Viacheslav’s conference trip in full. I have secured partial support of $800. Manifund support would close the remaining $1,100 gap and ensure that Viacheslav, an early-career mechanistic interpretability researcher, can present our work in person.

Project Goals
All research goals have been achieved; the remaining goal is for the lead student to present the work in person. Attending NeurIPS will allow him to share results with the community, build collaborations, and represent our lab’s contribution to interpretability research in generative models. It will also be an important opportunity for him to learn about the academic landscape, especially around mechanistic interpretability, and to form connections that will support his future research.

Budget

Flight: $900
Hotel: $600
Registration: $200
Food & local transport: $200
Total: $1,900

Already secured partial funding: $800

Requested from Manifund

$1,100

Why This Matters
This small travel grant will have an outsized impact: it will enable an early-career mechanistic interpretability researcher to present at the world’s leading ML conference, raise the visibility of interpretability research, and help build the community around this emerging area.