ICML Workshop Travel: Cross-Constitution Drift in LLM Character Training

Project summary

I am requesting travel support to present my accepted paper, "Side Effects of Character Training: Quantifying Cross-Constitution Drift in LLMs," at the ICML 2026 Pluralistic Alignment Workshop in Seoul. Paper link: https://www.sauravpanigrahi.com/artifact/side-effects-character-training.pdf

The project studies a concrete alignment risk: optimizing a model for one target persona can unintentionally shift its behavior across unrelated value dimensions, which can produce false confidence when evaluations measure only a single dimension.

Workshop registration is time-sensitive and may close before the event date, so early funding is critical to secure my slot.

What are this project's goals? How will you achieve them?
This project has three goals:

1. Present and disseminate our accepted results to the pluralistic alignment community at ICML.

2. Use direct feedback from researchers to improve the next phase of experiments and evaluation design.

3. Launch follow-up work on deeper ablations of cross-constitution drift under different training and deployment pressures.

Because this is a workshop paper in a fast-moving and still-nascent area, community critique is especially valuable for deciding which hypotheses and interventions are most promising next. I will achieve these goals by attending in person, presenting the paper, collecting targeted technical feedback, and converting that feedback into a prioritized follow-up experiment plan.

How will this funding be used?

The funding covers lean workshop travel costs, totaling $2,030:

- $550: workshop-only conference registration

- $700: round-trip flights (India to Seoul)

- $400: accommodation near venue (3 nights)

- $280: meals and local transport (4 days)

- $100: South Korea visa fees

Who is on your team? What's your track record on similar projects?

This research was conducted as part of the SPAR fellowship. Our paper has already been accepted to the ICML 2026 Pluralistic Alignment Workshop, which provides external validation.

Technically, we trained 11 character variants using the Open Character Training pipeline on Qwen-2.5-7B-Instruct, then evaluated those models across 11 constitutions using EigenBench on 200 AIRiskDilemmas scenarios. This produced a full character-train matrix that surfaced both target effects and side effects.
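As a rough illustration of how such a matrix can be read, here is a minimal Python sketch. It assumes rows index the trained character and columns index the evaluated constitution, with row i's target constitution in column i; that layout, the function names, and the random placeholder values are all assumptions for illustration, not the paper's data or the EigenBench API.

```python
import random

N = 11  # 11 trained character variants x 11 evaluation constitutions


def synthetic_drift_matrix(n, seed=0):
    """Placeholder data only (NOT results from the paper).

    Entry [i][j] stands in for a hypothetical score shift on constitution j
    after training on character i, relative to the base model.
    """
    rng = random.Random(seed)
    return [
        [(1.0 if i == j else 0.0) + rng.uniform(-0.3, 0.3) for j in range(n)]
        for i in range(n)
    ]


def split_effects(matrix):
    """Split each row into its target effect (diagonal entry) and its
    side effects (all off-diagonal entries)."""
    n = len(matrix)
    target = [matrix[i][i] for i in range(n)]
    side = [[matrix[i][j] for j in range(n) if j != i] for i in range(n)]
    return target, side


def largest_side_effect(matrix, i):
    """Return (constitution index, shift) for the off-diagonal entry of
    row i with the largest absolute value."""
    candidates = [(j, v) for j, v in enumerate(matrix[i]) if j != i]
    return max(candidates, key=lambda jv: abs(jv[1]))


matrix = synthetic_drift_matrix(N)
target, side = split_effects(matrix)

# Mean absolute off-diagonal shift per trained character: one simple,
# assumed summary of how much a character drifts on non-target constitutions.
mean_abs_side = [sum(abs(v) for v in row) / len(row) for row in side]

for i in range(N):
    j, shift = largest_side_effect(matrix, i)
    print(f"character {i}: target={target[i]:+.2f}, "
          f"mean |side|={mean_abs_side[i]:.2f}, "
          f"largest side effect on constitution {j} ({shift:+.2f})")
```

Under this reading, an evaluation that inspects only the diagonal would miss any large off-diagonal entry, which is the false-confidence risk described in the project summary.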

What are the most likely causes and outcomes if this project fails?

Most likely cause of failure:

- Travel/logistics constraints prevent in-person attendance.

Most likely outcomes under failure:

- Best case: remote presentation, if available. The organizers have not mentioned this option on the workshop page, so it is most likely unavailable.

- Worst case: inability to present, delaying research feedback.

How much money have you raised in the last 12 months, and from where?

The underlying research was completed through a SPAR Fellowship collaboration, which did not include travel funding.