@NeelNanda Great point, agreed. We may eventually need to narrow down our missions if we feel overextended. However, for now it seems worthwhile to explore multiple approaches. This will give us a better sense of which initiatives are most impactful and which can be dropped if necessary.
We decided to engage in these activities because 1) they all seem pretty important if we want to nudge France towards more safety, 2) they are complementary (more below), and 3) we have the support of 15+ dedicated volunteers, which gives us more bandwidth to pursue several missions. To give more detail on each activity:
Field-building: we are confident we can continue doing a good job here because we have been doing this for two years at EffiSciences. The pipeline is now working well and we know how much time it requires from our core team. These field-building activities help us recruit volunteers and seem very much worth keeping.
R&D: we have a few strong technical people who are focused on a specific project, the Benchmarks for the Evaluation of LLM Safeguards (BELLS) mentioned on this page, which doesn't require many people to produce useful outputs. This project can be scaled by onboarding new people to work on additional benchmarks for various threat models, but if we are too constrained on funding or staff, we will just keep the focus on a few benchmarks. This R&D work is instrumental to policy outreach because it helps us make connections with public institutions that are interested in our tools and expertise.
Advocacy:
Policy-wise, we have already started discussions with a few public institutions about collaborating on our technical work, but we have yet to engage in advocacy per se. This will be a mission of the policy role we are recruiting for.
Regarding public awareness, we have a small team consisting of one staff member at approximately 0.5 FTE and 3-4 volunteers. They have listed various strategies to implement, some of which we have already begun, such as collaborating with journalists, publishing our newsletter, and posting on social media. In the coming weeks, we should gain a clearer understanding of the effectiveness of each approach. Here again, if we feel spread too thin, we could scale back some activities.