@Alex-Leader
Principal Investigator on a defensive cybersecurity benchmark for LLMs, funded by Coefficient Giving. Looking to build a similar LLM benchmark for offensive cyber capabilities.
https://www.linkedin.com/in/-alex-leader/
$0 in pending offers
I lead LLM research projects focused on security, safety, and dangerous capabilities evaluation. We're currently executing a $2.1M Coefficient Giving grant to build a standardized defensive cybersecurity benchmark, used by government AI agencies and frontier labs. Our technical partners bring decades of battlefield-tested cyberwarfare expertise to help us design operationally realistic scenarios. I'm also co-founder and COO of Spotlight Security, a separate commercial venture applying this expertise to critical infrastructure protection. On Manifund, I'm exploring funding for a follow-on offensive cyber kill chain benchmark to measure whether AI can autonomously execute multi-stage attacks.
Alex Leader
about 11 hours ago
Please note that the defensive benchmark's website currently has three scenarios published, but there will be a fourth scenario – focused on LLMs' ability to mitigate 'active scanning' attacks – published in January '26.
Alex Leader
about 11 hours ago
Our current defensive-focused benchmark can be viewed here: http://www.benchmark-spotlightsecurity.com/
If you are asked to submit log-in credentials, they are:
Username: admin-spot
Password: spotlight4lyf