Project summary. We propose a six-month research project to build Φ-Arena, a public benchmark for adversarial multi-agent vision-language-action (VLA) evaluation, and to write two associated research papers targeting ICLR 2027. Existing VLA benchmarks (LIBERO, RoboCasa, VLA-Risk) evaluate single agents in cooperative or static environments. Existing multi-agent benchmarks (Melting Pot, PettingZoo, MARLlib) operate in low-dimensional symbolic domains. The intersection of the two, physical-world VLA-vs-VLA under contact-rich dynamics, has no public substrate. We will build it and use it to study two research questions: (1) what circuit-level features distinguish exploit-prone subnetworks in self-play policies, and (2) how energy-bounded physical constraints (torque budget, battery, episode length) reshape emergent strategy distributions under adversarial training. In six months, we expect to have shipped an open benchmark (code, datasets, leaderboard) and two ICLR 2027 submissions on the above questions. If successful, we expect adversarial multi-agent embodied control to become an active subfield of physical-AI safety research over the next two to three years, with Φ-Arena as its substrate. Manifesto and research directions: https://xn--7xa.monster
Goals & how. We will ship three deliverables for ICLR 2027 (September 2026 submission deadline): (1) Φ-Arena (co-led). Open benchmark for VLA-vs-VLA evaluation across MuJoCo, Isaac Sim, and Genesis. Standardized opponent-conditioned protocols. Energy-bounded constraint regimes. Matchup table of OpenVLA, OpenVLA-OFT, π₀, and SmolVLA. Released open-source under a permissive license alongside a HuggingFace dataset of rollout traces. (2) Mechanistic interpretability of self-play policies (Liu Yuchen lead). Circuit-level analysis of exploit-prone subnetworks emerging during adversarial self-play training. Extends activation-intervention methods (matched-random ablation, bypass testing, attention-partition diagnostics) and sparse-autoencoder approaches (per SAEBench, ICML 2025) from static language models to dynamic embodied policies. (3) Energy-bounded adversarial games (Han Muchen lead). Empirical characterization of how hard physical resource constraints reshape emergent strategy distributions in self-play, compared to unbounded regimes. The unit of work is papers, submitted to ICLR 2027 with all code, models, and datasets released open-source at submission time. Falsifiable success metric: at least three external research groups extend or build on Φ-Arena within six months of public release.
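As a rough illustration of what an energy-bounded constraint regime could look like in practice, here is a minimal sketch of a per-episode budget tracker. The class name `EnergyBudget`, its fields, the units, and the choice of summed |torque| × dt as the cost are all assumptions made for this example; they are not the Φ-Arena protocol, which is not yet specified.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class EnergyBudget:
    """Hypothetical per-episode actuation budget (illustrative only)."""
    torque_budget: float   # total allowed sum of |torque| * dt (assumed unit: N·m·s)
    max_steps: int         # episode length cap, in control steps
    dt: float = 0.02       # control timestep in seconds (assumed value)
    spent: float = 0.0     # budget consumed so far
    steps: int = 0         # control steps taken so far

    def step(self, torques: Iterable[float]) -> bool:
        """Charge one control step; return True while the agent may keep acting."""
        self.spent += sum(abs(t) for t in torques) * self.dt
        self.steps += 1
        return self.spent <= self.torque_budget and self.steps < self.max_steps


# Usage: an episode loop would stop (or zero the agent's actions) once step() returns False.
budget = EnergyBudget(torque_budget=1.0, max_steps=100, dt=0.5)
still_active = budget.step([1.0, -1.0])   # spends exactly the full budget
exhausted = not budget.step([0.1])        # any further torque exceeds it
```

An unbounded baseline regime would correspond to setting `torque_budget` to infinity, which makes the comparison between bounded and unbounded self-play a one-parameter change in this sketch.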
How funding will be used.
$7,000: compute beyond Tier 0 grants.
$3,500: ICLR 2027 conference attendance.
$2,500: open-source infrastructure (HuggingFace Pro, leaderboard backend, storage).
$2,000: operational (Wyoming LLC annual report, Form 1065 first-year CPA, domain renewal).
$15,000: total.
Funding tiers: the $5K minimum funds the Φ-Arena population phase (the substrate the other two papers depend on). $10K adds the mechinterp paper. The full $15K funds all three papers plus ICLR travel.
Team / track record. Liu Yuchen (founder; HKUST EE + AI dual major, sophomore). Four papers under double-blind review at NeurIPS 2026 (three sole-author, one co-first), all on mechanistic interpretability and methodology for VLAs: Spectral diagnostics bias in transformer hidden states (sole-author; anonymous repo released); Activation-ablation methodology for VLA models, with closed-loop LIBERO validation (sole-author; anonymous repo released); Cross-view attention partition framework for VLAs (sole-author; anonymous repo released); Deployment-failure audit suite for VLAs, released as a CLI tool (co-first author; anonymous repo released). Titles and anonymous repository URLs withheld until December 2026 per NeurIPS double-blind policy. Review-safe paper descriptions and CV at https://lyrica.xn--7xa.monster. The methodological toolkit Φ depends on — forward-hook activation interventions, closed-loop VLA rollouts in LIBERO/MuJoCo, deployment-quality failure analysis with Wilson confidence intervals and cluster bootstrap — is built and battle-tested in the prior papers above. Prior independent projects: Squirrel (AI memory layer for coding agents, Rust + MCP, archived at 1,000+ GitHub stars in Feb 2026), Xiaoniao (open-source cross-platform translation tool, Go), OfferI (AI study-abroad agent startup, Feb – Dec 2025). Han Muchen (founding researcher; HKUST sophomore). Prior research with Prof. Janet Hui-wen Hsiao at HKUST Division of Social Science on cognitive-science approaches to AI / explainable AI (EMHMM with deep learning). Φ public infrastructure (live as of May 2026): https://xn--7xa.monster · https://github.com/phi-monster · https://huggingface.co/phi-monster
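For readers unfamiliar with the Wilson confidence interval mentioned above, the following is a minimal sketch of the standard Wilson score interval applied to a task success rate. The function name `wilson_interval` and the example counts are assumptions for illustration; this is not code from the prior papers.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical example: 45 successful rollouts out of 50.
low, high = wilson_interval(45, 50)
```

Unlike the normal-approximation interval, the Wilson interval stays inside [0, 1] and behaves sensibly when the observed success rate is near 0 or 1, which is common with small rollout counts.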
Failure modes. Most likely failure: schedule. Two part-time student founders shipping three papers in four months is tight; HKUST coursework will eat calendar time. If we ship two of three: we drop the energy-bounded paper (its experimental scope is most flexible) and ship Φ-Arena and the mechinterp paper. Φ-Arena is the non-negotiable deliverable — it's the open public artifact this funding catalyzes. If we ship zero of three: we publish a postmortem on LessWrong about what we learned, and release the partial benchmark infrastructure (simulator adapters, opponent-conditioned eval harness, leaderboard backend) as open-source. The funding is not wasted in that scenario. Other failure modes: Compute overrun. Mitigation: scope to a two-VLA matchup; release smaller benchmark; plan v2 expansion. Self-play training instability (well-known multi-agent RL pathology). Mitigation: fall back to fixed opponent pool sampling instead of full population-based training. ICLR rejection (~50% baseline per paper at top venues). Outcome: resubmit to NeurIPS 2027 or ICML 2027. Φ-Arena substrate releases publicly at submission regardless of acceptance.
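The fixed-opponent-pool fallback named above can be sketched as follows: instead of co-evolving a population, each training episode draws its opponent from a frozen list of checkpoints. The function name `sample_opponents`, the checkpoint labels, and the uniform sampling rule are assumptions for illustration only; a real harness might weight opponents by strength or recency.

```python
import random
from typing import Sequence


def sample_opponents(pool: Sequence[str], n_episodes: int, seed: int = 0) -> list[str]:
    """Draw one frozen opponent per episode, uniformly and with replacement."""
    if not pool:
        raise ValueError("opponent pool must not be empty")
    rng = random.Random(seed)  # local RNG so the schedule is reproducible
    return [rng.choice(pool) for _ in range(n_episodes)]


# Hypothetical pool of frozen opponent checkpoints.
pool = ["ckpt_a", "ckpt_b", "ckpt_c"]
schedule = sample_opponents(pool, n_episodes=5, seed=42)
```

Because the pool never changes during training, the learner faces a stationary opponent distribution, which avoids the moving-target dynamics that make full population-based self-play unstable.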
Money raised in last 12 months. $0. Φ(fight) Research was founded in May 2026. The Wyoming LLC (Phi Fight Research LLC) was filed on 2026-05-16 and is pending Wyoming Secretary of State approval (expected active by 2026-05-27). In flight (Tier 0 compute grants, both submitted 2026-05-16): TPU Research Cloud (free TPU credit; not cash); Lambda Research Grant (up to $5K compute credit; not cash). Founders are self-funded as HKUST students. The Manifund regrant is the first cash funding pursued for the LLC. Disbursement: recipient is Phi Fight Research LLC. We will apply for an EIN by faxing Form SS-4 once the LLC is active; the Mercury bank account is expected to be active by mid-June 2026; wire-transfer instructions will be provided to Manifund once Mercury is active. Tax classification: partnership (multi-member LLC default). A FinCEN BOI report is not required for U.S.-formed LLCs under the 2025-03-26 Interim Final Rule.