BioAgent: Verified Agentic AI for Computational Biology

Project summary

Many biomedical researchers have questions that could be answered with existing public data, but cannot easily turn those questions into reliable computational analyses. The bottleneck is not always the absence of data. Often, it is the gap between biological intuition and the technical work required to find datasets, write code, run pipelines, debug errors, interpret outputs, and know whether the result is trustworthy.

BioAgent is an attempt to reduce that bottleneck.

I am building an open-source prototype of an agentic AI assistant for computational biology workflows. The first version will focus on one narrow, testable use case: helping researchers work with public biomedical datasets by retrieving relevant data, generating reproducible analysis code, running that code in a controlled environment, checking outputs for obvious failures, and producing a transparent report with sources, assumptions, limitations, and next steps.

The project is deliberately scoped as a pilot, not as a claim that AI can replace computational biologists. In scientific work, a plausible answer is not enough. The system needs to show where its data came from, what code it ran, what assumptions it made, what failed, and what should still be checked by a human. The goal is to build a small but serious proof-of-concept for agentic AI as scientific infrastructure: not a chatbot that sounds confident, but a workflow system that can be inspected, reproduced, and criticized.

If successful, BioAgent could help students, wet-lab researchers, independent researchers, and small biomedical teams access computational biology workflows that would otherwise require specialized support. If unsuccessful, the project should still produce a useful public record of where current agentic AI systems break down when applied to real scientific analysis.

What are this project's goals? How will you achieve them?

The goal of BioAgent is to test whether an AI agent can reliably assist with a narrow computational biology workflow while preserving reproducibility, transparency, and human oversight.

The project has four main goals.

1. Build a working end-to-end prototype

The prototype should be able to take a biological question or dataset reference, identify relevant public data sources, generate analysis code, run the code in a sandboxed environment, and produce a structured output. The first workflow will likely focus on public transcriptomics or another similarly well-scoped biomedical data task, because these workflows are common enough to matter and structured enough to evaluate.
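As a rough illustration of the "run the code in a sandboxed environment" step, the sketch below executes a generated script in a throwaway directory with a time limit and captures its output. The names `RunResult` and `run_generated_code` are hypothetical, and a subprocess with a timeout is only a stand-in: it is not real isolation, and the actual prototype would need a container or comparable sandbox around it.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class RunResult:
    """Outcome of one execution attempt (hypothetical structure)."""
    returncode: Optional[int]  # None when the run was stopped by the timeout
    stdout: str
    stderr: str
    timed_out: bool


def run_generated_code(code: str, timeout_s: float = 30.0) -> RunResult:
    """Run a generated Python script in a temporary directory with a time limit.

    Assumption: this is NOT a security boundary. It only keeps files out of
    the project tree and stops runaway scripts.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "analysis.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return RunResult(None, "", "", True)
        return RunResult(proc.returncode, proc.stdout, proc.stderr, False)


if __name__ == "__main__":
    result = run_generated_code("print('hello from generated code')")
    print(result.returncode, result.stdout.strip())
```

Capturing the exit code and stderr here matters more than the execution itself, because later verification steps can only flag failures that were recorded.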

2. Make the output reproducible

The system should not only return a written answer. It should produce artifacts that a researcher can inspect: code, notebooks or scripts, data source links, intermediate outputs, plots, and a final report. A user should be able to see what the agent did rather than only trusting what it says.
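One possible way to make these artifacts inspectable is a small manifest written next to each run, listing the question, the data source links, and a checksum for every file produced. The sketch below is an assumption about how that could look, not a committed design; `build_manifest`, `write_manifest`, and the `manifest.json` layout are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(run_dir: Path, question: str, data_sources: List[str]) -> Dict:
    """Describe every file in run_dir so a reviewer can confirm nothing changed."""
    artifacts = [
        {"path": p.relative_to(run_dir).as_posix(), "sha256": sha256_of(p)}
        for p in sorted(run_dir.rglob("*"))
        if p.is_file() and p.name != "manifest.json"  # skip the manifest itself
    ]
    return {
        "question": question,
        "data_sources": data_sources,
        "artifacts": artifacts,
    }


def write_manifest(run_dir: Path, question: str, data_sources: List[str]) -> Path:
    """Write manifest.json into run_dir and return its path."""
    out = run_dir / "manifest.json"
    manifest = build_manifest(run_dir, question, data_sources)
    out.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    return out
```

A reviewer who reruns the code can recompute the checksums and compare them with the manifest, which is a cheap first test of whether a result is reproducible.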

3. Add a verification layer

The most important part of this project is not the language model. It is the verification structure around it. BioAgent will include checks for execution errors, missing data, inconsistent outputs, unsupported claims, and places where the model should defer to human review. The system should be able to say “I cannot verify this” rather than forcing a confident answer.
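To make the idea concrete, here is a minimal sketch of what a verification report could look like, assuming three simple checks: whether the script ran, whether the result table has missing values, and whether each written claim points to a supporting artifact. All names (`Status`, `Check`, the check functions, the `evidence` field) are hypothetical, and real checks would need to be much more thorough. The key design choice is that "cannot verify" is its own status rather than being folded into pass or fail.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Status(Enum):
    PASS = "pass"
    FAIL = "fail"
    CANNOT_VERIFY = "cannot verify"  # defer to human review


@dataclass
class Check:
    name: str
    status: Status
    detail: str


def check_execution(returncode: Optional[int]) -> Check:
    """Fail if the script did not exit cleanly (None means it timed out)."""
    if returncode == 0:
        return Check("execution", Status.PASS, "script exited cleanly")
    return Check("execution", Status.FAIL, f"exit code {returncode}")


def check_missing_values(rows: List[Dict[str, Optional[float]]]) -> Check:
    """Fail on an empty result table or on any None/NaN cell."""
    if not rows:
        return Check("missing data", Status.FAIL, "result table is empty")
    missing = sum(
        1
        for row in rows
        for value in row.values()
        if value is None or (isinstance(value, float) and math.isnan(value))
    )
    if missing:
        return Check("missing data", Status.FAIL, f"{missing} missing value(s)")
    return Check("missing data", Status.PASS, "no missing values")


def check_claims(claims: List[Dict[str, Optional[str]]]) -> Check:
    """Flag report claims that do not link to any supporting artifact."""
    unsupported = [c for c in claims if not c.get("evidence")]
    if unsupported:
        detail = f"{len(unsupported)} claim(s) lack a linked artifact"
        return Check("claims", Status.CANNOT_VERIFY, detail)
    return Check("claims", Status.PASS, "every claim links to an artifact")


def overall(checks: List[Check]) -> Status:
    """Any failure wins; otherwise any unverifiable item blocks a pass."""
    statuses = {c.status for c in checks}
    if Status.FAIL in statuses:
        return Status.FAIL
    if Status.CANNOT_VERIFY in statuses:
        return Status.CANNOT_VERIFY
    return Status.PASS


if __name__ == "__main__":
    report = [
        check_execution(0),
        check_missing_values([{"gene_a": 1.2, "gene_b": 0.4}]),
        check_claims([{"text": "gene_a is upregulated", "evidence": None}]),
    ]
    for check in report:
        print(f"{check.name}: {check.status.value} ({check.detail})")
    print("overall:", overall(report).value)
```

Checks like these only catch mechanical problems. Whether a conclusion is biologically sound still has to be judged by a person reading the report.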

4. Test the system with real users

I will test the prototype with a small number of students, biomedical researchers, or technically adjacent users. The evaluation will focus on practical questions: Did the system save time? Did it produce understandable outputs? Where did it fail? Which parts required human correction? Would a researcher use this again?

I will achieve these goals by keeping the first version narrow. Computational biology is too broad for a first grant-funded prototype. The project will therefore prioritize one workflow, a small number of public data sources, and a clear evaluation write-up over a large but shallow demo.

Theory of victory

Computational biology is becoming increasingly important, but the ability to use it remains unevenly distributed. Many researchers can ask good biological questions but cannot easily execute the computational work needed to answer them. At the same time, general AI tools can now write code and explain methods, but they often lack the structure needed for scientific reliability: provenance, execution, verification, and explicit uncertainty.

BioAgent’s theory of victory is that the useful version of agentic AI in science will not be a general “AI scientist” that claims to do everything. It will be a constrained workflow assistant that makes specific scientific tasks easier, while making its reasoning and execution visible enough for humans to audit.

If BioAgent can complete one narrow computational biology workflow with transparent outputs and documented failure modes, it would create evidence for a broader direction: agentic AI systems that expand access to scientific analysis without hiding the uncertainty and fragility of the process.

How will this funding be used?

I am requesting funding to build and evaluate a first public prototype.

The minimum funding would allow me to build the core version of the system: one narrow workflow, basic public-data retrieval, code generation, sandboxed execution, and a simple verification report.

The full funding goal would allow me to make the prototype substantially more useful by improving reliability, adding stronger evaluation, testing with users, and getting feedback from people with computational biology experience.

Planned use of funds:

Compute and model/API usage - $3,000
For agent runs, code generation, workflow testing, and repeated evaluation.
Cloud infrastructure, storage, and sandboxing - $2,000
For safely running generated code, storing outputs, and maintaining a reproducible execution environment.
Founder development time - $6,000
To support concentrated work on the prototype, integrations, interface, documentation, and evaluation.
Computational biology review / expert feedback - $1,500
To get feedback from people who can identify biological or methodological errors that a software-only evaluation would miss.
User testing and documentation - $1,000
For onboarding early users, collecting feedback, writing documentation, and preparing public materials.
Evaluation and safety work - $1,500
For designing tests, documenting failure modes, and adding safeguards around unsupported or potentially risky outputs.

Total funding goal: $15,000

With the minimum funding, I will prioritize a smaller but complete prototype. With the full funding goal, I will aim to produce a stronger public pilot: a working system, documentation, user feedback, and a written evaluation of what the tool can and cannot reliably do.

Who is on your team? What's your track record on similar projects?

I am currently building this as a solo founder.

My background is in software development, AI, and technical product-building. I have experience working with Python, web development, machine learning, and building technical projects from idea to usable product. My role in this project is to build the agent architecture, data/tool integrations, sandboxed execution environment, user interface, and evaluation pipeline.

I am not presenting myself as already having the full judgment of a computational biologist. That is part of why this project is structured as a narrow, reviewable pilot. The technical system should be built in a way that allows domain experts to inspect, criticize, and improve it. A major part of the work will be seeking feedback from computational biology users or reviewers so the project does not become a technically impressive but scientifically unreliable demo.

My comparative advantage is building quickly, understanding AI systems, and turning broad technical ideas into working prototypes. The main risk is domain correctness, and I plan to address that by narrowing the first workflow, using public data, keeping outputs auditable, and involving external review where possible.

What are the most likely causes and outcomes if this project fails?

The most likely cause of failure is that the agent is not reliable enough for scientific use. It may generate code that runs but produces weak conclusions, mishandle data formats, miss important assumptions, or produce outputs that look polished but are not biologically meaningful.

A second failure mode is scope creep. Computational biology is too broad to automate all at once. If I try to support too many workflows, organisms, databases, and analysis types in the first version, the result will likely be a shallow demo rather than a useful tool.

A third failure mode is weak user demand. Researchers may prefer existing tools, human bioinformaticians, or manual workflows if BioAgent does not save enough time or produce outputs they can trust.

A fourth failure mode is verification difficulty. It may be possible to check whether code executed, but much harder to check whether the scientific interpretation is correct. This is why the project will focus on transparent reports and human-reviewable artifacts rather than pretending that automated verification solves the whole problem.

A fifth failure mode is dual-use concern. Tools that automate biological analysis can lower barriers in ways that are not always beneficial. This pilot will therefore focus on public datasets, computational analysis, transparent logging, and narrow workflows rather than wet-lab protocol generation or anything that would directly enable harmful biological work.

If the project fails, the most likely outcome is a public write-up showing where the system broke down and what kinds of computational biology tasks are still too fragile for current agentic AI. That would still be valuable. It would help distinguish realistic near-term AI-for-science infrastructure from vague claims about autonomous AI scientists.

If the project succeeds, the outcome is a working prototype and evidence that a constrained, verified agent can help more people access computational biology workflows safely and reproducibly.

How much money have you raised in the last 12 months, and from where?

I have raised $1,500 for this project in the last 12 months.