OmicBench is a 44-task benchmark anchored in biology for evaluating LLM coding agents on end-to-end multi-omics analysis workflows. Each task specifies a scientific objective and a storage target, with grading based on biology-anchored numerical criteria. The dataset was created by omicverse and last updated on 2026-05-20.
Use Cases
- Benchmarking LLM coding agents on multi-omics workflows across the 44 defined tasks.
- Evaluating AI performance against the scientific objective stated for each biological data analysis task.
- Testing whether agents can manipulate data and write results to the specified storage targets in AnnData/MuData containers.
Strengths
- Contains 44 distinct evaluation tasks.
- Tasks are graded against biology-anchored numerical criteria.
- Focuses on end-to-end multi-omics analysis workflows.
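The pairing of a storage target with a numerical criterion can be pictured with a small sketch. Everything in it is an assumption made for illustration and is not taken from OmicBench itself: the `TaskSpec` fields, the slash-separated target path, the `grade` function, and the "number of distinct cluster labels" criterion are all hypothetical. A nested dict stands in for an AnnData/MuData container so the example runs without extra dependencies.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task record: objective, storage target, numeric bounds."""
    objective: str
    storage_target: str  # assumed format: slash-separated path, e.g. "obs/leiden"
    min_value: float
    max_value: float


def resolve_target(container: Mapping[str, Any], path: str) -> Any:
    """Walk a slash-separated path through nested mappings."""
    node: Any = container
    for key in path.split("/"):
        node = node[key]  # raises KeyError/TypeError if the target is missing
    return node


def grade(container: Mapping[str, Any], spec: TaskSpec) -> bool:
    """Pass only if the target exists and its distinct-label count is in range."""
    try:
        values = resolve_target(container, spec.storage_target)
    except (KeyError, TypeError):
        return False  # the agent never wrote to the storage target
    n_groups = len(set(values))
    return spec.min_value <= n_groups <= spec.max_value


# Illustrative stand-in for a container after an agent has run a clustering step.
mock_container = {"obs": {"leiden": ["0", "1", "1", "2", "0"]}}
spec = TaskSpec(
    objective="Cluster cells into plausible groups",
    storage_target="obs/leiden",
    min_value=2,
    max_value=10,
)
passed = grade(mock_container, spec)
missing = grade({"obs": {}}, spec)
```

In this sketch a run fails both when the result is absent from the storage target and when the value falls outside the bounds, so an agent cannot pass by producing a plausible number in the wrong place.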
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not reported, which makes suitability harder to assess before download.
- Description metadata is limited; actual data quality requires manual inspection after download.
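Because field semantics have to be inferred after download, a first pass over the records might look like the sketch below. It assumes the data has already been loaded into a list of dictionaries. The `summarize_fields` helper and the example field names (`task_id`, `objective`, `storage_target`) are hypothetical and are not documented columns of the dataset.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def summarize_fields(records: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Report the row count and, per field, observed types and non-null counts."""
    types = defaultdict(set)
    non_null = defaultdict(int)
    n_rows = 0
    for record in records:
        n_rows += 1
        for name, value in record.items():
            types[name].add(type(value).__name__)
            if value is not None:
                non_null[name] += 1
    fields = {
        name: {"types": sorted(types[name]), "non_null": non_null[name]}
        for name in types
    }
    return {"n_rows": n_rows, "fields": fields}


# Placeholder records with made-up field names, for illustration only.
example_records = [
    {"task_id": "t01", "objective": "Cluster cells", "storage_target": "obs/leiden"},
    {"task_id": "t02", "objective": "Rank marker genes", "storage_target": None},
]
report = summarize_fields(example_records)
```

A summary like this answers the row-count question directly and flags fields that are sparsely populated or that mix types, which is a reasonable starting point before reading individual records by hand.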
Provenance
- Source: omicverse
- Freshness: last updated 2026-05-20 13:24:41; freshness should be verified.