Colliderbench: AI Benchmark for Reproducing LHC Analyses

Name: Colliderbench: AI Benchmark for Reproducing LHC Analyses
Creator: Dariusfar
Published: 2026-05-12T18:52:38
Keywords: LLM Benchmark, CERN LHC, Benchmark, Text, Particle Physics, Scientific Reasoning, Synthetic

Available on 1 platform

Description

Collider-Bench is an AI benchmark for evaluating LLM agents on their ability to reproduce experimental analyses from the Large Hadron Collider at CERN. The benchmark uses public papers and open scientific software to test multi-step scientific reasoning by autonomous coding agents. The dataset was authored by Dariusfar and last updated on 2026-05-13.

Use Cases

Benchmarking LLM agents on the multi-step scientific reasoning tasks defined in the dataset.
Evaluating AI's ability to read published CMS or ATLAS searches and identify relevant signal regions.
Testing autonomous agents on generating and processing simulated signal events for LHC analyses.
Assessing AI performance on implementing event selection and predicting binned signals from scientific papers.
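
To make the last use case concrete, here is a minimal sketch of what "implementing event selection and predicting binned signals" can look like, together with one possible way to compare a prediction against reference yields. Everything in it is an assumption for illustration: the event fields `met` and `weight`, the cut value, the bin edges, and the per-bin relative error metric are hypothetical and are not taken from the dataset or from any CMS or ATLAS analysis. The page does not describe how the benchmark actually scores agents.

```python
from bisect import bisect_right
from typing import Mapping, Sequence


def binned_yield(
    events: Sequence[Mapping[str, float]],
    min_met: float,
    bin_edges: Sequence[float],
) -> list[float]:
    """Apply a single cut on a hypothetical 'met' field and histogram the survivors.

    Bins are half-open [low, high); events outside the edges are dropped.
    A missing 'weight' field is treated as weight 1.0.
    """
    counts = [0.0] * (len(bin_edges) - 1)
    for event in events:
        met = event["met"]
        if met < min_met:
            continue  # fails the (hypothetical) selection cut
        if not (bin_edges[0] <= met < bin_edges[-1]):
            continue  # outside the histogram range
        index = bisect_right(bin_edges, met) - 1
        counts[index] += event.get("weight", 1.0)
    return counts


def bin_relative_errors(
    predicted: Sequence[float], reference: Sequence[float]
) -> list[float]:
    """Per-bin |predicted - reference| / |reference|.

    Hypothetical comparison metric, not the benchmark's own scoring.
    For an empty reference bin the absolute difference is returned instead.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same number of bins")
    errors = []
    for pred, ref in zip(predicted, reference):
        if ref == 0:
            errors.append(abs(pred))
        else:
            errors.append(abs(pred - ref) / abs(ref))
    return errors
```

A caller would build `events` from whatever simulation output the agent produced, call `binned_yield` with the cut and bin edges read from the paper, and pass the result to `bin_relative_errors` alongside the published yields.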

Strengths

Specifically designed for evaluating autonomous coding agents on complex, multi-step scientific tasks.
Based on real-world public papers and open scientific software from CERN's LHC experiments.
Last updated on 2026-05-13, indicating recent maintenance.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes it harder to assess suitability before download.
Description metadata is limited; actual data quality requires manual inspection after download.
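
Because row count and field semantics are undocumented, a first inspection pass after download might look like the sketch below. It assumes, without confirmation from this page, that the data is delivered as JSON Lines with one JSON object per line; the file name in the usage note is hypothetical. If the dataset ships in another format, the same idea applies with a different reader.

```python
import json
from collections import defaultdict
from typing import Iterable


def summarize_records(lines: Iterable[str]) -> tuple[int, dict[str, list[str]]]:
    """Count records and collect the Python type names seen for each field.

    Assumes each non-blank line is a JSON object (JSON Lines); this format
    is an assumption, not something the dataset page states.
    """
    row_count = 0
    field_types: dict[str, set[str]] = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        row_count += 1
        for key, value in record.items():
            field_types[key].add(type(value).__name__)
    return row_count, {key: sorted(types) for key, types in field_types.items()}
```

With a downloaded file, this could be called as `with open("colliderbench.jsonl", encoding="utf-8") as handle: print(summarize_records(handle))`. A field that shows more than one type, or `NoneType`, is a good candidate for closer manual inspection.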

Provenance

Source: Dariusfar
Collection Method: Likely compiled from public CMS and ATLAS papers and associated open scientific software.
Freshness: Last updated 2026-05-13 04:28:32; freshness should be verified.

Quality Score

Overall: D37
Description: 39
Source: 36
Reputation: 43
Access: 26

Community

145 downloads
1 like
0 views

Dataset Info

Author: Dariusfar
Created: May 12, 2026
Updated: May 13, 2026
Last synced: May 21, 2026
