Description
AutoMat packages published scientific statements (claims) together with the inputs, papers, and reference outputs needed for reproduction attempts. Each claim carries authoritative metadata such as claim ID, paper, and DOI. The benchmark was created by sheepy928 and last updated on June 15, 2026.
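As a rough illustration of what one packaged claim might look like in code, the sketch below models a claim as a Python dataclass. The class name `ClaimPackage`, every field name, and the example values are hypothetical; the dataset does not document its columns, so the real layout may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ClaimPackage:
    """Hypothetical shape of one packaged claim; all field names are assumptions."""

    claim_id: str
    paper: str
    doi: str
    statement: str
    inputs: list[str] = field(default_factory=list)
    reference_outputs: list[str] = field(default_factory=list)

    def missing_metadata(self) -> list[str]:
        """Return the names of provenance fields that are empty or whitespace."""
        required = {"claim_id": self.claim_id, "paper": self.paper, "doi": self.doi}
        return [name for name, value in required.items() if not value.strip()]


# Placeholder values only; none of these come from the dataset.
example = ClaimPackage(
    claim_id="claim-0001",
    paper="Example paper title",
    doi="",  # deliberately left empty to show the check
    statement="Placeholder claim text",
)
print(example.missing_metadata())
```

A check like `missing_metadata` is one way to confirm that the provenance fields described above are actually populated before relying on them.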
Use Cases
Benchmarking autonomous agents on their ability to reproduce published scientific findings based on provided claims and materials.
Evaluating the reproducibility of computational materials science literature based on the structured claims.
Training AI systems for scientific reasoning using the packaged inputs and reference outputs.
Analyzing trends in scientific claims and their reproducibility across the materials science domain.
Strengths
Claims are structured with inputs, papers, and reference outputs, providing a complete package for reproduction.
Includes authoritative metadata (claim ID, paper, DOI) for provenance tracking.
Last updated on June 15, 2026, suggesting recent maintenance.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, so it is hard to judge before download whether the dataset is large enough for a given use.
The description metadata is limited; actual data quality requires manual inspection after download.
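Because the row count and field semantics are undocumented, a first pass after download could simply count records and tally which fields appear and with what value types. The following is a minimal sketch, assuming the records have already been loaded as a sequence of Python dictionaries; the function name `summarize_records` and the sample rows are illustrative and not part of the dataset.

```python
from __future__ import annotations

from collections import Counter, defaultdict
from typing import Any, Iterable


def summarize_records(records: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Count rows and tally, per field, how often it appears and with which types."""
    row_count = 0
    types_by_field: dict[str, Counter] = defaultdict(Counter)
    for record in records:
        row_count += 1
        for key, value in record.items():
            types_by_field[key][type(value).__name__] += 1
    fields = {
        name: {"present_in": sum(counts.values()), "types": dict(counts)}
        for name, counts in types_by_field.items()
    }
    return {"row_count": row_count, "fields": fields}


# Made-up rows standing in for downloaded data; real field names may differ.
sample = [
    {"claim_id": "c1", "doi": "10.0000/placeholder", "inputs": ["a.cif"]},
    {"claim_id": "c2", "doi": None},
]
summary = summarize_records(sample)
print(summary["row_count"])
for name, info in summary["fields"].items():
    print(name, info)
```

Fields that appear in only some rows, or that mix types such as `str` and `NoneType`, are natural starting points for the manual inspection mentioned above.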
Provenance
Source
Claims are drawn from the computational materials-science literature.
Collection Method
Likely manually curated and packaged from published papers.
Freshness
Last updated 2026-06-15 10:18:15
License is unknown; users should verify licensing terms before use.