ARC-AGI-2 is a dataset for artificial general intelligence (AGI) benchmarking, derived from the ARC-AGI-2 GitHub repository. The original source contains 1000 training examples and 120 test examples. These were flattened to account for files that contain multiple tests, so the number of rows in the dataset may differ from the source counts. The dataset was uploaded by 'sirorezka' and was last updated on Hugging Face in March 2026.
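The flattening step is not documented in detail. As a minimal sketch, assuming each source file follows the public ARC task JSON layout (a `train` list and a `test` list of `input`/`output` grid pairs), a file with several test pairs could be expanded into one row per test pair as shown here. The function name `flatten_task` and the row keys (`task_id`, `test_index`, `test_input`, `test_output`) are hypothetical and are not the uploader's actual schema.

```python
from typing import Any


def flatten_task(task_id: str, task: dict[str, Any]) -> list[dict[str, Any]]:
    """Expand one ARC-style task into one row per test pair (hypothetical schema)."""
    rows = []
    for test_index, test_pair in enumerate(task["test"]):
        rows.append({
            "task_id": task_id,
            "test_index": test_index,
            # Demonstration pairs are shared by every row from the same file.
            "train": task["train"],
            "test_input": test_pair["input"],
            # Some files may omit the expected output, so fall back to None.
            "test_output": test_pair.get("output"),
        })
    return rows


# Toy task with two test pairs, so it yields two rows.
example_task = {
    "train": [{"input": [[0, 1]], "output": [[1, 0]]}],
    "test": [
        {"input": [[1, 1]], "output": [[0, 0]]},
        {"input": [[0, 0]], "output": [[1, 1]]},
    ],
}
rows = flatten_task("toy_task", example_task)
print(len(rows))  # 2
```

Under a scheme like this, the flattened row count can exceed the number of source files, so the counts quoted for the original source should not be read as row counts.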
Use Cases
- Benchmarking AI reasoning capabilities on the ARC-AGI-2 test examples
- Training models for abstract problem-solving on the dataset's reasoning tasks
- Evaluating model generalization to unseen reasoning problems using the separate train and test splits
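For the evaluation use case, a simple scoring helper might look like the sketch below. It assumes predictions are judged by exact grid match, which is how ARC-style tasks are commonly scored; the official protocol (for example, the number of attempts allowed per test) is not described in this card. The function name `exact_match_accuracy` and the toy grids are illustrative only.

```python
Grid = list[list[int]]


def exact_match_accuracy(predictions: list[Grid], targets: list[Grid]) -> float:
    """Fraction of predicted grids identical to the target grid, cell for cell."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    correct = sum(pred == target for pred, target in zip(predictions, targets))
    return correct / len(targets)


# Toy data: the first prediction matches, the second differs in one cell.
targets = [[[1, 0]], [[0, 0], [1, 1]]]
predictions = [[[1, 0]], [[0, 0], [1, 0]]]
score = exact_match_accuracy(predictions, targets)
print(score)  # 0.5
```

A grid that is wrong in a single cell counts as a miss, so this metric gives no partial credit.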
Strengths
- Derived from a known AGI benchmark repository (ARC-AGI-2)
- Source defines a split of 1000 training and 120 test examples
- Last updated on the platform in March 2026
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
- Row count after flattening is not reported, which may limit suitability assessment
- Description metadata is limited; actual data quality requires manual inspection after download
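Because column documentation is absent, a first inspection pass after download could tabulate which columns exist and what value types they hold. The sketch below works on rows already loaded as Python dictionaries; the helper name `summarize_columns` and the sample column names are hypothetical, and the real names must be read from the downloaded files.

```python
from collections import defaultdict
from typing import Any


def summarize_columns(rows: list[dict[str, Any]]) -> dict[str, set[str]]:
    """Map each column name to the set of Python type names observed in it."""
    summary: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            summary[column].add(type(value).__name__)
    return dict(summary)


# Hypothetical rows standing in for the downloaded data.
sample_rows = [
    {"task_id": "a", "test_output": [[0]]},
    {"task_id": "b", "test_output": None},
]
summary = summarize_columns(sample_rows)
print(len(sample_rows), summary)
```

A column that reports more than one type, such as a mix of lists and missing values, is a useful prompt for closer manual inspection.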
Provenance
- Source: https://github.com/arcprize/ARC-AGI-2/
- Collection Method: Flattened from GitHub repository commits
- Time Range: Commit version from May 16, 2025
- Freshness: Last updated 2026-03-13 15:14:31
- Geography: Not specified