DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


FineProofs RL: 5,227 Olympiad Math Problems with RL Rubrics | DataSalon

Mathematics & Statistics

Name: FineProofs RL: 5,227 Olympiad Math Problems with RL Rubrics
Creator: lm-provers
Published: 2026-02-12T08:58:04
Keywords: size category 1K<n<10K; formats: Parquet, optimized-parquet; modalities: text, tabular; libraries: polars, mlcroissant, datasets, pandas; region: US; arXiv: 2511.01846

Available on 1 platform

Description

FineProofs RL contains 5,227 Olympiad-level math problems with grading rubrics for reinforcement learning (RL) training, released by lm-provers in February 2026. The problems come from international competitions and Art of Problem Solving (AoPS). They are annotated with rubrics generated by Gemini-3-Pro and with per-rollout scores from Qwen/Qwen3-4B-Thinking-2507.

Use Cases

Training reward models using the 0-7 point rubrics
Fine-tuning LLMs for mathematical reasoning using per-rollout scores
Benchmarking AI performance on international Olympiad-level problems
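As a rough illustration of the fine-tuning use case, the sketch below maps 0-7 rubric scores to rewards in [0, 1] and then centers them within one problem's group of rollouts. The group-relative scaling is a common choice in RL fine-tuning and is an assumption here, not a method this listing attributes to the dataset authors. The function names and the example scores are hypothetical.

```python
from statistics import mean, pstdev

MAX_SCORE = 7  # top of the 0-7 rubric scale described in the listing


def normalize_scores(scores, max_score=MAX_SCORE):
    """Map 0-7 rubric scores to rewards in [0, 1]."""
    for s in scores:
        if not 0 <= s <= max_score:
            raise ValueError(f"score {s} is outside the 0-{max_score} scale")
    return [s / max_score for s in scores]


def group_relative_advantages(rewards, eps=1e-6):
    """Center and scale rewards within one problem's group of rollouts."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std dev over the group
    # eps keeps the division safe when every rollout has the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical per-rollout scores for one problem (not taken from the dataset).
rollout_scores = [0, 7, 3, 7]
rewards = normalize_scores(rollout_scores)
advantages = group_relative_advantages(rewards)
```

Rollouts that score above the group mean receive positive advantages and those below receive negative ones, so the signal depends on how a rollout compares with its siblings rather than on the raw score alone.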

Strengths

5,227 problems from international Olympiads and AoPS
Rubrics grade each problem on a fixed 0-7 point scale
Includes per-rollout scores from Qwen/Qwen3-4B-Thinking-2507
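One way to use the per-rollout scores is to summarize them per problem, for example to see which problems the scoring model's rollouts rarely solve. The sketch below is a minimal version under stated assumptions: the record layout and the field names `problem_id` and `score` are hypothetical, since this listing does not document the dataset's column names, and the sample records are made up.

```python
from collections import defaultdict


def summarize_by_problem(records, max_score=7):
    """Aggregate per-rollout scores into per-problem statistics."""
    grouped = defaultdict(list)
    for rec in records:
        # "problem_id" and "score" are assumed field names, not confirmed columns
        grouped[rec["problem_id"]].append(rec["score"])

    summary = {}
    for pid, scores in grouped.items():
        n = len(scores)
        summary[pid] = {
            "n_rollouts": n,
            "mean_score": sum(scores) / n,
            # share of rollouts that reached the top of the 0-7 scale
            "full_marks_rate": sum(s == max_score for s in scores) / n,
        }
    return summary


# Made-up records for illustration only.
records = [
    {"problem_id": "p1", "score": 7},
    {"problem_id": "p1", "score": 0},
    {"problem_id": "p2", "score": 3},
    {"problem_id": "p2", "score": 5},
]
summary = summarize_by_problem(records)
```

A low mean score with a nonzero full-marks rate suggests a problem that is hard but not out of reach, which can be a useful filter when selecting RL training problems.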

Limitations

Rubrics and scores are synthetic/model-generated rather than human-verified
Limited to 5,227 records, which may be small for some large-scale RL tasks

Provenance

Source: International Olympiad competitions and Art of Problem Solving (AoPS)
Collection Method: Sourced from competitions and AoPS, with synthetic annotations generated by Gemini-3-Pro and Qwen models
Freshness: Last updated February 2026.
Geography: Global

Associated with arXiv paper 2511.01846; rubrics and scores are model-generated.

Quality Score

D39

Community

181 downloads

5 likes

0 views

Dataset Info

Author: lm-provers
Created: Feb 12, 2026
Updated: Feb 14, 2026
Last synced: Apr 13, 2026
