ChartVerse-RL-40K is a curated dataset of the most challenging chart reasoning samples for Reinforcement Learning, developed by opendatalab. It contains samples with the highest failure rates, which strong Vision-Language Models struggle with but can still solve occasionally, providing a strong learning signal for RL training. The dataset was last updated on 2026-01-21.
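The description above does not spell out the exact selection procedure. As a rough illustration only, the sketch below shows one way that "high failure rate but still occasionally solvable" filtering could be expressed. The function names, the sample ids, the per-attempt correctness lists, and the cutoff `k` are all hypothetical and are not taken from the ChartVerse project.

```python
from typing import Dict, List


def failure_rate(outcomes: List[bool]) -> float:
    """Fraction of attempts that were wrong (True means a correct attempt)."""
    if not outcomes:
        raise ValueError("need at least one attempt")
    return 1.0 - sum(outcomes) / len(outcomes)


def select_hard_but_solvable(attempts: Dict[str, List[bool]], k: int) -> List[str]:
    """Return up to k sample ids with the highest failure rate below 1.0.

    Samples that were never solved are dropped, on the assumption that they
    would provide no positive reward signal during RL training.
    """
    rates = {sid: failure_rate(outs) for sid, outs in attempts.items()}
    solvable = [sid for sid, rate in rates.items() if rate < 1.0]
    # Sort by descending failure rate; ties are broken by id for determinism.
    solvable.sort(key=lambda sid: (-rates[sid], sid))
    return solvable[:k]


# Hypothetical rollout results: sample id -> per-attempt correctness.
attempts = {
    "chart_a": [False, False, False, True],   # failure rate 0.75
    "chart_b": [True, True, True, True],      # failure rate 0.0
    "chart_c": [False, False, False, False],  # failure rate 1.0, never solved
    "chart_d": [False, True, False, True],    # failure rate 0.5
}
hard = select_hard_but_solvable(attempts, k=2)
print(hard)  # ['chart_a', 'chart_d']
```

Here `chart_c` is excluded because no attempt succeeded, while `chart_a` and `chart_d` are kept as the hardest samples that were solved at least once.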
Use Cases
- Training reinforcement learning agents for chart understanding on high-failure-rate samples.
- Probing the reasoning limits of Vision-Language Models with difficult chart-based questions.
- Fine-tuning models for challenging visual reasoning tasks using curated difficult samples.
Strengths
- Samples are curated for high difficulty, providing a strong learning signal for RL training.
- The dataset is part of the opendatalab/ChartVerse project, which suggests it was developed within a larger, structured effort rather than as a standalone release.
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- The exact row count and file formats are not stated (the name suggests about 40K samples), which makes it harder to assess suitability before download.
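Because column-level documentation is absent, a first inspection pass after download might look like the sketch below. It assumes the rows have already been loaded as Python dictionaries by whatever means; the column names `image`, `question`, and `answer` and the sample values are hypothetical placeholders, not the dataset's actual schema.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_columns(rows: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Report observed value types and missing-value counts per column."""
    rows = list(rows)
    # Use the union of keys so columns absent from some rows are still listed.
    columns = sorted({key for row in rows for key in row})
    types = defaultdict(set)
    missing = defaultdict(int)
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is None:
                missing[col] += 1
            else:
                types[col].add(type(value).__name__)
    return {
        col: {"types": sorted(types[col]), "missing": missing[col], "rows": len(rows)}
        for col in columns
    }


# Hypothetical rows standing in for a few downloaded samples.
sample_rows = [
    {"image": "img_0.png", "question": "Which bar is tallest?", "answer": "B"},
    {"image": "img_1.png", "question": "What is the 2020 value?", "answer": None},
]
summary = summarize_columns(sample_rows)
for column, info in summary.items():
    print(column, info)
```

A summary like this only reveals types and gaps; what each field actually means still has to be checked by reading a handful of samples.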
Provenance
- Source: opendatalab/ChartVerse project on Hugging Face.
- Collection Method: Curated from samples with the highest failure rates.
- Time Range: Not specified.
- Freshness: Last updated 2026-01-21 03:26:15; freshness should be verified.
- Geography: Not specified.