Name: ChartVerse-SFT-600K: Large-Scale Chart Reasoning Dataset with Chain-of-Thought Annotations
Creator: opendatalab
Published: 2026-01-19T07:22:54
Keywords: Chart Reasoning, Chain of Thought, Vision Language, SFT, Large Scale, Multimodal

Description

ChartVerse-SFT-600K contains 600,000 chart reasoning samples, each annotated with a Chain-of-Thought (CoT) rationale. The dataset was developed by opendatalab as part of the ChartVerse project and was last updated on January 23, 2026. Trivial samples are filtered out, so each retained entry is intended to provide a meaningful learning signal for model training.

Use Cases

Training models for chart-based question answering.
Fine-tuning models to generate step-by-step (Chain-of-Thought) reasoning, using the CoT annotations as supervision.
Evaluating models on non-trivial visual reasoning problems, using the filtered samples.
Developing instruction-following capabilities in multimodal models through supervised fine-tuning (SFT).

Strengths

600,000 samples provide a large-scale resource for training.
Chain-of-Thought annotations are included for each sample.
Samples are retained only if their failure rate r is greater than 0 (r > 0), which excludes trivial questions that are never answered incorrectly.
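
The r > 0 filter can be illustrated with a minimal sketch. It assumes the failure rate is the fraction of incorrect attempts on a sample; the card does not say how attempts are produced or stored, so the "attempts" field and the helper names below are hypothetical.

```python
from typing import Dict, List


def failure_rate(outcomes: List[bool]) -> float:
    """Fraction of attempts that were wrong (True means a correct attempt)."""
    if not outcomes:
        raise ValueError("need at least one attempt to compute a failure rate")
    return sum(1 for ok in outcomes if not ok) / len(outcomes)


def keep_non_trivial(samples: List[Dict]) -> List[Dict]:
    """Keep samples whose failure rate r is strictly greater than 0."""
    return [s for s in samples if failure_rate(s["attempts"]) > 0]


# Hypothetical records: "attempts" holds per-attempt correctness flags.
samples = [
    {"id": "a", "attempts": [True, True, True, True]},      # r = 0.0, dropped
    {"id": "b", "attempts": [True, False, True, True]},     # r = 0.25, kept
    {"id": "c", "attempts": [False, False, False, False]},  # r = 1.0, kept
]

kept = keep_non_trivial(samples)
print([s["id"] for s in kept])  # prints ['b', 'c']
```

Under this reading, a threshold of r > 0 removes only samples that are always answered correctly; samples that are always answered incorrectly still pass the filter.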

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is not confirmed in the available metadata; the figure of 600,000 comes from the dataset name and description and should be checked after download.
Data may reflect bias inherent to the source collection and annotation process.
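
Because field semantics have to be inferred after download, a first step is often to list which fields appear and what value types they hold. The sketch below does this for any iterable of row dictionaries; the example rows and their field names are hypothetical, since the card does not document the actual columns.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def summarize_fields(rows: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each field name to the sorted Python type names seen for it."""
    seen = defaultdict(set)
    for row in rows:
        for key, value in row.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in sorted(seen.items())}


# Hypothetical rows; the real field names are not documented in the card.
rows = [
    {"question": "Which bar is tallest?", "answer": "B", "cot": "Compare heights..."},
    {"question": "What is the 2020 total?", "answer": 42, "cot": None},
]

for name, types in summarize_fields(rows).items():
    print(name, types)
```

Fields that report more than one type, or that include NoneType, are good candidates for closer manual inspection before training.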

Provenance

Source: opendatalab/ChartVerse project on Hugging Face.
Collection Method: Developed as part of the ChartVerse project; the card defers details of the method to the project page.
Time Range: Not specified
Freshness: Last updated 2026-01-23 03:20:07; freshness should be verified.
Geography: Not specified
