Name: Combined Reasoning and Thinking Dataset for RL and SFT Training
Creator: comoZ
Published: 2025-12-30T15:00:55
Keywords: English, Mathematics, Computer Programming, Text, Reinforcement Learning

Description

comoZ's Reasoning Dataset is a compiled collection for training reasoning models. It contains two subsets: one for reinforcement learning (RL) and one for supervised fine-tuning (SFT). The RL subset provides ground truth pairs annotated with task_type and rubrics for reward modeling. The SFT subset offers instruction-following data with tags to model thinking processes.

Use Cases

Train reward models using the RL subset's ground truth pairs and associated rubrics.
Fine-tune language models for instruction-following using the SFT subset's data with tags.
Develop models for System 2 thinking across tasks categorized by the task_type field.
Augment reasoning training pipelines by combining the RL subset's structured pairs with the SFT subset's process data.
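
As an illustration of working with the task_type field mentioned in the use cases above, here is a minimal sketch that buckets RL records by task type before they are passed to a reward-modeling pipeline. The record layout is an assumption: only task_type and rubrics are named in this description, while the keys prompt and ground_truth and the sample values are hypothetical stand-ins.

```python
from collections import defaultdict

# Hypothetical records. Only `task_type` and `rubrics` are named in the
# dataset description; `prompt` and `ground_truth` are assumed key names.
SAMPLE_RL_RECORDS = [
    {"prompt": "What is 2 + 2?", "ground_truth": "4",
     "task_type": "math", "rubrics": ["final answer is correct"]},
    {"prompt": "What is 10 / 5?", "ground_truth": "2",
     "task_type": "math", "rubrics": ["final answer is correct"]},
    {"prompt": "Reverse a string in Python.", "ground_truth": "s[::-1]",
     "task_type": "coding", "rubrics": ["code runs", "output is reversed"]},
    {"prompt": "Untyped example.", "ground_truth": "n/a", "rubrics": []},
]


def group_by_task_type(records):
    """Bucket records by task_type; records lacking one go under 'unknown'."""
    groups = defaultdict(list)
    for record in records:
        key = record.get("task_type") or "unknown"
        groups[key].append(record)
    return dict(groups)


if __name__ == "__main__":
    grouped = group_by_task_type(SAMPLE_RL_RECORDS)
    for task_type in sorted(grouped):
        print(task_type, len(grouped[task_type]))
```

Grouping first makes it easy to check how balanced the task types are, or to apply a different reward rubric per group, before any training code is written.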

Strengths

Contains two distinct, purpose-built subsets for RL and SFT training methodologies.
Data is compiled from various sources covering reasoning, math, coding, and creative writing domains.
Includes structured elements like task_type, rubrics, and tags to guide model training.

Limitations

Specific scale metrics like row count, column count, and dataset size are unknown.
The compilation nature may introduce inconsistencies in formatting or quality across source datasets.
Lack of sample data or detailed column descriptions makes precise data assessment difficult.
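
Because the scale and column layout are not documented, a first step after obtaining the data could be a quick schema summary. The sketch below is one possible way to do that, assuming the rows have already been loaded as a list of dictionaries. The sample rows and every key other than task_type and rubrics are hypothetical.

```python
from collections import Counter

# Hypothetical rows standing in for a loaded subset. The real column names
# are not documented here; only `task_type` and `rubrics` are named.
SAMPLE_ROWS = [
    {"prompt": "Sort a list.", "ground_truth": "sorted(xs)",
     "task_type": "coding", "rubrics": ["code runs"]},
    {"prompt": "What is 3 * 3?", "ground_truth": "9", "task_type": "math"},
    {"instruction": "Write a haiku.", "response": "Placeholder text.",
     "task_type": "creative_writing"},
]


def summarize(rows):
    """Report row count, all column names seen, and columns missing from some rows."""
    column_counts = Counter()
    for row in rows:
        column_counts.update(row.keys())
    total = len(rows)
    partial = sorted(name for name, seen in column_counts.items() if seen < total)
    return {
        "rows": total,
        "columns": sorted(column_counts),
        "partial_columns": partial,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_ROWS))
```

Columns listed under partial_columns appear in only some rows, which is one sign of the formatting inconsistencies that a compiled collection can carry over from its source datasets.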

Provenance

Source: Hugging Face dataset by author comoZ.
Collection Method: Compiled collection from various reasoning, math, coding, and creative writing datasets.
Time Range: Not specified.
Freshness: Last updated on 2026-01-13.
Geography: Not specified.

The full description is hosted externally; users must visit the provided URL for complete details. License information is unknown.
