Skywork-OR1-RL-Data is a reinforcement learning training dataset containing between 100,000 and 1,000,000 text records released by Skywork in April 2025. The collection features problems categorized by difficulty levels ranging from 0 to 16, calibrated against specific DeepSeek-R1-Distill-Qwen model variants.
Difficulty filtering is specific to DeepSeek-R1-Distill-Qwen-1.5B, 7B, and 32B; refer to arXiv:2505.22312 for the full methodology.
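As a rough illustration of per-model difficulty filtering, the sketch below keeps only the records whose difficulty for a chosen model falls inside a given range. The record layout (a `model_difficulty` mapping keyed by model name), the `filter_by_difficulty` helper, and the sample records are all hypothetical and are not taken from the dataset's schema; only the 0 to 16 range and the three model names come from the description above.

```python
from typing import Any

# Model names from the dataset description; used as keys in this sketch.
MODEL_KEYS = (
    "DeepSeek-R1-Distill-Qwen-1.5B",
    "DeepSeek-R1-Distill-Qwen-7B",
    "DeepSeek-R1-Distill-Qwen-32B",
)


def filter_by_difficulty(
    records: list[dict[str, Any]], model: str, lo: int, hi: int
) -> list[dict[str, Any]]:
    """Keep records whose difficulty for `model` lies in [lo, hi].

    Assumes (hypothetically) that each record carries a
    `model_difficulty` dict mapping a model name to an int in 0..16.
    """
    if model not in MODEL_KEYS:
        raise ValueError(f"unknown model: {model}")
    if not 0 <= lo <= hi <= 16:
        raise ValueError("expected 0 <= lo <= hi <= 16")
    return [r for r in records if lo <= r["model_difficulty"][model] <= hi]


# Made-up records for illustration only.
sample = [
    {"id": "a", "model_difficulty": dict(zip(MODEL_KEYS, (16, 12, 3)))},
    {"id": "b", "model_difficulty": dict(zip(MODEL_KEYS, (4, 0, 0)))},
    {"id": "c", "model_difficulty": dict(zip(MODEL_KEYS, (16, 16, 16)))},
]

# Drop problems at either extreme of the scale for the 7B model.
selected = filter_by_difficulty(sample, "DeepSeek-R1-Distill-Qwen-7B", 1, 15)
selected_ids = [r["id"] for r in selected]
```

Because the levels are calibrated per model, the same record can pass the filter for one variant and fail it for another, so the model name is an explicit argument rather than a default.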