ViRL39K contains 38,870 verifiable question-answer pairs for Vision-Language Reinforcement Learning training, released by TIGER-Lab in April 2025. It aggregates data from seven specialized sources, including LLaVA-OneVision, MM-Math, and DeepScaleR, and refines it through cleaning, reformatting, and verification.
The dataset serves as the foundation for the VL-Rethinker model. See arXiv:2504.08837 for details of the verification logic used to ensure QA accuracy for RL training.
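To illustrate what "verifiable" can mean in an RL setting, here is a minimal sketch of a rule-based reward that compares a model's answer with a reference answer. It is an assumption for illustration only, not the verification logic from the paper: the function names `normalize_answer` and `verifiable_reward`, the `\boxed{...}` convention, and the exact-match rule are all hypothetical.

```python
import re


def normalize_answer(text: str) -> str:
    # Hypothetical normalization: pull the content out of a LaTeX boxed
    # wrapper if one is present, then trim, drop trailing periods, lowercase,
    # and collapse runs of whitespace.
    text = text.strip()
    match = re.search(r"\\boxed\{([^{}]*)\}", text)
    if match:
        text = match.group(1)
    text = text.strip().rstrip(".").lower()
    return re.sub(r"\s+", " ", text)


def verifiable_reward(prediction: str, reference: str) -> float:
    # Binary reward: 1.0 when the normalized strings match exactly, else 0.0.
    # This exact-match rule is an assumption, not the dataset's own checker.
    if normalize_answer(prediction) == normalize_answer(reference):
        return 1.0
    return 0.0


if __name__ == "__main__":
    print(verifiable_reward("The answer is \\boxed{42}.", "42"))
    print(verifiable_reward("7", "8"))
```

A question-answer pair is only useful for this kind of reward if its reference answer can be checked mechanically, which is why ambiguous or free-form answers are typically filtered out or reformatted before RL training.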