This dataset provides reasoning traces generated by Gemini-2.5-pro for the Robo2VLM-1 visual question answering benchmark. Each trace is a logical, step-by-step explanation that justifies the correct answer to a robotic manipulation question drawn from diverse, in-the-wild environments.
Use Cases
- Train vision-language models to produce chain-of-thought reasoning using the provided reasoning traces.
- Improve the interpretability of robotic VQA systems by supervising the model on the provided logical steps.
- Distill reasoning capabilities from large frontier models into smaller, task-specific robotics models for real-time manipulation.
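As a rough illustration of the first use case, the sketch below turns one record into a prompt/target pair for chain-of-thought fine-tuning. The field names (`question`, `choices`, `reasoning`, `answer`), the `TraceRecord` type, and the sample record are assumptions made for illustration, not the dataset's documented schema or contents. The image input is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TraceRecord:
    # Hypothetical field names; check the actual dataset schema before use.
    question: str
    choices: List[str]
    reasoning: str
    answer: str


def to_cot_example(record: TraceRecord) -> Dict[str, str]:
    """Build a prompt/target pair where the target holds the trace, then the answer."""
    # Label the choices A, B, C, ... in order.
    options = "\n".join(
        f"{chr(ord('A') + i)}. {choice}" for i, choice in enumerate(record.choices)
    )
    prompt = (
        f"Question: {record.question}\n{options}\n"
        "Explain your reasoning, then answer."
    )
    # Placing the reasoning before the answer supervises the model on the steps.
    target = f"Reasoning: {record.reasoning.strip()}\nAnswer: {record.answer}"
    return {"prompt": prompt, "target": target}


# Made-up record for illustration only; it is not taken from the dataset.
example = to_cot_example(
    TraceRecord(
        question="Is the gripper closed?",
        choices=["Yes", "No"],
        reasoning=" The fingers are touching the mug handle. ",
        answer="A",
    )
)
print(example["prompt"])
print(example["target"])
```

The same pairs could serve the distillation use case, with a smaller model trained on the `target` text produced by the larger one.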
Strengths
- Features reasoning traces generated by the Gemini-2.5-pro model to support correct VQA answers.
- Built upon the Robo2VLM-1 dataset, which focuses on large-scale, in-the-wild robot manipulation.
- Includes visual question answering pairs, each accompanied by a step-by-step logical justification for the robotic action.