Available on 1 platform
Metis-RL contains approximately 5,200 multimodal prompts designed for training agentic models. Accio-Lab created the dataset for reinforcement learning with Hierarchical Decoupled Policy Optimization, and it was last updated in April 2026. It focuses on cultivating meta-cognitive tool use across three task types: perception, search, and mathematical or logical reasoning.
The license terms are unknown and should be verified before use.