A 2026 benchmark from NVIDIA provides a sandbox environment for evaluating multi-step agentic task execution. It contains 26 tools and 690 tasks simulating common business activities like sending emails and scheduling meetings. The dataset is part of the NVIDIA NeMo Gym framework for building reinforcement learning environments.
The full description and any license terms are on the Hugging Face dataset page; review them before use.