This dataset contains 1,800 rollout records that profile the frontier difficulty of 900 ShoppingBench tasks using the GPT-OSS-120B model. It was created by Jarrodbarnes via the Dynamical environment factory in February 2026 and includes 100 teacher-guided mutations designed to maximize the learning signal for reinforcement learning. The data identifies specific 'hillclimbable' tasks, where model improvement is most likely.
Released under the MIT license. Processing requires a Parquet-compatible tool such as Polars or pandas.
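As a rough illustration of how such rollout records might be processed, the sketch below loads a Parquet file and computes a per-task pass rate. Everything specific in it is an assumption made for illustration and not the dataset's documented schema: the column names `task_id` and `reward`, the file path passed to `load_rollouts`, and the working definition of a 'hillclimbable' task as one the model solves in some rollouts but not all.

```python
from collections import defaultdict


def load_rollouts(path):
    """Read a Parquet file into a list of row dicts.

    Needs pandas plus a Parquet engine (pyarrow or fastparquet).
    The path is supplied by the caller; no real file name is assumed here.
    """
    import pandas as pd  # imported lazily so the rest of the sketch runs without pandas

    return pd.read_parquet(path).to_dict(orient="records")


def pass_rate_by_task(records, task_key="task_id", reward_key="reward"):
    """Return {task: fraction of rollouts with a positive reward}.

    `task_id` and `reward` are hypothetical column names.
    """
    totals = defaultdict(lambda: [0, 0])  # task -> [successes, attempts]
    for row in records:
        counts = totals[row[task_key]]
        counts[0] += 1 if row[reward_key] > 0 else 0
        counts[1] += 1
    return {task: wins / tries for task, (wins, tries) in totals.items()}


def hillclimbable(rates, low=0.0, high=1.0):
    """Tasks whose pass rate is strictly between `low` and `high`.

    This threshold rule is an assumed stand-in for the dataset's own criterion.
    """
    return sorted(task for task, rate in rates.items() if low < rate < high)


# Toy records standing in for real rollouts (made-up values).
toy = [
    {"task_id": "a", "reward": 1.0},
    {"task_id": "a", "reward": 0.0},
    {"task_id": "b", "reward": 0.0},
    {"task_id": "b", "reward": 0.0},
    {"task_id": "c", "reward": 1.0},
    {"task_id": "c", "reward": 1.0},
]
rates = pass_rate_by_task(toy)
candidates = hillclimbable(rates)
```

With real data, `load_rollouts` would replace the toy list, and the key arguments would be set to whatever columns the Parquet schema actually exposes.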