Oro Shoppingbench Frontier: 1,800 RL Rollouts for GPT-OSS-120B | DataSalon

Education

Name: Oro Shoppingbench Frontier: 1,800 RL Rollouts for GPT-OSS-120B
Creator: Jarrodbarnes
Published: 2026-02-27T18:06:59
Keywords: Size Categories: 1K<n<10K, Library: polars, OPTIMIZED-PARQUET, Modality: text, Modality: tabular, Library: mlcroissant, Library: datasets, Library: pandas, Parquet, Frontier Profile, Region: US, Reinforcement Learning, Shoppingbench, License: MIT, Dynamical, RL Post Training

Available on 1 platform

Description

This dataset contains 1,800 rollout records that profile the frontier difficulty of 900 ShoppingBench tasks for the GPT-OSS-120B model. Jarrodbarnes created it with the Dynamical environment factory in February 2026. It also includes 100 teacher-guided mutations designed to maximize the learning signal for reinforcement learning. The data identifies the specific 'hillclimbable' tasks where model improvement is most likely.

Use Cases

Benchmarking GPT-OSS-120B performance across 900 ShoppingBench tasks
Optimizing RL post-training using the 23 hillclimbable mutations
Analyzing frontier difficulty profiles to maximize learning signal per compute unit
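
The frontier-difficulty analysis in the last use case can be sketched as grouping rollouts by task and keeping the tasks the model solves only some of the time. This is a minimal sketch under stated assumptions: the field names `task_id` and `success` are hypothetical (this listing does not document the schema), and treating mixed-outcome tasks as the "frontier" is an illustrative reading, not the dataset author's stated method.

```python
from collections import defaultdict


def pass_rates(records, task_key="task_id", success_key="success"):
    """Return {task: fraction of rollouts marked successful}.

    `task_key` and `success_key` are hypothetical field names; replace
    them with the real column names once the schema is known.
    """
    totals = defaultdict(int)
    wins = defaultdict(int)
    for rec in records:
        task = rec[task_key]
        totals[task] += 1
        if rec[success_key]:
            wins[task] += 1
    return {task: wins[task] / totals[task] for task in totals}


def frontier_tasks(rates, low=0.0, high=1.0):
    """Return the sorted tasks whose pass rate lies strictly between low and high."""
    return sorted(task for task, rate in rates.items() if low < rate < high)


# Toy rows standing in for the real data. With pandas installed, real rows
# could be obtained via pandas.read_parquet(path).to_dict("records").
toy = [
    {"task_id": "a", "success": True},
    {"task_id": "a", "success": False},
    {"task_id": "b", "success": True},
    {"task_id": "b", "success": True},
    {"task_id": "c", "success": False},
    {"task_id": "c", "success": False},
]

rates = pass_rates(toy)
print(frontier_tasks(rates))  # tasks with mixed outcomes
```

Tasks that always pass or always fail give no contrast between rollouts, which is why the sketch filters them out before any further analysis.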

Strengths

1,800 baseline rollouts across 900 distinct tasks
Includes 100 teacher-guided mutations for RL profiling
Uses the large GPT-OSS-120B model for difficulty assessment

Limitations

Small sample size of only 23 admitted hillclimbable mutations
Difficulty profiles are specific to the GPT-OSS-120B model architecture
Limited total record count of 1,800 rows
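
To make the sample-size limitation concrete: 1,800 rollouts over 900 tasks averages two rollouts per task. The sketch below uses a Wilson score interval, a standard statistical technique chosen here for illustration and not taken from the dataset, to show how loosely two rollouts pin down a per-task success rate. The even split of rollouts across tasks is an assumption.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Average rollouts per task, assuming an even split (an assumption).
rollouts_per_task = 1800 // 900

# Interval for a task solved in one of its two rollouts.
low, high = wilson_interval(1, rollouts_per_task)
print(f"{low:.3f} to {high:.3f}")
```

The interval for one success in two rollouts spans most of the range from 0 to 1, so per-task difficulty labels derived from this data are best treated as coarse.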

Provenance

Source: Jarrodbarnes via Dynamical environment factory
Collection Method: synthetic
Freshness: Last updated February 2026.

Released under the MIT license; requires Parquet-compatible tools like Polars or Pandas for processing.

Quality Score

D39

Community

27 downloads

1 like

0 views

Dataset Info

Author: Jarrodbarnes
Created: Feb 27, 2026
Updated: Feb 27, 2026
