Description
OR-Space is a full-lifecycle workspace benchmark for evaluating LLM agents on industrial optimization tasks. The benchmark, created by Chenyu-Zhou, structures each instance with separate files for business requirements, parameters, source code, and solver artifacts. It was last updated on May 18, 2026.
Use Cases
Benchmarking LLM agent performance on multi-step optimization tasks set in structured, multi-file workspaces.
Developing agents that interact with executable, multi-file workspaces rather than generating one-shot solutions.
Studying how LLMs recover and maintain optimization models through file system interaction.
Strengths
The benchmark covers the full lifecycle of operations research work, so agents must interact with the workspace rather than return a single answer.
Each instance contains multiple structured file types: business requirements, parameters, source code, and solver artifacts.
Limitations
The number of instances and rows, and the specific file formats, are not documented.
Column-level documentation is absent; field semantics must be inferred after download.
The data, as hosted on Hugging Face, may reflect biases inherent in the benchmark's construction methodology.
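Because the instance count and file formats are not documented, a first step after download is usually to inventory what is actually on disk. The following is a minimal sketch of that inspection, not part of the benchmark itself. It assumes the dataset has already been downloaded to a local directory and that each top-level subdirectory corresponds to one instance; both are assumptions, since the actual layout is unknown. The function name summarize_workspace is hypothetical.

```python
from collections import Counter
from pathlib import Path


def summarize_workspace(root: Path) -> dict[str, Counter]:
    """Count file extensions under each top-level directory of `root`.

    Assumption: each top-level directory is one benchmark instance.
    The real OR-Space layout is undocumented and may differ.
    """
    summary: dict[str, Counter] = {}
    for instance_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts: Counter = Counter()
        # Walk the instance recursively and tally by lowercased suffix.
        for path in instance_dir.rglob("*"):
            if path.is_file():
                counts[path.suffix.lower() or "<no extension>"] += 1
        summary[instance_dir.name] = counts
    return summary
```

Calling summarize_workspace(Path("path/to/download")) and printing the result gives a quick view of which file types appear in each presumed instance, which helps when inferring which files hold requirements, parameters, code, and solver artifacts.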
Provenance
Source
Hugging Face dataset by Chenyu-Zhou.
Collection Method
Likely created as a benchmark for evaluating LLM agents.
Freshness
Last updated 2026-05-18 07:37:41.
License is unknown; users must verify permissions before use.