OR-Space: A Benchmark for LLM Agents in Industrial Optimization Workspaces

Name: OR-Space: A Benchmark for LLM Agents in Industrial Optimization Workspaces
Creator: Chenyu-Zhou
Published: 2026-05-07T10:19:01
Keywords: LLM Agents, Operations Research, Benchmark, Optimization, Workspace, Multimodal

Description

OR-Space is a full-lifecycle workspace benchmark for evaluating LLM agents on industrial optimization tasks. Created by Chenyu-Zhou, it structures each instance as a workspace with separate files for business requirements, parameters, source code, and solver artifacts. It was last updated on May 18, 2026.

Use Cases

Benchmarking LLM agent performance on multi-step optimization tasks set in multi-file workspaces.
Developing agents that interact with executable, multi-file workspaces rather than generating one-shot solutions.
Studying how LLMs recover and maintain optimization models through file system interaction.
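
As an illustration of the file system interaction these use cases refer to, the following is a minimal sketch of a file-access layer that an agent harness could expose as tools. It is not the benchmark's own harness: the `Workspace` class and its method names are hypothetical, and the sketch assumes Python 3.9 or later and that an instance has already been downloaded to a local directory.

```python
from pathlib import Path


class Workspace:
    """Hypothetical file-access layer for an agent; not part of OR-Space."""

    def __init__(self, root):
        self.root = Path(root).resolve()

    def _resolve(self, rel_path):
        # Reject paths that escape the workspace root (e.g. "../outside.txt").
        path = (self.root / rel_path).resolve()
        if not path.is_relative_to(self.root):
            raise ValueError(f"path escapes workspace: {rel_path}")
        return path

    def list_files(self):
        # Relative POSIX-style paths of all regular files, sorted for determinism.
        return sorted(
            p.relative_to(self.root).as_posix()
            for p in self.root.rglob("*")
            if p.is_file()
        )

    def read_text(self, rel_path):
        return self._resolve(rel_path).read_text(encoding="utf-8")

    def write_text(self, rel_path, content):
        path = self._resolve(rel_path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")
```

Confining every read and write to one root directory keeps an agent's edits inside the instance it is working on, which makes multi-step runs easier to inspect and reset.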

Strengths

The benchmark covers the full lifecycle of operations research work, so agents must interact with the workspace rather than answer in a single pass.
Each instance contains multiple structured file types: requirements, parameters, code, and solver artifacts.

Limitations

The number of instances and rows, and the file formats used, are not documented.
Column-level documentation is absent; field semantics must be inferred after download.
Data may reflect biases inherent in the benchmark's construction methodology.
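
Because the file formats are undocumented, a first step after download could be to inventory what an instance actually contains. The sketch below counts files by extension; `inventory_by_suffix` and the `instance_dir` argument are hypothetical names, and the directory layout is an assumption, since the card does not specify it.

```python
from collections import Counter
from pathlib import Path


def inventory_by_suffix(instance_dir):
    """Count files under instance_dir by lowercase file extension."""
    counts = Counter()
    for path in Path(instance_dir).rglob("*"):
        if path.is_file():
            # Files with no extension are grouped under "(none)".
            counts[path.suffix.lower() or "(none)"] += 1
    return dict(counts)
```

The resulting counts only show which formats are present; the semantics of each file still have to be read from its contents.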

Provenance

Source: Hugging Face dataset by Chenyu-Zhou.
Collection Method: Likely created as a benchmark for evaluating LLM agents.
Freshness: Last updated 2026-05-18 07:37:41.

License is unknown; users must verify permissions before use.

Quality Score

Overall: D37
Description: 39
Source: 36
Reputation: 43
Access: 26

Community

Downloads: 74
Likes: 2
Views: 0

Dataset Info

Author: Chenyu-Zhou
Created: May 7, 2026
Updated: May 18, 2026
Last synced: Jun 6, 2026
