Description
NLCO is a benchmark dataset containing 6,450 total samples across 43 tasks for evaluating large language models on natural-language combinatorial optimization problems. The dataset, created by summer142857jiang and last updated in April 2026, is organized into 129 CSV files with 50 samples per file and three difficulty tiers: Set-S, Set-M, and Set-L.
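The figures in the description are mutually consistent, which a few lines of arithmetic can confirm. The sketch below assumes one CSV file per task per difficulty tier. The source does not state this layout, but it is the reading under which 43 tasks and three tiers yield 129 files.

```python
# Figures quoted in the description.
NUM_TASKS = 43
NUM_TIERS = 3            # Set-S, Set-M, Set-L
SAMPLES_PER_FILE = 50

# Assumption (not confirmed by the source): one CSV file per task per tier.
expected_files = NUM_TASKS * NUM_TIERS
expected_samples = expected_files * SAMPLES_PER_FILE
samples_per_task = NUM_TIERS * SAMPLES_PER_FILE

print(expected_files, expected_samples, samples_per_task)  # 129 6450 150
```

Under that assumption each task would contribute 150 samples, 50 per tier; this should be checked against the actual files.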
Use Cases
Benchmarking LLM performance across the 43 combinatorial optimization tasks.
Evaluating LLM reasoning across the three difficulty tiers (Set-S, Set-M, Set-L).
Analyzing model performance on constrained problem-solving, in line with the dataset's focus on a 'constrained world'.
Comparing LLM capabilities on structured reasoning tasks under the benchmark's evaluation design.
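A per-tier comparison could be aggregated roughly as follows. This is a minimal sketch, not the benchmark's own scoring: it assumes each evaluated sample has been reduced to a hypothetical (tier, is_correct) pair, whereas the actual metric (for example feasibility or an optimality gap) is not described in the source.

```python
from collections import defaultdict


def accuracy_by_tier(results):
    """Map each tier to the fraction of its samples marked correct.

    `results` is an iterable of (tier, is_correct) pairs. This record
    shape is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for tier, is_correct in results:
        totals[tier] += 1
        correct[tier] += int(bool(is_correct))
    return {tier: correct[tier] / totals[tier] for tier in totals}


# Made-up outcomes, purely to show the call shape.
example = [("Set-S", True), ("Set-S", False), ("Set-M", True), ("Set-L", False)]
print(accuracy_by_tier(example))
```

Keeping the tier as the grouping key makes it easy to see whether performance degrades from Set-S to Set-L.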
Strengths
Contains 6,450 total samples for evaluation.
Provides 43 distinct reasoning tasks.
Organizes problems into three defined difficulty tiers (Set-S, Set-M, Set-L).
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
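Row count per task is not stated explicitly; the totals suggest 150 samples per task (50 per tier) if each task has one file per tier, but this should be verified after download.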
Description metadata is limited; actual data quality requires manual inspection after download.
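Because column names and per-file row counts are undocumented, a first inspection pass after download might look like the sketch below. It uses only the Python standard library and assumes the CSV files have been saved under a local folder; the function name and the folder layout are hypothetical, not part of the dataset.

```python
import csv
from pathlib import Path


def summarize_csv_dir(root):
    """Return {relative_path: (header, row_count)} for every CSV under root.

    The first row of each file is treated as the header, which is an
    assumption about the files' layout.
    """
    root = Path(root)
    summary = {}
    for path in sorted(root.rglob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])           # empty list for an empty file
            row_count = sum(1 for _ in reader)  # remaining data rows
        summary[str(path.relative_to(root))] = (header, row_count)
    return summary
```

Calling `summarize_csv_dir("nlco_download")` on a hypothetical download folder would list each file's header and row count, which helps confirm the stated 50 samples per file and gives a starting point for inferring field semantics.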
Provenance
Source
Hugging Face user summer142857jiang
Collection Method
Not documented; likely created as a normalized release for benchmarking.
Time Range
Not specified
Freshness
Last updated in April 2026 (recorded as 2026-04 06:58:26, with no day given); freshness should be verified.
Geography
Not specified
License
Unknown; restrictions should be verified before use.