Loading...
Loading...
Available on 1 platform
Sign in to view source links and access this dataset
Graphwalks is a long-context benchmark containing between 1,000 and 10,000 records, released by OpenAI in early 2026. It evaluates multi-hop reasoning by requiring models to execute graph operations, such as breadth-first search, on provided directed edge lists.
The dataset is provided in Parquet format and is compatible with Polars, Dask, and the Hugging Face Datasets library.