Available on 1 platform
SciR is a multi-domain benchmark for evaluating large language models on three forms of scientific reasoning: deductive logic, inductive rule discovery, and causal discovery. It includes parametric difficulty curves and a controlled contrast between natural-language and scientific-prose-obfuscated presentations. The dataset was created by sci-reason and was last updated on Hugging Face in June 2026.
The license is unknown; verify the terms of use before using the dataset.