Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,485 datasets
ChemCoT provides between 10,000 and 100,000 text-based records for chemical reasoning tasks, released by IDEA-AI4S in 2025. The collection is organized into four modules covering molecular editing, property optimization, structural understanding, and chemical reactions to support Chain-of-Thought modeling.
Kaggle dataset containing a technical report and mathematical model for optimizing project selection. The model likely includes constraints related to budget, staff allocation, and logical dependencies. The dataset's author, organization, and temporal coverage are unknown.
Northern Hemisphere terrestrial snow water equivalent (SWE) data from 1980 to 2018. The dataset provides monthly and monthly-bias-corrected SWE estimates constructed by Kari Luojus and colleagues using a Bayesian data assimilation method that combines satellite-based passive microwave radiometer data with ground-based snow depth observations.
The dataset 'Chinese Remainder Theorem' is sourced from paperswithcode. It likely contains formal mathematical notation and propositions related to arithmetic, integers, and the gcd functor. The description introduces rules for variable types and lists several mathematical propositions.
A historical and statistical account of U.S. foreign commerce compiled by William H. Becker. The dataset covers the period from 1820 to 1856 and is hosted on the paperswithcode platform. The specific data format, size, and variables are not detailed in the available metadata.
Statistical trends for Hispanic Serving Institutions (HSIs) from 1990 to 1999. The dataset was authored by Angela L. Holmes and is hosted on the paperswithcode platform. Its specific columns and size are not detailed in the provided metadata.
Hispanic Serving Institutions (HSIs) in the United States are the focus of this dataset. It likely contains statistical trends for these institutions over a decade. The dataset was authored by Christina Stearns and published on PapersWithCode.
A statistical analysis report tracking a cohort of students who began postsecondary education in the United States during the 1995-96 academic year. The report, authored by Lutz Berkner, analyzes their educational and life outcomes six years later. The dataset likely contains tabular data on student demographics, enrollment patterns, persistence, and degree attainment.
Statistical tables on criminal victimization in the United States for the year 2002. The dataset was published on the paperswithcode platform and is attributed to author Cathy Maston. The specific variables and scale of the data are not detailed in the available metadata.
Australian longitudinal data on child development and family dynamics, published in the 2011 Annual Statistical Report. The report was authored by Brigit Maguire and is hosted on the paperswithcode platform. The specific data format, size, and variables are not detailed in the provided metadata.
Synthetic, multi-table data designed to simulate real-world machine learning pipelines. The dataset is described as raw and unclean, likely containing operational data for logistics and delivery scenarios. Its origin and specific scale are unknown.
The State of California's 2024 annual report details major program activities and provides a statistical profile of individual taxpayers. The document is a PDF summary published by the state government, last updated in March 2026.
A collection of topics in statistical and quantum field theory sourced from paperswithcode. The content covers classical equilibrium statistical mechanics, magnetic systems, the Ising model, renormalization group, path integrals, and relativistic quantum field theory. The author, organization, and temporal coverage are unknown.
Spatial optimization data published on Kaggle. The dataset's title suggests it relates to optimization problems with a spatial component, such as facility location or resource allocation. Specific details regarding size, columns, and creation are unavailable from the provided metadata.
British Geological Survey provides nitrogen isotopic composition (d15N) data from microbial biomass grown under controlled N2-fixing laboratory conditions. The dataset includes metadata describing experimental parameters like variable CO2 and O2 supplies, designed to mimic early Earth environments. It also incorporates comparative data from the wider literature and results of statistical tests.
A new GDP index for Puerto Rico covering the period from 1900 to 1940. It was created by John Devereux to analyze economic growth during direct American rule and the Great Depression. The dataset supports the replication of findings on income per capita trends.
An official federal summary of statistics for the United States, first published in 1878. It provides over 1,400 tables of benchmark measures on the demographic, housing, social, political, and economic condition of the country. The dataset was authored by Cindy Gillham and is aggregated from paperswithcode.
A collection of functions for solving multiple criteria optimization problems using the NSGA-II genetic algorithm. The dataset also includes a set of test functions for benchmarking. It was authored by Olaf Mersmann and is hosted on PapersWithCode.
Matti Vuorre's bmlm package enables easy estimation of Bayesian multilevel mediation models. The underlying data likely contains variables for modeling mediation effects within hierarchical structures. The dataset is associated with a method for statistical analysis in fields like psychology and social science.
mrgsolve enables fast simulation from ordinary differential equation (ODE) based models. The tool is typically employed in quantitative pharmacology and systems biology research. It was authored by Kyle Baron and is listed on the paperswithcode platform.