DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Mathematics & Statistics Datasets | DataSalon

📐

Mathematics & Statistics

Mathematical datasets, statistical benchmarks, probability, optimization, operations research

2,485 datasets

ChemCoT: 10K-100K Records for Molecular Editing and Reaction Reasoning

ChemCoT provides between 10,000 and 100,000 text-based records for chemical reasoning tasks, released by IDEA-AI4S in 2025. The collection is organized into four modules covering molecular editing, property optimization, structural understanding, and chemical reactions to support Chain-of-Thought modeling.

Size Categories: 10K<n<100K, Language: en, Modality: text, Biology, Chemistry, arXiv: 2505.21318, Region: us, License: mit, +1

Project Selection Optimization Model with Budget and Staff Constraints

Kaggle dataset containing a technical report and mathematical model for optimizing project selection. The model likely includes constraints related to budget, staff allocation, and logical dependencies. The dataset's author, organization, and temporal coverage are unknown.

Tabular, Optimization, Project Management, Mathematical modeling, +1

GlobSnow v3.0: Northern Hemisphere Snow Water Equivalent, 1980-2018

Northern Hemisphere terrestrial snow water equivalent (SWE) data from 1980 to 2018. The dataset provides monthly and monthly-bias-corrected SWE estimates constructed by Kari Luojus and colleagues using a Bayesian data assimilation method that combines satellite-based passive microwave radiometer data with ground-based snow depth observations.

Time Series, Geospatial, Climate Science, Environmental science, Snow Water Equivalent, Hydrology, Geology, Snow, Geomorphology, Water Equivalent, +1

Chinese Remainder Theorem: Formal Mathematical Notation and Properties

The dataset 'Chinese Remainder Theorem' is sourced from paperswithcode. It likely contains formal mathematical notation and propositions related to arithmetic, integers, and the gcd functor. The description introduces rules for variable types and lists several mathematical propositions.

Text, Combinatorics, Discrete Mathematics, Arithmetic, Integer Computer Science, Mathematics, Functor, Remainder, Commutative Property, Natural Number, +1

U.S. Foreign Commerce Statistics, 1820 to 1856

A historical and statistical account of U.S. foreign commerce compiled by William H. Becker. The dataset covers the period from 1820 to 1856 and is hosted on the paperswithcode platform. The specific data format, size, and variables are not detailed in the available metadata.

Tabular, History, Genealogy, International Trade, Statistics, Economic History, Historical Commerce, +1

Hispanic Serving Institutions Statistical Trends, 1990-1999

Statistical trends for Hispanic Serving Institutions (HSIs) from 1990 to 1999. The dataset was authored by Angela L. Holmes and is hosted on the paperswithcode platform. Its specific columns and size are not detailed in the provided metadata.

Tabular, Hispanic Serving Institutions, Education Statistics, Econometrics, Regional Science, Mathematics, Economics, Demographic Trends, Geography, Statistics, Higher Education, Political Science, +1

Hispanic Serving Institutions Statistical Trends from 1990 to 1999

Hispanic Serving Institutions (HSIs) in the United States are the focus of this dataset. It likely contains statistical trends for these institutions over a decade. The dataset was authored by Christina Stearns and published on PapersWithCode.

Tabular, Hispanic Serving Institutions, Regional Science, Geography, Statistics, Demographics, Higher Education, +1

Longitudinal Study of 1995-96 Beginning Postsecondary Students: Six-Year Outcomes

A statistical analysis report tracking a cohort of students who began postsecondary education in the United States during the 1995-96 academic year. The report, authored by Lutz Berkner, analyzes their educational and life outcomes six years later. The dataset likely contains tabular data on student demographics, enrollment patterns, persistence, and degree attainment.

Tabular, Education Statistics, Mathematics, Statistical Analysis, Psychology, Descriptive Statistics, Longitudinal Study, Statistics, Student Outcomes, Mathematics Education, Higher Education, +1

U.S. Criminal Victimization Statistics for 2002

Statistical tables on criminal victimization in the United States for the year 2002. The dataset was published on the paperswithcode platform and is attributed to author Cathy Maston. The specific variables and scale of the data are not detailed in the available metadata.

Tabular, Computer Science, Mathematics, Psychology, Crime Statistics, Law, Social Science, United States, Statistics, Criminology, Political Science, Criminal Victimization, Computer Security, +1

Longitudinal Study of Australian Children: 2011 Annual Statistical Report

Australian longitudinal data on child development and family dynamics, published in the 2011 Annual Statistical Report. The report was authored by Brigit Maguire and is hosted on the paperswithcode platform. The specific data format, size, and variables are not detailed in the provided metadata.

Tabular, 🇦🇺 Australia, Child Development, Mathematics, Longitudinal Study, Geography, Statistics, Demographics, +1

Delivery Route Optimization Dataset for Real-World ML Pipelines

Synthetic, multi-table data designed to simulate real-world machine learning pipelines. The dataset is described as raw and unclean, likely containing operational data for logistics and delivery scenarios. Its origin and specific scale are unknown.

Tabular, Delivery Route Optimization, Logistics, Machine Learning Pipeline, Synthetic Data, Synthetic, +1

California Individual Taxpayer Statistics and Program Activities 2024

The State of California's 2024 annual report details major program activities and provides a statistical profile of individual taxpayers. The document is a PDF summary published by the state government, last updated in March 2026.

Text, English, Government Report, United States, Tax Policy, Taxpayer Statistics, +1

Statistical Field Theory: Topics in Classical and Quantum Physics

A collection of topics in statistical and quantum field theory sourced from paperswithcode. The content covers classical equilibrium statistical mechanics, magnetic systems, the Ising model, renormalization group, path integrals, and relativistic quantum field theory. The author, organization, and temporal coverage are unknown.

Text, Quantum Electrodynamics, Quantum Field Theory, Benchmark, Physics, Statistical Physics, Ising Model, Theoretical Physics, Field Mathematics, +1

Spatial Optimization MC: Model and Case Data

Spatial optimization data published on Kaggle. The dataset's title suggests it relates to optimization problems with a spatial component, such as facility location or resource allocation. Specific details regarding size, columns, and creation are unavailable from the provided metadata.

Tabular, Spatial Optimization, Operations Research, Mathematical modeling, +1

Laboratory Nitrogen Isotope Data for Microbial Biomass Under Early Earth Conditions

The British Geological Survey provides nitrogen isotopic composition (d15N) data from microbial biomass grown under controlled N2-fixing laboratory conditions. The dataset includes metadata describing experimental parameters, such as variable CO2 and O2 supplies, designed to mimic early Earth environments. It also incorporates comparative data from the wider literature and the results of statistical tests.

Tabular, Nitrogen Isotopes, Paleobiology, Microbial Biomass, Experimental Science, Geochemistry, +1

Puerto Rico GDP Index from 1900 to 1940

A new GDP index for Puerto Rico covering the period from 1900 to 1940. It was created by John Devereux to analyze economic growth during direct American rule and the Great Depression. The dataset supports the replication of findings on income per capita trends.

U.S. Statistical Abstract: Federal Summary of National Statistics Since 1878

An official federal summary of statistics for the United States, first published in 1878. It provides over 1,400 tables of benchmark measures on the demographic, housing, social, political, and economic condition of the country. The dataset was authored by Cindy Gillham and is aggregated from paperswithcode.

Tabular, Government Data, Econometrics, Mathematics, Benchmark, Economics, United States, Geography, Finance, Statistics, Demographics, +1

MCO: Multiple Criteria Optimization Algorithms and Test Functions

A collection of functions for solving multiple criteria optimization problems using the NSGA-II genetic algorithm. The dataset also includes a set of test functions for benchmarking. It was authored by Olaf Mersmann and is hosted on PapersWithCode.

Tabular, Optimization Algorithm, Multi Objective Optimization, Mathematical Optimization, Computer Science, Mathematics, Genetic Algorithms, Algorithm, Optimization Algorithms, Test Functions, +1

bmlm: Bayesian Multilevel Mediation Models for Stan

Matti Vuorre's bmlm package enables easy estimation of Bayesian multilevel mediation models. The underlying data likely contains variables for modeling mediation effects within hierarchical structures. The dataset is associated with a method for statistical analysis in fields like psychology and social science.

Tabular, Bayesian Probability, Bayesian Statistics, Econometrics, Computer Science, Mediation Analysis, Mathematics, Psychology, Mediation, Social Science, Artificial Intelligence, Multilevel Model, Statistics, Sociology, +1

mrgsolve: ODE-Based Pharmacokinetic and Systems Biology Simulations

mrgsolve enables fast simulation from ordinary differential equation (ODE) based models. The tool is typically employed in quantitative pharmacology and systems biology research. It was authored by Kyle Baron and is listed on the paperswithcode platform.

Tabular, Computer Science, Mathematics, Pharmacokinetics, Quantitative Pharmacology, Ode, Systems Biology, Applied mathematics, Ode Simulation, +1
