DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Mathematics & Statistics Datasets | DataSalon

Mathematics & Statistics

Mathematical datasets, statistical benchmarks, probability, optimization, operations research

2,485 datasets

Deschampsia Antarctic Plant Response Data with Biochemical Analyses

Deschampsia antarctica, one of two native flowering plants in Antarctica, is the subject of this biochemical study. The data includes analyses of leaf biochemistry, lipids via HPLC, antioxidants, and statistical comparisons with related species. It was produced by the organization SCIOPS and sourced from NASA's Earthdata platform.

Tabular, Biochemical Analysis, Antarctic Ecology, Environmental Response, Deschampsia, Plant Physiology

Lipid Profiles of Antarctic Hairgrass Leaves and Roots

This dataset contains biochemical and chromatographic analyses of membrane lipids in the leaves and roots of the Antarctic plant Deschampsia antarctica. It includes statistical analysis and comparisons with related species. It was published by SCIOPS on NASA Earthdata in February 1992.

Tabular, Lipidomics, Species Comparison, Antarctic Flora, Plant biochemistry, Chromatography

Rock-Fluid System Flow Studies from 17 Scientific Projects

The M2M Thematic Programme, funded by the British Geological Survey, produced data from 17 scientific investigation projects on fluid flow in heterogeneous rock. Research focused on scaling relationships, quantification of flow properties, statistical models, and rock-flow interactions across spatial and temporal scales.

Nerc Ddc, Geophysics, Geological Processes, Oil, Geochemistry

CO2 Flow Metering Data from Multi-Modal Sensor Fusion Prototype

A project dataset from the UKCCSRC Call 2 grant (UKCCSRC-C2-218), which developed a CO2 flow measurement system for carbon capture and storage (CCS) pipelines. The research, led by the British Geological Survey, combined multi-modal sensing with statistical data fusion, using sensors for differential pressure, ultrasonic, Coriolis, temperature, pressure, and electrical impedance measurements. Experimental work tested the system under controlled conditions resembling practical CCS operations.

Multimodal, Flow Metering, United Kingdom, Industrial Monitoring, Measurement Uncertainty, Carbon Capture, Synthetic

Impact of Sub-Seabed CO2 Leakage on Benthic Macrofauna Communities

A 37-day sub-seabed CO2 release experiment assessed impacts on benthic macrofauna across four sampling zones. The study, published in the International Journal of Greenhouse Gas Control, documents rapid community changes during the leak and recovery 18 days post-injection. Data includes macrofaunal community structure and diversity metrics from zones at 0 m, 25 m, 75 m, and 450 m from the leak center.

Tabular, Carbon Capture And Storage, Benthic Macrofauna, Marine Ecology, Sediment Chemistry, Environmental Impact

CO2 Leak Detection Sensor Data from Scottish Seabed Experiment

Ardmucknish Bay on the Scottish west coast was the site of a controlled sub-seabed CO2 release experiment from May to October 2012. The study deployed three pCO2 sensor technologies alongside instruments measuring oxygen, temperature, salinity, and currents to monitor leakage. Researchers used a multivariate statistical approach to distinguish natural forcing from CO2 release signals.

Tabular, Time Series, Oceanography, Environmental monitoring, Marine Sensors, United Kingdom, Carbon Dioxide Leakage, Synthetic

CreativeBench: 1,859 Tasks for Evaluating Code Model Creativity

CreativeBench contains 1,859 JSONL records developed by Zethive in 2026 to evaluate the creative problem-solving capabilities of code models. The data is partitioned into two subsets focusing on combinatorial creativity (1,308 records) and exploratory creativity (551 records).

JSON, size_categories:1K<n<10K, library:polars, modality:text, library:mlcroissant, library:datasets, library:pandas, region:us

R Script for Ant Colony Foundation Heat Stress Analysis

An R script provides the statistical analysis for a study on the impact of heat stress on ant colony foundation. The script is fully annotated and details the methods used by its author, Alice Roux. The dataset was last updated in March 2026 and is shared under a CC BY 4.0 license.

Ecology

rxode2: Facilities for Simulating from ODE-Based Models

An R package providing facilities for running simulations from ordinary differential equation models, such as those used in pharmacometrics. The package, authored by Matthew Fidler, uses a compilation manager to translate ODE models into C for improved computational efficiency. It includes an event table object for specifying complex dosing regimens and sampling schedules.

Tabular, Compartmental Models, Computer Science, Mathematics, Pharmacometrics, Ode, Applied mathematics, Ode Simulation
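
To make the idea of simulating an ODE model under a dosing schedule concrete, here is a minimal Python sketch. It is not rxode2 and does not use its API: the function name `simulate_one_compartment`, the one-compartment bolus model dA/dt = -ke * A, and all parameter values are assumptions chosen for illustration. Because this model is linear, the sketch uses the closed-form solution and superposition of doses rather than a numerical solver.

```python
import math

def simulate_one_compartment(doses, sample_times, ke, volume):
    """Concentration-time profile for a one-compartment IV bolus model.

    doses        -- list of (time, amount) dosing events (hypothetical schedule)
    sample_times -- times at which to report the concentration
    ke           -- first-order elimination rate constant
    volume       -- volume of distribution
    """
    profile = []
    for t in sorted(sample_times):
        amount = 0.0
        for dose_time, dose_amount in doses:
            # Each dose decays exponentially from the moment it is given;
            # the model is linear, so contributions simply add up.
            if dose_time <= t:
                amount += dose_amount * math.exp(-ke * (t - dose_time))
        profile.append((t, amount / volume))
    return profile

if __name__ == "__main__":
    schedule = [(0.0, 100.0), (12.0, 100.0)]  # two doses, 12 time units apart
    for t, conc in simulate_one_compartment(schedule, [0, 6, 12, 18, 24], 0.1, 10.0):
        print(f"t={t:>4}  concentration={conc:.3f}")
```

The list of `(time, amount)` pairs plays the role that an event table plays in the package description: it keeps the dosing regimen and the sampling schedule separate from the model itself.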

Tidybayes: Tidy Data and Geoms for Bayesian Model Outputs

Tidybayes is an R package created by Matthew Kay for composing data and extracting, manipulating, and visualizing posterior draws from Bayesian models. It provides functions to extract tidy data frames of draws from models using JAGS, Stan, rstanarm, brms, and other backends. The package also includes ggplot2 geoms and stats for visualizing points with uncertainty intervals, eye plots, and fit curves with multiple uncertainty bands.

Tabular, Bayesian Probability, Bayesian Statistics, R Package, Computer Science, Data Visualization, Artificial Intelligence, Posterior Draws, Tidy Data

Broom: Tidy Summaries for Statistical Models in R

Broom is an R package authored by David P. Robinson that converts statistical analysis objects into tidy data frames. It provides three core verbs—tidy, glance, and augment—to extract model coefficients, goodness-of-fit statistics, and observation-level information, respectively. This facilitates consistent reporting, plotting, and batch analysis of many models.

Tabular, Model Summary, Data Tidying, Cartography, R Package, Computer Science, Broom, Archaeology, Geography, Statistical Modeling
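
The split between the three verbs can be illustrated with a small Python analogue. This is only a sketch of the idea, not Broom's R API: `fit_line`, the dictionary-based model object, and the Python functions named `tidy`, `glance`, and `augment` are hypothetical stand-ins, applied here to a simple least-squares line.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"intercept": mean_y - slope * mean_x, "slope": slope,
            "xs": list(xs), "ys": list(ys)}

def tidy(model):
    """One row per model term: the coefficient table."""
    return [{"term": "(Intercept)", "estimate": model["intercept"]},
            {"term": "x", "estimate": model["slope"]}]

def augment(model):
    """One row per observation: original data plus fitted values and residuals."""
    rows = []
    for x, y in zip(model["xs"], model["ys"]):
        fitted = model["intercept"] + model["slope"] * x
        rows.append({"x": x, "y": y, "fitted": fitted, "resid": y - fitted})
    return rows

def glance(model):
    """One row per model: overall goodness-of-fit summary."""
    rows = augment(model)
    mean_y = sum(model["ys"]) / len(rows)
    sse = sum(r["resid"] ** 2 for r in rows)
    sst = sum((y - mean_y) ** 2 for y in model["ys"])
    return [{"r_squared": 1.0 - sse / sst, "nobs": len(rows)}]

if __name__ == "__main__":
    model = fit_line([1, 2, 3, 4], [3.1, 4.9, 7.2, 8.8])
    print(tidy(model))
    print(glance(model))
    print(augment(model)[0])
```

Each function returns a list of flat records, which is what makes batch analysis practical: rows from many models can be concatenated and compared directly.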

Human Body Inertial Properties Model with 25 Anthropometric Dimensions

A mathematical model predicts human body inertial properties using 25 standard anthropometric dimensions. The model's validity was tested against experimental data from 66 subjects, with center of gravity predictions generally within 0.7 inches and moments of inertia within 10 percent. Ernest P. Hanavan developed the model and a generalized computer program for calculating properties in any body position.

Tabular, Inertial Properties, Computer Science, Anthropometry, Biomechanics, Mathematical modeling

Python Optimization DPO Sample: A Dataset for Machine Learning

A sample dataset for Python optimization, likely related to Direct Preference Optimization (DPO) methods. It was published by the author OptiRefine-Official on the Hugging Face platform and was last updated on April 12, 2026. The dataset's specific content, scale, and structure require verification after download.

Tabular, Machine Learning, Python, Optimization, Sample Data

Image Optimization Results Using a Hybrid Pollination Method

Kaggle hosts a dataset on image optimization using a hybrid pollination method. The dataset likely contains inputs and outputs from an optimization process, with results formatted for MATLAB. Details on the data's size, origin, and creation date are unavailable.

Tabular, Machine Learning, Inputs Outputs, Hybrid Pollination, Image Optimization, Computer Vision, Matlab

Brew & Bloom Coffee Shop Simulated Customer Waiting Times

A simulated dataset designed for statistical point estimation of customer waiting times. The data is described as a simulation, suggesting it was generated for methodological practice rather than collected from a real business. Its origin and specific size are unknown.

Tabular, Customer Waiting Time, Simulated Data, Statistical Estimation, Synthetic
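
For readers unfamiliar with point estimation, the Python sketch below shows the kind of exercise such a dataset supports. Everything in it is an assumption made for illustration: the listing does not say how the data was generated, so the exponential waiting-time model, the 4-minute mean, and the function names `simulate_waiting_times` and `point_estimates` are hypothetical.

```python
import math
import random
import statistics

def simulate_waiting_times(n, mean_minutes, seed=0):
    """Draw n waiting times from an exponential distribution (assumed model)."""
    rng = random.Random(seed)  # seeded so the simulation is reproducible
    return [rng.expovariate(1.0 / mean_minutes) for _ in range(n)]

def point_estimates(times):
    """Point estimates of the waiting-time distribution from a sample."""
    n = len(times)
    mean = statistics.fmean(times)   # estimator of the population mean
    sd = statistics.stdev(times)     # sample standard deviation (n - 1 divisor)
    return {
        "mean": mean,
        "sd": sd,
        "std_error": sd / math.sqrt(n),  # precision of the mean estimate
        "rate": 1.0 / mean,              # MLE of the rate under the exponential model
    }

if __name__ == "__main__":
    sample = simulate_waiting_times(200, mean_minutes=4.0, seed=42)
    for name, value in point_estimates(sample).items():
        print(f"{name}: {value:.3f}")
```

Comparing the estimates against the known simulation parameter is the usual way to practice with simulated data, since the true value is available for checking.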

RL Dataset V2: Reinforcement Learning Data for Formal Theorem Proving

A reinforcement learning dataset for formal theorem proving, containing two task subsets: whole proof generation and self-revision. The dataset is associated with the Goedel-Prover-V2 paper and was uploaded by Goedel-LM to Hugging Face on March 2, 2026.

Text, Parquet, size_categories:10K<n<100K, library:polars, library:dask, language:en, modality:text, code, lean4, library:mlcroissant, arxiv:2508.03613, library:datasets, Formal Methods, Code Generation, region:us, Reinforcement Learning, Theorem Proving, license:apache-2.0

Inti: Statistical Tools for Plant Science and Experimental Design

The 'inti' package is part of the 'inkaverse' project for developing procedures and tools used in plant science. It supports researchers during experiment planning, data collection, analysis, and scientific writing. The package was authored by Flavio Lozano‐Isla.

Tabular, Computer Science, Data Science, Agricultural research, Statistical Procedures, Plant Science, Experimental Design

sjstats: Convenient Functions for Statistical Computations in R

sjstats is a collection of R functions for common statistical computations not directly provided by base R packages. The package provides shortcuts for statistical measures like Cramer's V, Phi, and effect size statistics like Eta or Omega squared. It also focuses on weighted variants of common statistical measures and tests, such as weighted standard error, mean, t-test, and correlation.

Tabular, R Package, Computer Science, Statistical Tests, Statistical Computations, Effect Size, Algorithm, Computation, Weighted Statistics
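
Two of the measures named in this entry are simple enough to sketch directly. The Python below is an illustration of the textbook formulas only and does not reproduce the sjstats API; the function names `cramers_v` and `weighted_mean` are hypothetical. For a 2x2 table, Cramér's V coincides with the absolute value of Phi.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    V = sqrt(chi2 / (n * (min(rows, cols) - 1))), using the Pearson
    chi-squared statistic without continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

def weighted_mean(values, weights):
    """Mean of values where each observation counts in proportion to its weight."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

if __name__ == "__main__":
    counts = [[30, 10], [15, 25]]  # made-up 2x2 table
    print(f"Cramer's V: {cramers_v(counts):.3f}")
    print(f"Weighted mean: {weighted_mean([1.0, 3.0], [1.0, 3.0]):.2f}")
```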

PowerUpR: Statistical Power Analysis Tools for Multilevel Randomized Experiments

PowerUpR provides tools for calculating statistical power, minimum detectable effect size, and required sample sizes for multilevel randomized experiments. The package accommodates 14 types of designs for main treatment effects, seven for moderated effects, five for mediated effects, and several partially nested designs. It was created by Metin Buluş and is associated with the 'PowerUp!' Excel series.

Tabular, Medicine, Randomized Controlled Trial, Randomized Experiments, Thermodynamics, Internal Medicine, Computer Science, Mathematics, Research Design, Multilevel Model, Power Physics, Physics, Power Analysis, Statistics

ehaGoF: 15 Goodness of Fit Statistics for Model Evaluation

Ecevit Eyduran's ehaGoF calculates 15 different goodness of fit criteria for statistical models. The metrics include standard deviation ratio, coefficient of determination, Akaike's information criterion, and root mean square error. The dataset's specific size, format, and temporal coverage are not detailed in the provided metadata.

Tabular, Mathematics, Model Evaluation, Statistics, Goodness Of Fit
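
The four criteria named in this entry can be sketched in a few lines of Python. This is an illustration using common textbook definitions, not the package's own code: the function name `goodness_of_fit` is hypothetical, and the AIC shown is the usual least-squares form n * ln(SSE / n) + 2k, which may differ from the definition ehaGoF uses.

```python
import math
import statistics

def goodness_of_fit(observed, predicted, n_params):
    """Common goodness-of-fit criteria for a fitted model.

    observed  -- measured values
    predicted -- model predictions for the same cases
    n_params  -- number of estimated model parameters (used by AIC)
    """
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)                 # sum of squared errors
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)    # total sum of squares
    return {
        "rmse": math.sqrt(sse / n),
        "r_squared": 1.0 - sse / sst,
        # Ratio of residual spread to the spread of the observed values.
        "sd_ratio": statistics.stdev(residuals) / statistics.stdev(observed),
        # Least-squares AIC; undefined for a perfect fit (SSE = 0).
        "aic": n * math.log(sse / n) + 2 * n_params,
    }

if __name__ == "__main__":
    obs = [1.0, 2.0, 3.0, 4.0]
    pred = [1.1, 1.9, 3.2, 3.8]
    for name, value in goodness_of_fit(obs, pred, n_params=2).items():
        print(f"{name}: {value:.4f}")
```

Lower RMSE, SD ratio, and AIC indicate a better fit, while the coefficient of determination moves in the opposite direction, so the criteria are best read together.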
