Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,446 datasets
Ethan Davis published model performance scores, compute profiling metrics, and MCMC convergence diagnostics on 2026-05-05. The data originates from a large-scale benchmark comparing Bayesian and frequentist pipelines for motor imagery EEG classification across twenty publicly available MOABB datasets and six pipeline pairs. Performance was assessed using six metrics including AUROC and MCC, with compute data covering training times.
12.8 MB of data from a techno-economic optimization study comparing a hybrid CO2 capture process to standalone methods. The dataset, authored by Sunny Pawar and last updated in April 2026, explores flue gas CO2 compositions ranging from 3.5 to 30 mol %. It contains results from process models and surrogate artificial neural networks used to calculate CO2 avoided costs.
Atlantic Ocean and eastern/central North Pacific tropical storm intensity forecasts from the Statistical Hurricane Intensity Prediction Scheme (SHIPS) model. This dataset was collected during the 2014 Hurricane and Severe Storm Sentinel (HS3) field campaign to study environmental and internal storm processes, including the role of the Saharan Air Layer and deep convection. SHIPS model inputs include GOES infrared imagery, and outputs are provided in ASCII format.
A simulation experiment using seabed mud content samples from the Geoscience Australia Marine Samples Database to compare statistical and mathematical spatial interpolation techniques. The study assessed prediction accuracy using cross-validation and analyzed factors such as region, sample density, and method. Outcomes can be applied to modeling physical properties for marine biodiversity prediction.
Supplementary Material 1 contains densitometric quantification data from Western blot analyses and cell viability assays for a study on the radiosensitizing effects of a dual-target inhibitor. The data includes measurements of protein levels like p-STAT3 and Ac-H3/H4, and cell viability results from CCK8 assays across A549, MDA-MB-231, and B16 cell lines. Qi Wang published this dataset on figshare in April 2026 under a CC-BY-4.0 license.
A crossover randomized trial of 60 sedentary college students measured hemodynamic and perceptual responses to 5-minute walking sessions with varying limb occlusion pressures. Data includes blood pressure, heart rate, perceived exertion, discomfort, and step counts recorded before, immediately after, and 5 minutes post-intervention. The dataset was published by Yuke Zhu on figshare in April 2026.
A 932.5 KB research paper by Kazuki Tomioka, last updated on 2026-05-08, proposing an estimation framework for panel stochastic frontier models. The framework accommodates heterogeneity through latent group structures and is demonstrated with an empirical application to U.S. commercial banking cost efficiency. The paper includes simulation studies and is available in PDF and TXT formats under a CC-BY-4.0 license.
R code for analyzing long-term maternal mortality associated with placental abruption and retention. The analysis is based on a cohort of 638,911 vaginal deliveries, with mortality rates of 6.4, 9.8, and 12.0 per 1,000 for normal, abruption, and retention groups, respectively. The code, authored by Sona Jasani and last updated in April 2026, performs statistical modeling to evaluate hazard ratios and temporal mortality patterns.
A 2026 study by Samuel J. Pearl examines the relationship between math anxiety and metacognitive monitoring in U.S. adults performing fraction arithmetic. The dataset includes responses from 685 adults who completed a fraction task, made pre- and post-task performance judgments, and reported their math anxiety and self-concept. Findings suggest that adults with higher math anxiety monitored their performance less accurately.
A study presents a greedy optimization algorithm for allocating individuals to crosses in autogamous crop breeding programs. The algorithm was tested with an experimental barley resistance dataset and 60 simulated datasets, using population sizes of 400 and 1,200 and heritabilities of 0.7 and 0.9. The research was authored by Uche Joshua Okoye and includes user-friendly R code for improving breeding program efficiency.
Rev. Daniel Elis Axelrod created a dataset documenting the geometric reconstruction of a crop circle. The project involved mapping the crop circle's size and shape, estimating its area, and reconstructing it using circles and straight lines to derive extrusion volume and surface area. The dataset includes multiple file formats and was last updated on 2026-05-12.
A dataset containing results from a feature optimization study for predicting Baijiu yield during the steaming and distillation of fermented material. The data likely includes key process parameters and predicted yields, generated by Qiang Han using an ensemble of Random Vector Functional Link networks. The dataset was last updated on May 6, 2026.
Model outputs from a Bayesian occupancy analysis, likely for Syrphidae (hoverfly) species. The 1.2 GB data file contains the full results from a model hosted on GitHub, provided by author Adam Duchesne. The dataset was last updated on May 20, 2026.
Andrew A. Manderson's 2026 paper presents a method for translating elicited prior information into informative joint priors for Bayesian models. The 11.1 MB PDF includes supplementary materials for reproducing the work and is licensed under CC-BY-4.0. The paper illustrates the approach with three examples: a cure-fraction survival model, a setting with prior information on R², and a nonlinear regression model.
A 30-year time series of ocean observations underpins statistical models for storm frequency and intensity on Australia's eastern and southern coasts. The Australian Ocean Data Network provides this dataset, which includes multivariate storm statistics such as maximum significant wave height, duration, and peak storm surge. The analysis focuses on clustered storm events and employs a peaks-over-threshold approach with a 2.93 m wave height threshold.
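As a rough illustration of how a peaks-over-threshold approach can pick out storm events, the sketch below treats each run of consecutive wave height samples above 2.93 m as one event and reports its peak and duration. The function name, the hourly sampling step, and the sample values are hypothetical. The dataset's actual event definition, including how clustered storms are grouped, is not given in the description and may differ.

```python
from itertools import groupby

THRESHOLD_M = 2.93  # wave height threshold quoted in the dataset description


def storm_events(hs_series, threshold=THRESHOLD_M, step_hours=1.0):
    """Return one (peak_hs, duration_hours) tuple per run of
    consecutive samples strictly above the threshold.

    Assumption: samples are evenly spaced, step_hours apart.
    """
    events = []
    # groupby splits the series into alternating below/above runs.
    for above, run in groupby(hs_series, key=lambda hs: hs > threshold):
        if above:
            values = list(run)
            events.append((max(values), len(values) * step_hours))
    return events


# Hypothetical hourly significant wave heights in metres (made up for illustration).
hs = [1.8, 2.5, 3.1, 3.6, 3.0, 2.4, 2.0, 3.2, 2.9, 1.5]
print(storm_events(hs))  # [(3.6, 3.0), (3.2, 1.0)]
```

Two events are found here: a three-hour exceedance peaking at 3.6 m and a one-hour exceedance at 3.2 m. The 2.9 m sample does not count because it falls below the threshold.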
Eighteen spatial interpolation methods were compared to predict seabed sand content across the Australian Exclusive Economic Zone. The study, based on samples from Geoscience Australia's Marine Samples Database extracted in August 2010, found that RFIDS and RFOK were among the most accurate methods across three tested regions. Model averaging further improved prediction accuracy, with the most accurate methods reducing error by up to 7%.
Statistical significance testing on Latin-based STR datasets (% mean ± std over 5 runs). The dataset is a 5.5 KB XLS file authored by Jian Guo and last updated on June 1, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
The Myanmar Statistical Yearbook 2022 provides an updated compendium of statistics on demographic, geographic, and socio-economic conditions. Statistics are compiled from administrative records of Ministries, Departments, Enterprises, and Private Agencies, as well as Census and Surveys conducted by Government Ministries and the Central Statistical Organization. The data is available in Excel format for years since 2018.
Malawi's 2021 Statistical Yearbook provides official district-level data on health and socioeconomic conditions. The annual publication aggregates statistics from government sources on topics including population, education, finance, and external trade. It is published by the National Statistical Office of Malawi and made available by the Health Cluster.
A summary of statistical analysis per figure from a study on adaptive nociceptive behavior in Drosophila larvae. The dataset, created by Jean-Christophe Boivin, was last updated on April 28, 2026. It is a 36.6 KB XLSX file containing results likely related to behavioral sensitization and neuronal activity.