Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,469 datasets
Experimental data from patch clamp recordings and calcium imaging for examining stochastic resonance in sinoatrial node cells from rabbit hearts. The data were obtained in the current clamp configuration of the perforated patch clamp technique, using sine waves and noise of different amplitudes. The dataset includes an Excel file detailing the protocols applied in each experimental file.
Data from the ChooseWell 365 trial used to study network threats to causal inference. The dataset includes raw data and cleaning code, with statistical analysis files hosted separately. The replication package was authored by Douglas Levy and last updated in April 2026.
The Symbolic Pretraining Pile (SPT) is a dataset for symbolic and formal pre-training, mid-training, and supervised fine-tuning. It is procedurally generated on CPU and can be scaled to the trillion-token range, with adjustable difficulty. The dataset was created by reasoning-core and last updated on March 23, 2026.
A 2026 analysis dataset from Harvard Dataverse, authored by Shelest, Moshe Pavel, enables replication of research on intraregional trade gaps in Latin America. It combines trade data from CEPII BACI with control variables from the World Bank and V-Dem Institute. The repository includes cleaned data for regression analysis, documented scripts, and external links to the primary BACI dataset.
Scott Hadland's definitions support analysis of Medicaid data for youth opioid use disorder care. The dataset likely contains operational definitions for measuring treatment cascade stages. It was created for NIH/NIDA grant R01DA045085 and last updated in April 2026.
18.1 KB of statistical analysis results for Klebsiella pneumoniae traits, shared by Julie Le Bris on figshare. The dataset, last updated in March 2026, contains multifactorial ANOVA results for traits related to capsule locus, host interaction, and fitness assays. It is licensed under CC-BY-4.0.
17.0 KB of statistical analysis results from stepwise regression for traits analyzed in a study of Klebsiella pneumoniae. The dataset, authored by Julie Le Bris, was last updated on March 25, 2026, and is shared under a CC-BY-4.0 license. It is stored in an XLSX file format.
A statistical analysis of bond-break parameters for composite models at different sizes. The dataset was authored by Hang Liu and last updated on March 25, 2026. It is a 5.5 KB Excel file released under a CC-BY-4.0 license.
A database for Internet user authentication and identity-proofing, published by the Social Security Administration. The dataset was last updated on April 3, 2026. Its specific contents and scale are not detailed in the provided metadata.
A Bayesian network with 27 nodes and 52 arcs, modeling insurance-related variables. The network contains 1008 parameters and was created by J. Binder, D. Koller, S. Russell, and K. Kanazawa, with a reference publication from 1997.
A Bayesian network model for forecasting severe weather, specifically hail. The model contains 56 nodes and 66 arcs, with 2656 parameters. It was authored by B. Abramson, J. Brown, W. Edwards, A. Murphy, and R. L. Winkler and published in the International Journal of Forecasting in 1996.
A Bayesian network with 413 nodes and 602 arcs for modeling diabetes. The network was authored by S. Andreassen, R. Hovorka, J. Benn, K. G. Olesen, and E. R. Carson and is described in a 1991 conference paper on a model-based approach to insulin adjustment.
413 nodes and 602 arcs define this discrete Bayesian network for modeling diabetes. The network, authored by S. Andreassen, R. Hovorka, J. Benn, K. G. Olesen, and E. R. Carson, contains 429,409 parameters and was presented at the 3rd Conference on Artificial Intelligence in Medicine in 1991.
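The Bayesian network entries above report node, arc, and parameter counts. For a discrete network, one common convention counts free parameters per node as (number of levels minus 1) times the number of joint parent configurations. The sketch below illustrates that convention on a made-up three-node network. The node names, cardinalities, and the `count_parameters` helper are hypothetical, and it is an assumption that the catalog figures follow this exact convention.

```python
from math import prod


def count_parameters(cardinality, parents):
    """Count free parameters of a discrete Bayesian network.

    cardinality: dict mapping node name -> number of discrete levels.
    parents: dict mapping node name -> list of parent node names
             (nodes without parents may be omitted).
    """
    total = 0
    for node, levels in cardinality.items():
        # Number of joint parent configurations; prod of an empty
        # sequence is 1, which covers root nodes.
        parent_configs = prod(cardinality[p] for p in parents.get(node, ()))
        # Each conditional distribution has (levels - 1) free entries.
        total += (levels - 1) * parent_configs
    return total


# Hypothetical toy network, not taken from any dataset listed here.
cardinality = {"Age": 3, "Risk": 2, "Accident": 4}
parents = {"Risk": ["Age"], "Accident": ["Age", "Risk"]}

n_nodes = len(cardinality)
n_arcs = sum(len(p) for p in parents.values())
n_params = count_parameters(cardinality, parents)
print(n_nodes, n_arcs, n_params)
```

Because each node's contribution multiplies across its parents' cardinalities, a network with many-valued nodes can have far more parameters than arcs, which is consistent with the large parameter counts listed relative to the arc counts.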
A software package by Marco Scutari of the Dalle Molle Institute for Artificial Intelligence Research implementing algorithms for Bayesian network structure learning, parameter learning, and inference. It supports constraint-based, score-based, pairwise, and hybrid learning methods for discrete, Gaussian, and conditional Gaussian networks, along with classifiers, utility functions, and estimation techniques. Development snapshots are available from the project website.
RSiena provides functions for simulation-based estimation of stochastic actor-oriented models. The package handles longitudinal network data, which can be single or multivariate, directed, non-directed, or two-mode, and includes associated actor variables. It was authored by Tom A. B. Snijders, with methodology detailed in a 2017 review.
SpatialExtremes is a software package for the statistical modelling of spatial extremes, created by Mathieu Ribatet. It provides tools for simulation, analysis, fitting, and prediction using max-stable processes, copula, and Bayesian hierarchical models. The package is supported by key statistical references including Davison et al. (2012), Padoan et al. (2010), and Dombry et al. (2013).
ACE SWICS 2.0 is a time series dataset from the NASA Advanced Composition Explorer (ACE) satellite, providing measurements of heavy ions in the solar wind. The dataset begins after August 23, 2011, following a hardware anomaly, and includes elemental abundance, charge state composition, and kinetic distribution data. It represents a continuation of unique heavy ion measurements not available from other instruments, with new analysis methods developed for the post-anomaly instrument state.
High-fidelity operational data from a petrochemical process spanning five years. The dataset is intended for energy intensity prediction and sensor analysis. The author, organization, and specific file formats are unknown.
Class B Units represent surveyed footprints of residential or commercial units on the ground in the Australian Capital Territory. The data is continuously maintained by the ACT Government based on registered Unit Plans and computed design to fit within survey control. Author Greg Tankard last updated the dataset in March 2026.
Simon Dufour's replication data and analysis scripts support an article prepared for the Journal of Dairy Science. The dataverse contains datasets used to estimate herd prevalence of contagious and environmental mastitis pathogens in Canadian dairy herds using Bayesian latent class models. It was last updated on April 25, 2026.