Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,469 datasets
109-node Bayesian network modeling lymph node pathology, developed as part of the Pathfinder expert system project. The network contains 195 arcs and 72,079 parameters, with an average Markov blanket size of 3.82. Authored by D. Heckerman, E. Horvitz, and B. Nathwani and originally published in 1992, it is a seminal benchmark for probabilistic reasoning.
Sample 3 of the Pathfinder Bayesian network, a very large discrete network built for an expert system. It contains 109 nodes, 195 arcs, and 72,079 parameters, with an average Markov blanket size of 3.82. The network was authored by D. Heckerman, E. Horvitz, and B. Nathwani, with the foundational research published in 1992.
56-node Bayesian network designed for forecasting severe hail. The model, authored by B. Abramson, J. Brown, W. Edwards, A. Murphy, and R. L. Winkler, contains 66 arcs and 2,656 parameters. It was published in the International Journal of Forecasting in 1996.
An operations research benchmark adapted from ReEvo, which originally consists of six types of combinatorial optimization problems. The benchmark has been extended with a cooperative driving problem that uses complex SUMO simulation environments. The dataset was created by qiliuchn and last updated on April 3, 2026.
A 5.5 KB Excel file contains analysis of variance results for optimizing fermentation periods. The data compares fibrinolytic enzyme production between wild-type and mutated Oidiodendron maius strains, authored by Hina Sajid and last updated in March 2026. Specific row and column counts are not provided.
Analysis of variance data for optimizing substrate concentration to produce fibrinolytic enzymes using wild-type and mutated Oidiodendron maius fungal strains. The dataset is provided in an XLS file format and is licensed under CC BY 4.0. Author Hina Sajid last updated the 5.5 KB file in March 2026.
5.5 KB of tabular data from an analysis of variance (ANOVA) study on fibrinolytic enzyme production. The dataset compares wild-type and mutated Oidiodendron maius fungal strains, focusing on inoculum size optimization. It was authored by Hina Sajid and last updated in March 2026.
An analysis of variance table details the optimization of pH for fibrinolytic enzyme production from wild and mutated strains of Oidiodendron maius. The dataset was authored by Hina Sajid and shared under a CC BY 4.0 license in March 2026.
An analysis of variance (ANOVA) table for optimizing temperature in fibrinolytic enzyme production from wild and mutated strains of Oidiodendron maius. The dataset is provided by author Hina Sajid and was last updated in March 2026. It is a small, 5.5 KB Excel file containing tabular data.
Daily statistical summaries of land surface parameters from NASA's Multi-angle Imaging SpectroRadiometer (MISR) instrument. The product includes directional hemispherical reflectance (DHR), FPAR, NDVI, and BRF model parameters, reported on a global 0.5-degree by 0.5-degree geographic grid. It is produced by the LARC_CLOUD organization using data from nine pushbroom cameras across four spectral bands.
Kanika Rawat authored an R script for statistical analysis, last updated on April 20, 2026. The code is associated with a study on how prior predation-risk experience influences individual antipredator responses within groups. It includes the code for all models and their bootstrap diagnostics.
PED-R is a software tool for calculating Pedigree matrices using R. The tool outputs the Pedigree matrix in CSV and TXT formats for use in GAP software to generate heatmaps, and also provides a GEXF file for network visualization in Gephi. Francisco Rodríguez authored this version 1.0 resource, which was last updated on 2026-04-20.
Anderson-Barker Supplementary Data Files contain original digitized LabChart8 recordings underlying a 2026 Experimental Physiology paper. The dataset includes raw time-series data from guinea pig isolated heart preparations, alongside derived numerical, graphical, and statistical files in GraphPad Prism format. It supports research on cardiac functional assessment, contractility, and QT interval correction methods.
220,000 reasoning traces distilled from the Hunter Alpha model via OpenRouter. The dataset contains 800 million tokens and was created by user 'ianncity', with a last recorded update on March 15, 2026. It is composed of Math (30%), Coding (30%), Science (15%), Computer Science (15%), and Creative Writing (10%) content.
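As a rough illustration of the composition figures above, the following sketch converts the listed percentages into approximate per-category trace counts and an average trace length. It assumes the percentages apply to the number of traces rather than to tokens, which the listing does not state; the variable names are hypothetical.

```python
# Back-of-the-envelope breakdown of the listed figures.
# Assumption: the category shares apply to trace counts, not token counts.
TOTAL_TRACES = 220_000
TOTAL_TOKENS = 800_000_000

# Category shares in percent, as given in the listing.
shares = {
    "Math": 30,
    "Coding": 30,
    "Science": 15,
    "Computer Science": 15,
    "Creative Writing": 10,
}

# The shares should account for the whole dataset.
assert sum(shares.values()) == 100

# Approximate number of traces per category.
traces_per_category = {
    name: TOTAL_TRACES * pct // 100 for name, pct in shares.items()
}

# Mean trace length in tokens across the whole dataset.
avg_tokens_per_trace = TOTAL_TOKENS / TOTAL_TRACES

for name, count in traces_per_category.items():
    print(f"{name}: ~{count:,} traces")
print(f"Average length: ~{avg_tokens_per_trace:,.0f} tokens per trace")
```

If the shares instead describe token counts, the same arithmetic applies with `TOTAL_TOKENS` in place of `TOTAL_TRACES`.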
SeifElden2342532 created a dataset of 3,000 Python code snippet pairs demonstrating performance bottlenecks and their optimized versions. Each entry includes complexity analysis for time and space and a technical explanation for the optimization. The dataset was last updated on March 31, 2026.
The Hebrew Bible (Tanakh) is represented in this structured, quantitative dataset extracted from the Leningrad Codex. It provides numerical data points, such as word frequencies and verse metrics, transformed into a machine-readable CSV format for computational analysis. The dataset was created by Guy Shaked of TwoHillsLab Dataverse and was last updated on April 10, 2026.
Statistical validation metrics for the DRFC clustering algorithm and baseline models across two benchmark datasets. The 5.5 KB XLS file, authored by Xiao Fu and last updated in March 2026, is licensed for open use under CC-BY-4.0. It likely contains performance measures like accuracy for evaluating deep feature extraction models on remote sensing images.
The Data Management and Sharing Plan for the project 'SCH: Novel Multi-View Statistical Machine Learning for Alzheimer's Disease' outlines the strategy for managing and sharing scientific data generated during the research. The plan was authored by Hongtu Zhu and is hosted on the ODUM Harvested Dataverse platform. It was last updated on May 4, 2026.
SAIRfoundation released this dataset of 1,200 mathematical training problems in 2026, curated from the Equational Theories Project raw implication dataset. It serves as the primary training set for Stage 1 of the Mathematics Distillation Challenge: Equational Theories competition.
Queensland Justice publishes monthly gaming machine data aggregated by Statistical Area 4. The dataset is updated regularly, with the last update recorded on March 19, 2026. It is available in CSV and Excel formats under a Creative Commons Attribution 4.0 license.