Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,444 datasets
A 2026 analysis by Rika Anderson identifies clusters of orthologous genes (COGs) whose abundances correlate with nutrient concentrations in the global ocean. The dataset is derived from 139 Tara Oceans metagenomic samples, analyzing 4,787 COGs against environmental metadata including phosphate, nitrate/nitrite, oxygen, and modeled iron. Statistical models were applied to control for confounding effects from variables like temperature, depth, and salinity.
Chilean women aged 50–81 years from the Magallanes Region were assessed for cognitive performance across three climacteric stages. The dataset, created by Jonathan Lühr-Henríquez and last updated in 2026, contains results from the Addenbrooke’s Cognitive Examination-Revised and Symbol Digit Modalities Test for 360 participants. It supports a Bayesian multivariate analysis of how the relationship between age and cognition varies by menopausal stage.
A dataset of 895 Chinese short-form drama videos uploaded to YouTube between April and October 2025. The data, uploaded by Yunqiu Yan, includes channel subscriber count, video duration, and upload timing, which are used to model predictors of cumulative view counts. Linear regression, multilayer perceptron, and random forest models were fitted, with the random forest achieving an R² of 0.666.
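For readers unfamiliar with the R² figure quoted for this entry, the sketch below shows the standard coefficient of determination, 1 minus the ratio of residual to total sum of squares. The function name and the toy values are illustrative assumptions; they are not taken from the dataset or from the authors' code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical view counts (in thousands), for illustration only.
observed = [1.0, 2.0, 3.0]
perfect = r_squared(observed, [1.0, 2.0, 3.0])    # predictions match exactly
mean_only = r_squared(observed, [2.0, 2.0, 2.0])  # always predicts the mean
```

A model that reproduces every observation scores 1, and one that always predicts the mean scores 0, so a value of 0.666 means roughly two thirds of the variance in the target is accounted for.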
A dataset from figshare describes compounds identified as inhibitors of YAP-TEAD-dependent transcription. The data, authored by Timo Heinrich and last updated in May 2026, includes results from a TEAD-reporter-based cellular screen, thermal shift assays, and in vivo xenograft studies. Derivatives with varied selectivity profiles and optimized physicochemical properties were generated.
Hua Wang authored a paper introducing the Edgeworth Accountant, an analytical method for composing differential privacy guarantees. The method leverages the f-differential privacy framework to track privacy loss under composition, providing non-asymptotic (ε,δ)-differential privacy bounds. The associated files, last updated on 2026-05-18, are available on figshare under a CC-BY-4.0 license.
32.8 KB of text data from figshare, authored by Hitoshi Asakawa and last updated in May 2026. The dataset describes the direct observation of association-dissociation dynamics in densely assembled pillar[5]arene host structures using frequency modulation and high-speed atomic force microscopy. It reveals positive cooperativity in guest binding at solid-liquid interfaces.
A text dataset presents direct observations of host-guest complexation dynamics at solid-liquid interfaces. The data likely contains descriptions of frequency modulation and high-speed atomic force microscopy (AFM) results showing cooperative binding in densely assembled pillar[5]arene host structures. The dataset was authored by Hitoshi Asakawa and last updated on May 20, 2026.
Yi-Hui Zhou presents a hierarchical Bayesian framework for harmonizing collision cross section (CCS) measurements across ion mobility platforms. The dataset includes 840 measurements for 347 compounds from a multilaboratory study, used to validate the framework. The approach reduced intertechnology CCS variability by approximately 95% and lowered median absolute percentage error from 8.9% to 3.2%.
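The median absolute percentage error quoted for this entry can be computed as in the minimal sketch below. The function name and the CCS values are hypothetical and chosen only to illustrate the metric; they are not drawn from the multilaboratory study.

```python
from statistics import median


def median_abs_pct_error(measured, reference):
    """Median of |measured - reference| / |reference|, as a percentage."""
    errors = [100 * abs(m - r) / abs(r) for m, r in zip(measured, reference)]
    return median(errors)


# Hypothetical CCS values (in square angstroms), for illustration only.
measured = [102.0, 95.0, 210.0]
reference = [100.0, 100.0, 200.0]
mdape = median_abs_pct_error(measured, reference)
```

Using the median rather than the mean keeps a few badly measured compounds from dominating the summary.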
PlanetGSD 1.0 is a cross-planetary grain-size distribution dataset containing separate tables for Terrestrial, Lunar, and Martian soils. The dataset includes 3.7 MB of data in TXT and XLSX formats, authored by Jun Zhang and last updated on 2026-05-22. It also provides site descriptions, image references for Martian soils, and statistical simulation and fitting code.
120 generated benchmark instances for a reconfigurable assembly line scheduling problem. The dataset supports a study proposing a Q-learning-based multi-objective hyper-heuristic algorithm to optimize product sequencing, reconfiguration cost, workload equalization, and logistics leveling. It was authored by Haoyi Zhao and last updated on 2026-05-21.
Average 1-NHV values of QMOHH and its variants are presented in a 17.5 KB Excel file. The data was uploaded by Haoyi Zhao to figshare and last updated on May 21, 2026. It likely contains performance metrics from a computational study on reconfigurable assembly line scheduling.
120 generated benchmark instances for a reconfigurable assembly line scheduling problem. The dataset supports a study proposing a Q-learning-based multi-objective hyper-heuristic algorithm and is provided by author Haoyi Zhao in an XLS file. It was last updated on 2026-05-21.
A set of Pareto-optimal solutions achieved by a multi-objective mathematical model for reconfigurable assembly line scheduling. The dataset, shared by Haoyi Zhao on figshare, was last updated on 2026-05-21. It contains results from a numerical case study and computational experiments on 120 benchmark instances.
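As a reminder of what Pareto-optimal means in a multi-objective setting, the sketch below filters a list of objective vectors down to its non-dominated subset, assuming every objective is minimized. The two-objective tuples are made up for illustration, and this is not the authors' model or algorithm.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates (minimization)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (reconfiguration cost, workload imbalance) pairs.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
```

Here `(3, 3)` is dropped because `(2, 2)` is better on both objectives, while the other three points each trade one objective off against the other.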
5.5 KB of part-requirement matrices for different product types, supporting a study on reconfigurable assembly line scheduling. The dataset, authored by Haoyi Zhao and last updated in May 2026, was used to formulate a multi-objective mathematical model minimizing reconfiguration cost and balancing production workloads. A computational study on 120 generated benchmark instances demonstrated the performance of a proposed Q-learning-based hyper-heuristic algorithm.
A figshare dataset by Haoyi Zhao, last updated on 2026-05-21. It contains results from a numerical case study and computational experiments on 120 generated benchmark instances for a reconfigurable assembly line scheduling problem. The study formulates a multi-objective mathematical model and proposes a Q-learning-based hyper-heuristic algorithm.
pone.0348884.t011 contains benchmark instances for a reconfigurable assembly line scheduling problem. The dataset supports a study proposing a Q-learning-based multi-objective hyper-heuristic algorithm. It was authored by Haoyi Zhao and last updated on 2026-05-21.
The file "Supplementary file 1_Analysis of rat toxicology studies" presents simulation results comparing virtual control groups (VCGs) with concurrent control groups (CCGs) for detecting treatment effects on liver enzymes and body weight. The dataset, authored by Guillemette Duchateau-Nguyen and last updated in April 2026, contains results from 100 VCGs generated per reanalyzed study to assess statistical agreement.
1.9 GB of datasets linking epitaxial growth conditions, device geometry, and lasing performance for InP/InAsP multi-quantum well microring lasers. The data includes optical microscopy images, binarized ring images, and field-level statistics for variance-aware optimization. Mihir Rajendra Athavale published this dataset on figshare in 2026.
300 patients with breast cancer and 300 healthy controls were genotyped for polymorphisms in CYP2R1, CYP27B1, CYP24A1, and DBP genes. Laith N. Al-Eitan authored this case-control study, which was last updated on 2026-05-28. The results suggest associations between specific genotypes and altered breast cancer risk.
A prehistoric Italian sample of 155 individuals with gender-specific grave goods was used to develop a Bayesian model for sex assessment in cremation contexts. The model is based on 21 postcranial metric variables and was validated using an independent Austrian prehistoric sample of 45 individuals. The dataset supports the CRISP open-source software and demonstrates the model's 89% accuracy in sex prediction.