Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,446 datasets
Files S1–2 contain input and output files for total-evidence Bayesian phylogenetic analyses of hamsters. The dataset includes all necessary information to reproduce the analyses, as well as resulting Bayes factors from stepping-stone sampling and harmonic means. It was authored by Moritz Dirnberger and published on figshare under a CC-BY-4.0 license.
This 5.5 KB dataset holds model parameters from the FoMo generative model for analyzing sequential decision-making in visual foraging tasks. Authored by Alasdair D. F. Clarke and last updated in May 2026, it likely contains interpretable parameters for predicting target-by-target selection behavior. It builds upon a published PLOS Computational Biology paper to incorporate spatial structure into foraging analysis.
A meta-analysis of ten randomized controlled trials assesses Ginkgo biloba extract's efficacy and safety for hyperlipidemia. The analysis, sourced from PubMed, Wanfang, and CNKI up to December 31, 2024, compares effects on cholesterol and triglycerides against Western and Chinese medicine. Results include statistical significance for HDL improvement and age- and dose-specific effects.
Jane COZETTE's dataset contains camera trap data for analyzing seasonal and diel activity patterns of Tenrec ecaudatus on Réunion Island. Data were collected across three contrasting habitats between August 2019 and July 2022 and processed into monthly and hourly independent detection events. The dataset includes relative activity indices and environmental covariates such as temperature and rainfall.
Teddy Lazebnik published a list of model variables for simulating ischemic dermal wound healing in May 2026. The dataset likely contains parameters for two mathematical models (a PDE and an agent-based simulation) used to assess oxygen therapy effectiveness. The models focus on keratinocytes, which constitute 90% of epidermal cells.
Sen Wang published a dataset on figshare in May 2026 detailing the isolation and characterization of a high-DHA mutant strain. The data likely contains results from a high-throughput Raman-activated cell sorting screen of 50,000 mutant cells and subsequent omics and fermentation analysis. It describes strain ABBS26, which achieved a final biomass of 147.3 g/L and DHA production of 47.4 g/L.
Giovanni Puccetti's pedagogical article examines common misconceptions about correlation in financial risk modeling. The work uses simplified examples and reproducible R code to demonstrate how misjudging statistical dependence can amplify systemic vulnerabilities, as seen in the misuse of the Gaussian copula during the subprime crisis. It aims to foster critical awareness of quantitative tools among risk managers, regulators, and students.
The VIIRS/NOAA20 Deep Blue Level 3 monthly aerosol product is a 1° x 1° gridded dataset derived from daily satellite observations. It contains 45 Science Data Set layers in netCDF format, and a grid element is aggregated for a month only if it has at least 3 valid days of data. The record starts on January 5, 2018.
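The minimum-valid-days rule in the VIIRS entry above can be illustrated with a small Python sketch. This is not the product's actual processing code: the function name `monthly_mean`, the use of `None` for missing days, and the choice of an arithmetic mean as the monthly statistic are all assumptions made for illustration. Only the threshold of 3 valid days comes from the description.

```python
from typing import Optional, Sequence

# Threshold taken from the dataset description; everything else is hypothetical.
MIN_VALID_DAYS = 3


def monthly_mean(
    daily_values: Sequence[Optional[float]],
    min_valid_days: int = MIN_VALID_DAYS,
) -> Optional[float]:
    """Aggregate one grid element's daily values into a monthly value.

    Days with no valid retrieval are passed as None. The result is None
    (treated as a fill value) when fewer than `min_valid_days` days are valid.
    The arithmetic mean is an assumed statistic, not confirmed by the source.
    """
    valid = [v for v in daily_values if v is not None]
    if len(valid) < min_valid_days:
        return None
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # Three valid days: the grid element is aggregated.
    print(monthly_mean([0.10, None, 0.30, 0.20]))
    # Only two valid days: the grid element is left empty.
    print(monthly_mean([0.10, None, 0.30]))
```

In a real gridded product the same check would be applied per grid element across the whole array, typically with a masked-array or NaN-aware reduction rather than a Python loop.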
This 445.0 MB dataset comes from a study that combines multisource soil data, input-output analysis, and geospatial modelling to examine inequality in soil arsenic pollution burdens across China's provinces from 2000 to 2020. The work, authored by Ziyang Li and shared on figshare, shows that the inequality index in recipient provinces rose 4.6-fold to 0.46, while the index in sending provinces declined. Scenario simulations indicate that land-use optimization could reduce overall inequality by about 25%.
This 5.5 KB dataset contains statistical model summaries for seven upper limb muscles, including m. brachioradialis and m. biceps brachii. Authored by Frida Torell and last updated in June 2026, it reports results from multiple linear regression (MLR) models. It includes metrics such as R² and Q², with a note on statistical significance thresholds.
Glenn T. Schumacher published a dataset on figshare on 2026-05-15 summarizing Bayesian hierarchical linear regression metrics from a stable isotope analysis study. The 5.5 KB XLS file contains results from an analysis of ontogenetic niche shifts in Arctic Charr and Brook Trout populations in four lakes in Maine, USA. The study used eye lens stable isotope analysis to investigate trophic position changes through life.
This de-identified clinical dataset was collected for a study on Non-Invasive Prenatal Testing (NIPT). It includes maternal clinical information, gestational age, cell-free fetal DNA indicators, prenatal screening records, and confirmed fetal karyotype results for trisomies 21, 18, and 13. The data were used to construct multi-model algorithms for optimizing NIPT timing and predicting fetal chromosomal abnormality risk, and they accompany a manuscript submitted to BMC Pregnancy and Childbirth.
This dataset holds archived website analytics for the Lincolnshire Open Data portal, split into three resources covering overall usage, summaries, and individual webpage statistics. The data was collected by the Government Digital Service and is licensed under the Open Government Licence. It specifically excludes API calls and includes only UK-based user traffic.
A figshare case report authored by Pradeep M. K. Nair describes a 75-year-old male patient with advanced muscle-invasive urothelial carcinoma. The report details an integrative oncology protocol combining metabolic therapies, oncothermia, high-dose vitamin C, nutraceuticals, acupuncture, laser therapy, diet, and yoga, followed by oral chemotherapy. The patient's response, including biomarker normalization and improved quality of life, is documented up to May 2026.
Tommaso Costa authored a 17.2 KB document introducing the Bayesian audit, a conceptual framework for evaluating the proportionality of scientific claims to evidence. The framework is applied to a case study re-evaluating the 1996 "elderly priming" study by Bargh et al. The document was uploaded to figshare on 2026-05-22 and is licensed under CC-BY-4.0.
ALPARC is an open-source Python toolbox developed by Lorenzo Titone that generates artificial language syllable streams. The tool creates pseudoword stimuli tailored to the statistics of real languages while controlling for acoustic and phonological rhythmicity confounds. The associated PDF file, last updated in May 2026, describes the tool's functionalities and demonstrates its application for statistical learning studies.
Sara L. McCormack published a dataset on figshare in 2026 for predicting diastereoselectivity in crystallization-induced diastereomer transformations (CIDTs). The data supports a workflow to featurize product structures and construct statistical models, validated on six previously untested substrates. The 3.0 MB dataset is provided in XLSX format.
A 786.4 KB dataset from figshare contains features and models for predicting diastereoselectivity in crystallization-induced diastereomer transformations (CIDTs). The workflow by Sara L. McCormack featurizes product structures to build statistical models, validated on six previously untested substrates. Feature analysis identifies amide identity, conformational compactness, and local electronic properties as key determinants of stereochemical outcome.
This 1.7 KB file of production C++ source code implements a bounded dynamic range sensor clamper for telemetry pipelines. The module enforces physical boundaries at the edge, dropping or clamping mathematically impossible sensor anomalies before they reach downstream estimation filters. It was authored by Jamie Davis and released under CC BY 4.0 in May 2026.
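The drop-or-clamp behaviour described for the sensor clamper can be sketched in a few lines. The sketch below is written in Python for brevity and is not the published C++ module: the names `SensorBounds`, `Policy`, and `sanitize`, and the decision to always drop non-finite readings, are assumptions introduced only to illustrate the idea.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Policy(Enum):
    """What to do with a finite reading outside the physical range (hypothetical)."""
    CLAMP = "clamp"
    DROP = "drop"


@dataclass(frozen=True)
class SensorBounds:
    """Physically possible range for one sensor channel (hypothetical type)."""
    lo: float
    hi: float
    policy: Policy = Policy.CLAMP


def sanitize(reading: float, bounds: SensorBounds) -> Optional[float]:
    """Return a reading that is safe to pass downstream, or None if it is dropped."""
    # NaN and infinities cannot be clamped meaningfully, so this sketch drops them.
    if not math.isfinite(reading):
        return None
    # In-range readings pass through unchanged.
    if bounds.lo <= reading <= bounds.hi:
        return reading
    # Out-of-range readings are either discarded or pulled to the nearest bound.
    if bounds.policy is Policy.DROP:
        return None
    return min(max(reading, bounds.lo), bounds.hi)


if __name__ == "__main__":
    temperature = SensorBounds(lo=-40.0, hi=125.0)  # example limits, not from the source
    print(sanitize(150.0, temperature))         # clamped to the upper bound
    print(sanitize(float("nan"), temperature))  # dropped
```

Clamping keeps the sample rate constant for downstream filters, while dropping avoids feeding them a value that was never measured; which one suits a given pipeline depends on how the estimator handles missing samples.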
A 1.8 MB dataset from figshare, last updated May 2026, accompanies a paper by Claudio Del Sole on competing risks survival analysis. It supports a Bayesian nonparametric model for predicting event types and estimating survival functions. The paper includes simulation studies and an application to clinical datasets.