Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,454 datasets
Alberta municipal data compiled from mandatory annual Financial Information Returns (FIR) and Statistical Information Returns (SIR). The Government of Alberta collects standardized summaries of audited financial statements and key municipal statistics from all municipalities. The dataset was last updated in April 2026.
A simulation study by Philippe P. Hujoel, last updated in April 2026, examines the sensitivity of a 3-sigma lnCVR statistical test for detecting simulated fraud in clinical trial data. The 5.5 KB Excel file contains results where the worst 50–90% of HbA1c scores in an intervention arm were replaced by a best-responder value. The analysis varies the sample size per trial arm to assess the test's detection power.
Philippe P. Hujoel authored a dataset analyzing the sensitivity of a statistical method for detecting simulated fraud in clinical trial data. The dataset, last updated on April 15, 2026, is a 5.5 KB Excel file. It models scenarios where 50% to 90% of the worst HbA1c scores in an intervention arm are replaced by a best-responder value.
Analysis of estuarine sediments from Broad Sound, Queensland, applying Q-mode and R-mode factor analysis, discriminant analysis, and regression. The study identifies geochemical processes controlling concentrations of P2O5, Cu, Pb, and Zn in intertidal and supratidal zones. Research was published by the Australian Ocean Data Network, with a platform record last updated in April 2026.
Statistical data on the greenhouse industry for Ontario and Canada, last updated in April 2026. The dataset includes metrics such as square footage, sales, employee numbers, and selected input costs. It is published by the Government of Ontario.
Statistical significance analysis results from paired t-tests, with p-values indicating significant improvement where values are less than 0.05. The dataset is a 5.5 KB XLS file authored by Jianye Gu and last updated on 2026-05-07. It is shared under a CC-BY-4.0 license on the figshare platform.
Ummer Shakeel published a statistical analysis of model performance on a test set in May 2026. The dataset contains Cohen's Kappa scores for multiple models, stored in a 5.5 KB Excel file. It is licensed for reuse under CC-BY-4.0.
18.9 GB of processed Ethereum blockchain transaction records for ERC20 tokens, supporting a study on statistical patterns. The dataset, created by Kundan Mukhia, provides a structured subset of transaction-level data categorized by interaction type. It was last updated in April 2026.
FinProof v1 is an open adversarial benchmark designed to test AI guardrail systems in banking, financial services, and insurance. It covers 7 attack categories across professional and retail conversational registers and was published by Zytra under a CC BY 4.0 license. The dataset was last updated on Hugging Face in May 2026.
Barak Hadad's dataset contains 938.9 MB of processed neural data files required to reproduce results from the study "Auditory network persistence of stimulus representation in awake and naturally sleeping mice." It includes pre-processed neural activity, decoding outputs, and statistical summaries, last updated on April 27, 2026. Raw electrophysiological recordings are not included but are available upon request.
623 stream sites in southwestern Yukon provide geochemical data for 36 elements in sediments, plus water measurements for uranium, fluoride, and pH. This dataset was compiled by the Government of Yukon and published as GSC Open File 2859/EGSD Open File 2001-11(D). The data was last updated on the platform in April 2026.
Beginning in 2006, this dataset tracks monthly counts of approved public assistance applications across New York State's Local Social Services Districts. It provides separate totals for Family Assistance and Safety Net Assistance case openings, mirroring annual statistics published in official state reports. The data is maintained by the New York State Office of Temporary and Disability Assistance.
Nigeria conflict data from 1997 to 2023 combines spatial econometrics with a greed-grievance framework. The analysis includes 14 variables across four types of conflict, using Bayesian spatial methodology and INLA-SPDE techniques. The dataset was authored by Juan José Villar-Roldán and is hosted on the Political Science Research Methods Dataverse.
This Geoscience Australia Data record presents a theoretical framework for representing isostatic processes using mathematical filters called admittance functions. It develops cross-spectral techniques that relate gravity and topography, with examples for elastic and visco-elastic rheologies. The record was last updated on April 20, 2026.
Statistical results from datasets covering multiple cell types, including bladder, kidney, frozen and fresh tumor, and mouse cortical cells. The dataset was authored by Xiran Chen and is available under a CC-BY-4.0 license. It was last updated on May 6, 2026.
Feng Li's dataset provides statistical evaluation metrics for models simulating soil moisture and salinity. The dataset is stored in an XLS file with a size of 5.5 KB and was last updated on May 6, 2026. It is licensed under CC-BY-4.0 and hosted on the figshare platform.
62,555 dementia cases and 312,772 matched controls from nationwide Finnish health registries were analyzed to examine the role of 29 hospital-treated diseases in the association between severe infections and dementia. The study, authored by Pyry N. Sipilä and published on figshare, identified that the increased dementia risk from two infectious diseases was not attributable to 27 other comorbid conditions. The dataset, last updated in March 2026, contains the statistical codes used for this analysis.
520,000 error traces document models' mistakes during math problem synthesis. The dataset includes updates from March 2026, where 50,000 new datapoints were added and 10,000 older ones were replaced with higher-quality synthetic questions verified by a 12-consensus tool. It was authored by nguyen599 and last updated on Hugging Face in May 2026.
Simone Dilaria's dataset contains the results of a Linear Discriminant Analysis (LDA) performed on volcanic clast samples. The analysis probabilistically assigns samples to sources between the Villa Draghi and Via Scagliara di M. Castellone quarries, using a discriminant function considered statistically significant at a 95% confidence level (p-value < 0.05). The dataset was last updated on April 13, 2026.
Linear Discriminant Analysis results probabilistically assigning volcanic clast samples to sources between the Villa Draghi and Via Scagliara di M. Castellone quarries. The dataset, created by Simone Dilaria, is a 5.5 KB Excel file last updated in April 2026. Discriminant functions with a p-value < 0.05 are considered statistically significant at the 95% confidence level.