Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,459 datasets
A simulation experiment from Geoscience Australia compares statistical and mathematical techniques for spatial interpolation of seabed mud content. The study analyzes factors including regions, sample densities, and secondary variables like bathymetry and distance-to-coast to assess prediction accuracy using cross-validation metrics. It identifies a novel combined method, random forest and ordinary kriging (RKrf), as the most robust, achieving up to 17% lower relative mean absolute error than a control method.
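For readers unfamiliar with the metric, relative mean absolute error is commonly computed as the mean absolute error divided by the mean of the observed values, expressed as a percentage. The sketch below assumes that definition; the function name and sample values are hypothetical and are not taken from the dataset.

```python
def relative_mae(observed, predicted):
    """Relative mean absolute error (%), assuming RMAE = MAE / mean(observed) * 100."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    # Mean absolute difference between predictions and observations
    mae = sum(abs(p - o) for o, p in zip(observed, predicted)) / n
    mean_observed = sum(observed) / n
    return 100.0 * mae / mean_observed


# Hypothetical mud-content values (%), for illustration only
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
print(round(relative_mae(observed, predicted), 2))
```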
Estimated numbers and percentages of prevalent TB cases across urban and rural areas in 26 study countries from 2000 to 2024. Seyed Alireza Mortazavi created this dataset using a Bayesian multivariate regression model to estimate incidence and case detection ratios. The data was last updated in April 2026.
Supplemental material for the 2025 SOUPS paper by Tang et al. contains two files: a list of Symposium on Usable Privacy and Security (SOUPS) papers considered in the study and a table detailing the statistical tests and associated statistics examined. The dataset supports analysis of methodological reporting and interpretation in the usable privacy and security research domain.
A global statistical summary of column aerosol optical depth at 555 nanometers and monthly aerosol type frequency, averaged daily. The National Aeronautics and Space Administration produced this data from the Multi-angle Imaging SpectroRadiometer (MISR) instrument, which uses nine cameras at different angles to achieve global coverage in nine days. Data collection for this product is complete.
234.9 KB of raw experimental data underpins the statistical analysis of tower crane nonlinear swing behavior under stochastic wind excitation. The dataset was authored by Yu Sun and published on figshare in April 2026. It contains the measurements used to model load coupling mechanisms and varying windward pressures on flexible tower jibs.
Final training setups for baseline models compiled by Nadine Francis and uploaded to figshare on 2026-04-09. The 5.5 KB Excel file retains hyperparameters from original papers and marks common defaults. Missing values for optimizer, learning rate, schedule, momentum, or weight decay are indicated.
Survey data from 825 Chinese undergraduate students, collected by Jin Zhang. The data supports a serial mediation analysis of core personality traits, fear of missing out, perceived self-efficacy, perceived self-relatedness, and short video dependence under the I-PACE model. The dataset was last updated on March 17, 2026.
Md Mehedi Hasan Santo published a statistical significance summary comparing the MaxGRNet model to baselines on dataset DS1. The summary includes results from 5-fold paired tests with Holm correction for prediction and per-image paired tests for explainability metrics. The dataset was uploaded to figshare on April 8, 2026.
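As a rough illustration of the Holm correction mentioned in this summary, the sketch below applies the standard Holm step-down adjustment to a list of p-values. The function name and the example p-values are hypothetical; the dataset's own test procedure may differ in detail.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1)
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values from three paired comparisons
print(holm_adjust([0.01, 0.04, 0.03]))
```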
1,000 Chinese mathematical reasoning problems annotated with binary true/false labels and detailed rationales. The dataset was introduced in a research paper and is hosted on Hugging Face by author SallyTan. It was last updated on 2026-04-22.
Government of Ontario data on supervised child visits and exchanges for separated families with safety concerns. The dataset includes statistical reporting from program centres, last updated in March 2026. Row and column counts are unspecified.
Statistical models analyze extreme storm events on Australia's eastern and southern coast, focusing on storm clustering. The analysis uses a 30-year time series of observations, applying a peaks-over-threshold approach with a 2.93-meter wave height threshold. Models include non-homogeneous Poisson processes and copula functions for joint distribution of storm variables.
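A peaks-over-threshold selection can be sketched as follows: each run of consecutive observations above the threshold is treated as one storm event, and its maximum is kept as the event peak. The 2.93-meter threshold comes from the description above; the run-based event definition, the function name, and the sample wave heights are assumptions made for illustration, and the study's actual declustering rule is not stated here.

```python
def storm_peaks(wave_heights, threshold=2.93):
    """Return the peak wave height of each run of consecutive exceedances."""
    peaks = []
    current_peak = None  # None means we are not inside a storm event
    for height in wave_heights:
        if height > threshold:
            # Start a new event or update the peak of the ongoing one
            current_peak = height if current_peak is None else max(current_peak, height)
        elif current_peak is not None:
            # The event ended; record its peak
            peaks.append(current_peak)
            current_peak = None
    if current_peak is not None:
        # The series ended while still inside an event
        peaks.append(current_peak)
    return peaks


# Hypothetical significant wave heights in meters
print(storm_peaks([1.0, 3.1, 3.5, 2.0, 2.95, 1.2, 4.0]))
```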
290 prediabetic subjects aged 20–60 years were assessed for serum vitamin D levels, leukocyte telomere length (LTL), telomerase activity (TA), and genetic polymorphisms in ACTN3, FOXO3A, VDR, SIRT1, and MSTN. The cross-sectional study found a modest positive correlation between vitamin D and LTL and identified significant gene-gene interactions among SIRT1, MSTN, VDR, and FOXO3A. The data was collected and analyzed by Surya Prakash Bhatt and published in 2026.
A Vietnamese dataset for Direct Preference Optimization and Supervised Fine-Tuning designed to train empathetic customer service chatbots. The data was generated using a four-stage pipeline with the Gemini 2.5 Flash model on Vertex AI. It was created by author 'thanhhoangnvbg' and last updated on April 26, 2026.
Simulated biomedical data explores characteristic parameter variability across combinations of atrial fibrosis degree and type. The table provides average and standard deviation values for parameters like EGM duration, deflection count, and peak-peak amplitude. Author Josue Nataren Moran published this synthetic dataset in 2026.
MATLAB-validated models apply Linear Programming and heuristic algorithms like DAOA and GWO to minimize daily operational costs for hybrid renewable microgrids. The dataset, 284,849 bytes in size, contains realistic simulation data for photovoltaic, wind, and battery storage systems. Its findings address cost and reliability challenges in digitized power networks.
A 5.5 KB Excel file containing frequencies of mathematical concepts and expressions used by prospective teachers in the EEPrT and EEPoT assessments. The dataset was authored by Neslihan Usta and last updated on 2026-04-17. It is licensed under CC-BY-4.0 and hosted on figshare.
A dataset listing mathematical concepts or expressions and their frequencies, as used by prospective teachers in the EEPrT and EEPoT. The dataset is 9.5 KB in size, authored by Neslihan Usta, and was last updated on 2026-04-17. It is available under a CC-BY-4.0 license.
A dataset titled 'Statistical Comparison of Dimensional Measurements' is available on figshare. It was authored by Vishal V. Shukla and last updated on April 17, 2026. The dataset is stored in an XLS file format and is licensed under CC-BY-4.0.
113 volatile flavor compounds, including 14 aldehydes, 38 alcohols, 37 esters, 11 ketones, and 13 phenols, were identified in fermented xuecai. The study correlates these compounds with microbial communities like Companilactobacillus and Weissella, finding optimal quality after 32 days of fermentation. It was authored by Mengqin Cao and shared on figshare in 2026.
113 volatile flavor compounds, including 14 aldehydes, 38 alcohols, 37 esters, 11 ketones, and 13 phenols, were identified in fermented xuecai (Brassica juncea). The dataset captures relationships between microbial communities, physicochemical properties, and flavor compounds, revealing optimal quality after 32 days of fermentation. It was created by Mengqin Cao to investigate the microbial-driven mechanism of flavor development.