Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,449 datasets
A small dataset from figshare lists the thresholds used for post-hoc filtering in Bayesian models and counts the cases categorized as ambiguous, which were excluded from successful sex estimation. The dataset, authored by Lukas Waltenberger, is a 5.5 KB XLS file last updated in May 2026. It is shared under a CC-BY-4.0 license.
A 2012 pilot study by Kent and Essex IFCA tested commercial whelk pots with varying drain hole sizes to reduce undersized catch. Pots were deployed in strings of ten for two 48-hour soak periods, and the captured whelks were measured along their widest and shortest axes. The results indicated a small positive correlation between hole size and the shell height of the whelks retained.
729 experimental observations from 81 treatment combinations for plant tissue culture media optimization. The dataset contains 4 input variables and 4 response variables related to plant growth. It was authored by Hans Bethge and last updated on 2026-04-30.
Individual data points used for statistical analysis of catheter tip locations and operator performance. The dataset is an Excel file (XLSX) with a size of 32.6 KB, authored by Jung Won Kwak and last updated on May 14, 2026. It is shared under a CC-BY-4.0 license on the figshare platform.
speciateIT vSpeciateDB Models are compressed directories containing Markov chain models for classifying rRNA gene amplicon sequences. The models target the V1V3, V3V4, or V4 regions from the human vaginal microbiota. The dataset is 643.3 MB, authored by Johanna Holm, and was last updated on May 14, 2026.
Q-mode factor analysis classified estuarine sediment samples from Broad Sound, Queensland, into two geologically distinct groups representing intertidal and supratidal deposition. The dataset, hosted by the Australian Ocean Data Network, contains results from statistical and mathematical analyses identifying processes controlling concentrations of P2O5, Cu, Pb, and Zn. The data, last updated on 2026-04-10, explore trace element behavior under varying pH and Eh conditions.
Emily Tufano's dataset provides the statistical source data and outputs for figures in a preclinical study. The 112.2 KB Excel file contains statistical test details, exact p-values, groupwise n, and descriptive statistics for each figure panel. It was last updated on April 20, 2026.
Wilcoxon rank-sum test results comparing micro-CT-scanned and unscanned petrous bones across key molecular parameters. The dataset includes results for sample sizes of 50 scanned and 43 unscanned bones for most parameters, and 13 scanned versus 18 unscanned for nuclear contamination in XY individuals. It was authored by Lumila Paula Menéndez and last updated on figshare in April 2026.
5,000 high-quality reasoning examples designed to mirror the chain-of-thought reasoning traces of Kimi 2.6 Thinking. The dataset was created by author gss1147 and last updated on May 25, 2026. Each example includes an instruction, a step-by-step reasoning trace, a final response, and domain classification.
Raw and processed data supporting a study on the antibacterial activity of actinomycetes isolated from Nepalese soil. The dataset was generated by Sudesh Kharel as part of a project on fermentation optimization and is available under a CC-BY-4.0 license. It includes two Excel files, a Word document, and a README file explaining variables and equations.
A retrospective study of 801 patients with unresectable locally advanced esophageal squamous cell carcinoma treated at a Chinese hospital from 2014 to 2023. The research quantifies statistical cure using a mixture cure model, estimating a 10.4% cure fraction and a 6.7-year cure point, with external validation on 5,000 matched patients from the SEER database.
A mathematical model from Jiangsu University investigates HIV-1 infection dynamics in CD4+ T-cells under antiviral therapy. The research applies three computational schemes—the modified Khater method, extended simplest equation method, and sech–tanh method—using the Atangana–Baleanu fractional derivative operator to find analytical wave solutions. The stability of solutions is analyzed via Hamiltonian system characteristics, and results are compared to prior work.
A linear dynamic range from 10 mg L⁻¹ to 100 mg L⁻¹ for protein detection is reported. The dataset describes a fluorescent optrode sensor based on a diketopyrrolopyrrole derivative, tested for total protein determination in urine samples from healthy individuals. Results were validated against a Ponceau-S/TCA spectrophotometric method by Marta Veríssimo from the University of Aveiro.
A working paper from the Reality Drift series published in 2025. The document examines the 'Optimization Trap,' a condition where efforts to improve measurable performance interfere with a system's underlying functions. It draws on examples from loyalty programs, staffing models, and content systems to analyze trade-offs between efficiency and system fidelity.
Open data and interactive visualizations from the UK Office for National Statistics. The data explorers and calculators are created by ONS statisticians and digital specialists to make government data relevant to everyday life. The dataset is published under an Open Access (diamond) license.
A research paper proposes the Optimized Distance Range Free (ODR) algorithm for node localization in wireless sensor networks. The algorithm modifies the DV-Hop method by rectifying hop size errors and using linear optimization to improve accuracy without extra hardware or communication overhead. The work is authored by Sumit Kumar of Maharishi Markandeshwar University and is available via Open Access.
Compound 65, a pyrimido[4,5-b]indole derivative, demonstrated potent broad-spectrum activity against multidrug-resistant Gram-negative bacteria without detectable hERG liability. The dataset contains results from a structure-based design campaign to overcome pharmacokinetic limitations of earlier compounds. Yuzhi Liu published this data on figshare in April 2026.
An R script used to perform all statistical analyses of choice latency reported in the supplementary materials for a study on temporal consistency of judgement biases in bumblebees. The analyses were conducted on the full dataset without excluding values greater than 1.5 times the interquartile range. The script was authored by Luigi Baciadonna and is available under a CC-BY-4.0 license.
Geoscience Australia's Marine Samples Database provided data for a 2010 study comparing methods to predict seabed sand content across the Australian Exclusive Economic Zone (AEEZ). The research evaluated 18 spatial interpolation and machine learning methods, including RFIDS and RFOK, across three distinct regions. Model averaging was found to improve prediction accuracy, with the best methods reducing error by up to 7%.
9.5 KB of summary data on function-level statistical significance, comparing RL-DE with baseline algorithms. The dataset, authored by Yang Cao, is available in XLS format and was last updated on May 13, 2026. It is shared under a CC-BY-4.0 license on figshare.