Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,452 datasets
A working paper record from the Reality Drift Archive, last updated on 2026-04-26, introduces a set of conceptual terms for analyzing digitally mediated environments. It groups terms like Filter Fatigue, Synthetic Realness, and the Optimization Trap to describe patterns in algorithmic mediation and cultural systems. The document is retained as an exploratory archive for descriptive and comparative purposes.
Sediment sources to the Fitzroy River coastal zone have been identified and quantified using an integrated geochemical and modelling approach. Geochemical data indicate a sediment composition consistent with derivation from mixed catchment sources, with a Bayesian statistical model revealing changes in catchment sediment sources over time. The dataset is provided by the Australian Ocean Data Network and was last updated on 2026-04-16.
Fisheries and Oceans Canada provides yearly spawning stock biomass estimates for Atlantic Cod in the southern Gulf of St. Lawrence (NAFO 4T-4Vn) from November to April. The data, produced via a Statistical Catch-at-Age model and Markov Chain Monte Carlo simulations, includes median estimates and uncertainty percentiles (2.5th, 25th, 75th, 97.5th) measured in thousands of tons. These estimates support stock assessment and fisheries management decisions.
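The percentile summary described in the cod entry above (a median plus the 2.5th, 25th, 75th, and 97.5th percentiles) can be sketched as follows. This is a minimal illustration, not the assessment's actual code: the function names are hypothetical, and the draws are synthetic placeholders standing in for MCMC output, not values from the dataset.

```python
import random


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)                              # index at or below the target position
    hi = min(lo + 1, len(sorted_vals) - 1)     # neighbouring index, clamped to the end
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def summarize_draws(draws):
    """Return the median and uncertainty percentiles for one year's posterior draws."""
    s = sorted(draws)
    return {q: percentile(s, q) for q in (2.5, 25, 50, 75, 97.5)}


# Synthetic stand-in for posterior draws of spawning stock biomass
# (thousands of tons); purely illustrative, NOT real estimates.
rng = random.Random(0)
fake_draws = [rng.lognormvariate(4.0, 0.3) for _ in range(4000)]
summary = summarize_draws(fake_draws)
```

Each year in a series like this would get one such summary row, with the 2.5th and 97.5th percentiles bounding a 95% interval around the median.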
A 30-year timeseries of observations from the eastern and southern coast of Australia was used to extract independent storm events. The dataset contains multivariate summary statistics for extreme storm events, including maximum significant wave height, wave period, direction, duration, peak storm surge, and time of occurrence. This data was produced by Geoscience Australia as part of the Bushfire and Natural Hazards CRC Project, with preliminary results from a study site on the central coast of New South Wales.
This dataset compares the Remaining Useful Life (RUL) prediction performance of a proposed framework with that of representative deep learning models under sensor fault conditions. It was authored by Dongdong Tang and last updated on 2026-04-29. Performance is evaluated using RMSE, the standard C-MAPSS scoring function, and statistical significance.
Dongdong Tang published a 5.5 KB Excel file on figshare in April 2026. The dataset compares the performance of a proposed framework against representative deep learning models for predicting Remaining Useful Life (RUL) under sensor fault conditions. Evaluation metrics include RMSE, the standard C-MAPSS scoring function, and statistical significance.
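Two of the metrics named in the RUL entries above can be sketched as follows. The scoring function uses the asymmetric exponential form commonly associated with the C-MAPSS benchmark, which penalizes late predictions more heavily than early ones. It is assumed here that the dataset uses this usual definition. The function names and sample values are hypothetical and are not taken from the file.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted RUL values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def cmapss_score(y_true, y_pred):
    """Asymmetric score: sum of exp(-d/13) - 1 for early predictions (d < 0)
    and exp(d/10) - 1 for late predictions (d >= 0), where d = predicted - true."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    total = 0.0
    for t, p in zip(y_true, y_pred):
        d = p - t
        total += math.exp(-d / 13.0) - 1.0 if d < 0 else math.exp(d / 10.0) - 1.0
    return total


# Illustrative values only: one engine predicted 10 cycles late, one 10 cycles early.
late = cmapss_score([100], [110])
early = cmapss_score([100], [90])
```

Lower is better for both metrics. A late prediction (overestimating remaining life) costs more than an equally large early one, reflecting the higher operational risk of an unexpected failure.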
A dataset discloses lead optimization results for a series of METTL3 inhibitors, culminating in the discovery of compound EP102. The data, published by Guillaume Dutheuil on figshare in April 2026, includes information on structural modifications that improved oral bioavailability and decreased lipophilicity. EP102 has demonstrated efficacy in mouse tumor models and has entered clinical development for advanced solid tumors.
Raw data used for statistical analysis in the research by Kisło et al., published in 2026. The dataset is 38.5 KB in size and is available in CSV format under a CC-BY-4.0 license. It was last updated on April 28, 2026.
20,000 training and 1,000 test trajectories for each of three partial differential equations (Navier-Stokes, Kuramoto-Sivashinsky, Gray-Scott) simulated via Exponax. The dataset was created by author 'yijingz' for the paper 'Breakeven complexity: A new perspective on neural partial differential equation solvers' and was last updated on Hugging Face in May 2026.
Ontario's Office of the Registrar General provides annual counts of completed proof of registration requests from 2010 onward. Data includes requests for certificates like birth or marriage records, covering both residents and non-residents. The Government of Ontario releases this information under the Vital Statistics Act.
SRA-Bench is a benchmark dataset for evaluating skill-retrieval-augmented large language model agents, created by WeihangSu and last updated on April 22, 2026. It contains 5,400 test instances and a skill library of 26,262 skills, of which 636 are gold skills and 25,626 are web-collected distractors. The dataset includes sub-benchmarks like TheoremQA and LogicBench for specific reasoning tasks.
Wanglanhuajiaofen annotated the MS MARCO dataset for utility using the Qwen3-32B model. The dataset supports research on multi-positive optimization objectives for dense retrieval. It was last updated on 2026-05-19.
Robert P. Law published a dataset on figshare in 2026 detailing a pilot project for automated molecular design. The 5.8 KB CSV file documents the application of the BRADSHAW platform to optimize inhibitors of Endoplasmic Reticulum Aminopeptidase 1 (ERAP1), a target in cancer immunotherapy and autoimmune diseases. The work involved four iterations of generative design, machine learning model refinement, and multiparameter optimization.
A dataset summarizing treatment sessions, likely for medical or behavioral interventions. It was authored by Yucheng Li and last updated on May 8, 2026. The data is stored in a RAR archive with a size of 247.7 KB.
A 5.5 KB Excel file containing p-values for statistical significance analysis. The data compares a specific approach against state-of-the-art methods across three distinct experimental scenarios. Authored by Mohammadamin Moragheb and last updated on 2026-05-08, it is shared under a CC-BY-4.0 license.
A 5.5 KB Excel file from figshare contains statistical parameters and model adequacy metrics for a factorial design. Author Nilima Thombre last updated the dataset on 2026-05-08. The data likely support analysis of experimental design outcomes.
Compound 65, a pyrimido[4,5-b]indole derivative, demonstrated potent broad-spectrum activity against multidrug-resistant Gram-negative bacteria without detectable hERG liability. The dataset contains results from a structure-based design campaign to overcome pharmacokinetic limitations of earlier compounds. Yuzhi Liu published this data on figshare in April 2026.
Published in 2026 by Yuzhi Liu, this dataset contains 3D molecular structure files for a series of pyrimido[4,5-b]indole derivatives designed to combat multidrug-resistant Gram-negative bacteria. It includes the structure of the lead compound, designated 65, which demonstrated potent antibacterial activity and improved pharmacokinetic properties.
A research paper proposes a method for managing transmission line congestion in electricity markets. The method uses line flow sensitivity factors and particle swarm optimization to find the optimal location and parameter setting for a Thyristor Controlled Series Capacitor (TCSC). The proposed method is tested on the IEEE 30-bus system, IEEE 118-bus system, and a 33-bus Indian network.
204 compounds from the Natural Product Set IV of the National Cancer Institute Developmental Therapeutics Program were screened for activity against the Zika virus RNA-dependent RNA polymerase. The dataset, authored by Vanessa Aitken and last updated in April 2026, contains results from an optimized malachite green colorimetric assay. It identifies two preliminary modulator compounds, purpurogallin and digallic acid.