Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,446 datasets
Kinematic data derived from high-speed video recordings of rats reaching for pellets. The dataset includes raw input data, processed output data, and statistical values and tables, created by Andrew Spence and last updated in May 2026. It was collected to study descending brain circuit remodeling after injury to inform better treatments.
A Geoscience Australia study of sediment distribution in Keppel Bay, a macrotidal environment interfacing the Fitzroy River catchment with the Great Barrier Reef shelf. The study classified seabed sediments into five distinct classes using sediment sampling, acoustic seabed mapping, and statistical techniques. The dataset was last updated on 2026-04-30.
Novel high-throughput experimentation sampling strategies require only 25% of the experiments compared to full factorial designs. Vincent Porte authored this dataset, which applies frugal sampling to four challenging metal-catalyzed cross-coupling reactions. The dataset was last updated on April 10, 2026.
A mathematical analysis of forest structural dynamics by Jian Zhou of Peking University. It demonstrates the significant impact of tree growth variance on forest size structure predictions, resolving a mismatch between observation and prior theory. The work identifies an asymptotically power-law relationship between tree size and growth rate variance.
Daniel T. Webb published a dataset on figshare in 2026 describing first-in-class Activin receptor-like kinase 2 (ALK2) degraders. The data includes information on compounds M4K3233 and M4K3250, developed as chemical tools for studying ALK2 degradation in diseases like fibrodysplasia ossificans progressiva and glioblastoma. The dataset is 1.6 KB in size and is available in CSV format.
18 spatial interpolation methods were compared for predicting seabed sand content within the Australian Exclusive Economic Zone (AEEZ). The study, using samples from Geoscience Australia's Marine Samples Database extracted in August 2010, found RFIDS and RFOK were among the most accurate methods across three tested regions. Model averaging further improved prediction accuracy, with the most accurate methods reducing error by up to 7%.
A dataset related to the Clusterpath estimator of the Gaussian Graphical Model (CGGM), a method for variable clustering in graphical models. The dataset, authored by D.J.W. Touw and last updated on 2026-04-09, includes files such as TXT, ODS, PDF, R, MD, CSV, GZ, RDATA, GITIGNORE, and RPROJ, totaling 101.0 MB. It supports a convex optimization approach that encourages block-structured precision and covariance matrices.
GeoTASO instrument data from the NASA B200 aircraft provides nitrogen dioxide (NO2) and formaldehyde (HCHO) trace gas slant column measurements over South Korea. This dataset was collected during the May-June 2016 KORUS-AQ field campaign, a joint study by NASA and Korea's National Institute of Environmental Research. It features coordinated airborne sampling up to 8 km altitude, focusing on the Seoul Metropolitan Area and surrounding waters.
A simulation experiment compares statistical and mathematical techniques for interpolating seabed mud content. The study, conducted by Geoscience Australia and published via the Australian Ocean Data Network, evaluates methods like random forest and ordinary kriging using cross-validation metrics. It was last updated in April 2026.
A figshare dataset by Xue Yuan, last updated April 2026, containing molecular structure data for a series of colony-stimulating factor 1 receptor (CSF1R) inhibitors. The data, stored in PDB format, relates to compounds like C52, which were designed for treating acetaminophen-induced acute liver injury. The dataset size is 365.9 KB.
Yu Zhang published a dataset on figshare in April 2026 detailing the design and optimization of a novel series of pyridazinone-based MAT2A inhibitors. The data likely contains results from structure-activity relationship studies, including IC50 and GI50 values for lead compounds. The dataset is 3.0 KB in size and is available in CSV format.
A methodological paper and associated data propose a novel approach for summarizing posterior inference in nonparametric Bayesian mixture models. The work, authored by Khai Nguyen and available on figshare, was last updated in April 2026. It introduces two variants of sliced Wasserstein distance for Gaussian mixtures to estimate mixing measures and validate partitions.
An anonymized and cleaned dataset used for a statistical analysis study. The data includes all variables collected from survey responses and is provided as a 17.4 KB XLSX file. It was authored by Abdullrahman Mohammed Alshehri and last updated on May 21, 2026.
79.9 KB of tabular data from figshare, authored by Ilju Yang and last updated on 2026-04-13. The dataset contains measurements of male damselfly morphology, including tibia area and abdomen length, alongside daily pairing success records and environmental variables like temperature and UV level.
Fit of the elements in the final model predicting effects of pH and salinity under different temperatures and moisture levels. The dataset, authored by Kanokporn Chaianunporn, is a 5.5 KB XLS file last updated on 2026-05-18. Asterisks in the data indicate statistical significance at α = 0.05.
Fit statistics for a final model predicting the effects of soil iron content, salinity, temperature, and moisture levels. The dataset includes indicators for statistical significance at α = 0.05. Authored by Kanokporn Chaianunporn and last updated on 2026-05-18.
9.5 KB of statistical model fit data for predicting the effects of soil carbon-to-nitrogen ratio and salinity under different temperature and moisture levels. The dataset was authored by Kanokporn Chaianunporn and last updated on May 18, 2026. Asterisks in the data indicate statistical significance at α = 0.05.
Pragya Singhal published a dataset on figshare containing psychological health and WHOQOL scores. The data includes mean and standard deviation scores for the Physical, Psychological, Social, and Environment domains across three groups: cases, caregivers, and college students. It also contains Kruskal-Wallis test statistics for comparing these groups.
24 prioritized derivatives of the CHI3L1-binding molecule G721-0282 were generated through virtual screening and structure-guided optimization. The dataset includes biophysical analysis identifying lead compound G721-0377, which has a binding affinity (Kd) of 45 μM. The data was authored by Baljit Kaur and last updated on 2026-04-28.
Structure-activity relationship data for 24 small-molecule derivatives of the CHI3L1-binding compound G721-0282, optimized for Alzheimer's disease research. The dataset includes biophysical binding affinity measurements, with the lead compound G721-0377 showing a Kd of 45 μM. It was created by Baljit Kaur and published on figshare in April 2026.