Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,470 datasets
An R package for statistical procedures in agricultural research, originally presented in a Master's thesis at the National Engineering University (UNI) in Lima, Peru. It offers functionality for planning experimental designs such as lattice, alpha, cyclic, and factorial designs. The package also provides analysis facilities for treatment comparisons, non-parametric tests, and biodiversity indexes.
Research data from the National Oceanic and Atmospheric Administration focusing on the development of optimal grow-out diets for sablefish (Anoplopoma fimbria). The study uses a novel statistical mixture model and response surface analysis to test the effects of dietary protein, lipid, and digestible carbohydrate on fish growth and feed conversion efficiency. Fish in the experiments may be PIT tagged and regularly measured for length and weight.
Research data from the National Oceanic and Atmospheric Administration focuses on optimizing dietary protein, lipid, and carbohydrate levels for sablefish aquaculture. The study uses a novel statistical mixture model and response surface analysis to test commercially viable feed formulations. Raw data on rearing densities, tank conditions, water temperature, mortalities, ration, and feed size may be available.
ColQwen3.5 Optimization Trail contains 776+ MTEB evaluation results from the development of three visual document retrieval models. The dataset, published by athrael-soju, captures the full development process including seeds, ablations, and variants for models using ColBERT-style late interaction with Qwen3.5-VL.
GLC_FCS30D is a global land-cover monitoring product developed by Liangyun Liu of the Chinese Academy of Sciences. It provides 30-meter resolution data for 35 land-cover categories from 1985 to 2022, with updates every 5 years before 2000 and annually thereafter. The product was validated to achieve an overall accuracy of 80.88% for a 10-category system and 73.24% for a 17-category system.
META-R is a set of R programs for statistical analysis of plant breeding trials. It calculates BLUEs, BLUPs, genetic correlations, and broad-sense heritability, and can generate boxplots and histograms. The software was authored by Gregorio Alvarado and includes a graphical Java interface for user interaction.
Seven types of evidence indicate that subjective well-being contributes to better health and longevity. The review by Ed Diener of the University of Illinois Urbana-Champaign synthesizes prospective longitudinal studies, experimental research, and naturalistic studies. It discusses causality, effect sizes, and the controversial link between well-being and longevity in populations with certain diseases.
Country-year data for 2012-2023 on tuberculosis treatment outcomes, processed for Bayesian model comparison. The dataset originates from the World Health Organization and covers global TB treatment success rates. It is intended for comparing statistical models.
Ukrainian state statistical observations are described in this metadata collection. The dataset is provided by the States site of Ukraine and was last updated on March 5, 2026. It is available in JSON and Excel XLSX formats.
Chun-Hui He's research paper presents a mathematical model for the Fangzhu, an ancient Chinese device for collecting water from air. The model elucidates the device's possible surface-geometric properties and identifies key factors affecting its effectiveness. The dataset likely contains parameters and results from this mathematical analysis of the hydrophilic-hydrophobic hierarchical surface.
A dataset from OpenML by Varsha Pandey concerning a premium club's customer membership. The description indicates the club has faced significant membership cancellations in recent years and aims to use statistical methods to identify at-risk customers. The dataset's specific scale, such as row count and column details, is unknown.
A balanced dataset for predicting customer churn in the European telecommunications sector. The data was used as a pre-qualification challenge for the 2018 Data Science Nigeria bootcamp and hackathon. It is described as an academic dataset intended to help participants classify potential customers who might leave their service provider.
A dataset from OpenML by Varsha Pandey concerning a premium club's customer membership. The description indicates the club has faced membership cancellations in recent years and aims to use statistical methods to identify at-risk customers. The dataset's specific size, time range, and geographic scope are not detailed.
A balanced dataset used to help participants classify potential customers who might churn. The challenge was part of the pre-qualification for the 2018 Data Science Nigeria all-expenses-paid bootcamp and hackathon, scheduled for October 10-15, 2018. The data is described as academic and focuses on the business problem of customer retention in telecommunications.
The Mallard Model is a stochastic computer model from CEOS_EXTRA, hosted on NASA EarthData. It is designed to aid waterfowl managers in predicting outcomes for various management scenarios to maximize mallard and upland nesting waterfowl productivity. The dataset's specific size, format, and update history are not provided.
Deschampsia antarctica, one of two native flowering plants in Antarctica, is the subject of this biochemical study. The data includes analyses of leaf biochemistry, lipids via HPLC, antioxidants, and statistical comparisons with related species. It was produced by the organization SCIOPS and sourced from NASA's Earthdata platform.
The M2M Thematic Programme, funded by the British Geological Survey, produced data from 17 scientific investigation projects on fluid flow in heterogeneous rock. Research focused on scaling relationships, quantification of flow properties, statistical models, and rock-flow interactions across spatial and temporal scales.
A project dataset from the UKCCSRC Call 2 grant (UKCCSRC-C2-218) focused on developing a cutting-edge CO2 flow measurement system for carbon capture and storage pipelines. The British Geological Survey led research incorporating multi-modal sensing and statistical data fusion techniques, including sensors for differential pressure, ultrasonic, Coriolis, temperature, pressure, and electrical impedance. Experimental work tested the system under controlled conditions resembling practical CCS operations.
A 37-day sub-seabed CO2 release experiment assessed impacts on benthic macrofauna across four sampling zones. The study, published in the International Journal of Greenhouse Gas Control, documents rapid community changes during the leak and recovery 18 days post-injection. Data includes macrofaunal community structure and diversity metrics from zones at 0 m, 25 m, 75 m, and 450 m from the leak center.
Ardmucknish Bay on the Scottish west coast was the site of a controlled sub-seabed CO2 release experiment from May to October 2012. The study deployed three pCO2 sensor technologies alongside instruments measuring oxygen, temperature, salinity, and currents to monitor leakage. Researchers used a multivariate statistical approach to distinguish natural forcing from CO2 release signals.