Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,485 datasets
An R package implementing Bayesian treed Gaussian process models for nonstationary, semiparametric nonlinear regression and experimental design. The software includes special cases like Bayesian linear models, CART, and stationary GPs, and provides functions for visualization, sensitivity analysis, and sequential design. It was authored by Robert B. Gramacy, with methodology detailed in the Journal of Statistical Software in 2007 and 2010.
stuart provides tools for constructing subtests from a pool of items using algorithmic methods such as ant colony optimization, genetic algorithms, brute force, or random sampling. It is associated with the work of Martin Schultze and a 2017 publication. The provided metadata does not detail the data content, scale, or structure.
blme is an R package for maximum a posteriori estimation of linear and generalized linear mixed-effects models in a Bayesian framework. It implements the methods of Chung et al. (2013) and extends the popular 'lme4' package by Bates et al. (2015). The package was authored by Vincent Dorie.
correlation is a lightweight R package for computing various correlation measures, developed by Dominique Makowski. It is part of the 'easystats' ecosystem and includes methods for partial, Bayesian, multilevel, polychoric, biweight, and distance correlations. The package is referenced in a 2020 paper published in the Journal of Open Source Software.
Bayesian Additive Regression Trees (BART) provide flexible nonparametric modeling of covariates for continuous, binary, categorical and time-to-event outcomes. The method is described in a paper authored by Robert McCulloch and others. Implementation details and data are sourced from the paperswithcode platform.
mcmcse is a software package developed by James M. Flegal for computing Monte Carlo standard errors (MCSE) in Markov chain Monte Carlo (MCMC) settings. The package supports MCSE computation for expectation and quantile estimators, as well as for multivariate estimation. It also provides functions for computing effective sample size and for plotting Monte Carlo estimates against sample size.
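As an illustration of what a Monte Carlo standard error measures, the sketch below estimates the MCSE of a chain's mean with the non-overlapping batch means method. It is a minimal plain-Python sketch, not the mcmcse package's code or API; the function name `batch_means_mcse` and the default of roughly sqrt(n) batches are assumptions made for this example.

```python
import math


def batch_means_mcse(chain, n_batches=None):
    """Estimate the mean of a chain and its Monte Carlo standard error
    using non-overlapping batch means (hypothetical helper, for illustration)."""
    n = len(chain)
    if n_batches is None:
        # Assumed default: about sqrt(n) batches of about sqrt(n) draws each.
        n_batches = int(math.sqrt(n))
    if n_batches < 2:
        raise ValueError("need at least two batches")
    batch_size = n // n_batches
    if batch_size < 1:
        raise ValueError("chain is too short for the requested number of batches")

    used = batch_size * n_batches  # trailing draws that do not fill a batch are dropped
    means = [
        sum(chain[i * batch_size:(i + 1) * batch_size]) / batch_size
        for i in range(n_batches)
    ]
    overall = sum(chain[:used]) / used

    # batch_size * (sample variance of the batch means) estimates the
    # asymptotic variance in the Markov chain central limit theorem.
    var_hat = batch_size * sum((m - overall) ** 2 for m in means) / (n_batches - 1)
    return overall, math.sqrt(var_hat / used)
```

For a positively autocorrelated chain, this estimate is typically larger than the naive standard error that treats the draws as independent.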
Hierarchical Modelling of Species Communities (HMSC) is a model-based approach for analyzing community ecological data. The package implements this approach within a Bayesian framework using Gibbs Markov chain Monte Carlo (MCMC) sampling, as described by Gleb Tikhonov et al. in a 2020 publication. The dataset likely contains tabular data representing species communities and associated environmental variables for statistical modeling.
A software package for performing network meta-analyses within a Bayesian framework using JAGS. It includes methods for assessing heterogeneity and inconsistency, and provides standard visualizations for results. The package was developed by Gert van Valkenhoef and Joel Kuiper, with key publications in 2012 and 2015.
Companion software for the book 'Model-based Geostatistics' by Diggle and Ribeiro (2007). It supports geostatistical analysis using variogram-based, likelihood-based, and Bayesian methods. Its author is Paulo Justiniano Ribeiro Jr.
Denver, Colorado neighborhood boundaries published by data.colorado.gov. The dataset includes polygon geometry and attributes for neighborhood names, IDs, and typology classifications. It was last updated on 2026-02-19 00:41:14.
This dataset from Kaggle supports customer churn prediction and purchase probability estimation. Its author, organization, size, and last update date are unspecified.
Christian Kroer's dataset contains all generated notifications for a subset of Instagram users across four notification types within a specific time window. It was collected during an A/B test comparing first-price and second-price auction systems for notification optimization.
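The two pricing rules compared in that A/B test differ only in what the winner pays. The sketch below shows the textbook single-slot, sealed-bid versions. It is an assumption-laden illustration: the source does not describe how the notification auction was actually run, and the function name `run_auction` and the bidder labels are hypothetical.

```python
def run_auction(bids, pricing="second"):
    """Pick the highest bidder and compute the price under a textbook
    first-price or second-price rule (hypothetical helper, for illustration)."""
    if not bids:
        raise ValueError("at least one bid is required")
    # Rank bidders from highest to lowest bid.
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner, top_bid = ranked[0]

    if pricing == "first":
        price = top_bid  # the winner pays their own bid
    elif pricing == "second":
        # The winner pays the runner-up's bid (0.0 if there is no runner-up).
        price = ranked[1][1] if len(ranked) > 1 else 0.0
    else:
        raise ValueError("pricing must be 'first' or 'second'")
    return winner, price


# Hypothetical notification types bidding for one delivery slot.
example_bids = {"type_a": 0.8, "type_b": 0.5, "type_c": 0.3}
first_price_result = run_auction(example_bids, pricing="first")
second_price_result = run_auction(example_bids, pricing="second")
```

With the same bids, both rules select the same winner; only the price changes.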
Greater London boundary files from the Greater London Authority include Output Areas (OA), Lower/Middle Super Output Areas (LSOA/MSOA), Wards, and Boroughs for 2004, 2011, 2014, and 2018. The data is provided in ESRI and MapInfo formats, with OA to MSOA boundaries generalized to reduce file size. Copyright statements for National Statistics and Ordnance Survey data are required for maps created using these boundaries.
Kilem L. Gwet's irrCAC provides calculations for various chance-corrected agreement coefficients used in inter-rater reliability studies. The package includes Cohen's kappa, Fleiss' kappa, Krippendorff's alpha, and other coefficients, as detailed in Gwet's 2014 handbook. It supports analyses for two or more raters and offers multiple weighting schemes for weighted analyses.
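To make "chance-corrected" concrete, the sketch below computes unweighted Cohen's kappa for two raters from its standard definition: observed agreement minus expected chance agreement, divided by one minus expected chance agreement. This is plain Python written for illustration, not the irrCAC API; the function name `cohens_kappa` is an assumption.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters who rated the same items
    (hypothetical helper, for illustration)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(ratings_a)

    # Observed proportion of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Agreement expected by chance, from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (p_observed - p_expected) / (1.0 - p_expected)
```

For example, with ratings `["a", "a", "b", "b"]` and `["a", "b", "b", "b"]`, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5.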
AGD-R is a set of R programs for statistical analysis of genetic experiments, created by Francisco Rodríguez. It performs calculations for Diallel, Line by Tester, and North Carolina mating designs to assess genetic and environmental contributions to quantitative traits. The software includes a graphical Java interface for user input and analysis selection.
A statistical software package for analyzing genotype-by-environment interactions in agricultural trials. The tool implements methods like AMMI, SREG, PLS, and factorial regression to identify superior genotypes for specific environments. It was authored by Ángela Pacheco and is hosted on the paperswithcode platform.
A dataset titled 'Esperanto Arithmetic Cot' was published on the Hugging Face platform by author jensjepsen. The dataset was last updated on 2026-04-10. Its specific content and structure are not detailed in the provided metadata.
Bundesamt für Kartographie und Geodäsie provides a heavy rain hazard map representing an event statistically expected once every 100 years. The map marks areas with the expected water level in centimeters for a rain duration of 60 minutes. The dataset was last updated on 2026-02-17.
A discrete, large-scale Bayesian network from the bnlearn repository with 76 nodes, 112 arcs, and 574 parameters. Its structure has an average Markov blanket size of 5.92 and a maximum in-degree of 7. It is hosted on OpenML and is in the public domain in the United States.
A sample from the Win95pts Bayesian Network, a discrete probabilistic model with 76 nodes and 112 arcs. The network contains 574 parameters and is referenced in the bnlearn Bayesian Network Repository. Its license is listed as us-pd.
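The structural summaries quoted for this network (arcs, parameters, average Markov blanket size, maximum in-degree) can be computed from any DAG given as a map from each node to its parents. The sketch below does so on a small toy DAG, not on Win95pts itself; the graph, the binary node levels, and the function names are assumptions made for illustration.

```python
from math import prod


def markov_blanket(parents, node):
    """Parents, children, and the children's other parents of a node."""
    children = {c for c, ps in parents.items() if node in ps}
    spouses = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | spouses) - {node}


def dag_summary(parents, levels):
    """Structural statistics for a discrete Bayesian network
    (hypothetical helper, for illustration)."""
    n_arcs = sum(len(ps) for ps in parents.values())
    max_in_degree = max(len(ps) for ps in parents.values())
    avg_blanket = sum(len(markov_blanket(parents, v)) for v in parents) / len(parents)
    # Free parameters per node: (levels - 1) for each configuration of its parents.
    n_params = sum(
        (levels[v] - 1) * prod(levels[p] for p in ps)
        for v, ps in parents.items()
    )
    return {
        "nodes": len(parents),
        "arcs": n_arcs,
        "parameters": n_params,
        "avg_markov_blanket": avg_blanket,
        "max_in_degree": max_in_degree,
    }


# Toy DAG: A -> C <- B, C -> D, with all nodes binary (assumed for this example).
toy_parents = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
toy_levels = {"A": 2, "B": 2, "C": 2, "D": 2}
toy_summary = dag_summary(toy_parents, toy_levels)
```

On the toy graph, this gives 3 arcs, 8 free parameters, an average Markov blanket size of 2.0, and a maximum in-degree of 2.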