Mathematical datasets, statistical benchmarks, probability, optimization, operations research
2,486 datasets
Ads CTR Optimization is a dataset hosted on Kaggle, focused on click-through rate prediction in digital advertising. The dataset likely contains features for modeling user engagement with ads, though specific column details and volume are unconfirmed. Its source and last update information are not provided in the available metadata.
Mathematical problems and solutions sourced from the AoPS forums, originally released in the OpenMathReasoning dataset. The dataset is formatted for use in NeMo Gym and includes only problems for which an answer was extracted. It was authored by NVIDIA and last updated on the Hugging Face platform in January 2026.
tsne is an R package implementing the t-Distributed Stochastic Neighbor Embedding algorithm. The dataset likely contains code or examples for performing dimensionality reduction. It was authored by Justin Donaldson.
ISLR is the companion dataset for the textbook 'An Introduction to Statistical Learning with Applications in R'. The data was compiled by authors Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani to illustrate statistical and machine learning concepts. The specific size, format, and column details are not provided in the metadata.
A dataset associated with a method for applying Bayesian regularization to feed-forward neural networks. The dataset was authored by Paulino Perez Rodriguez and is hosted on Papers with Code. The specific content, size, and structure of the data are not detailed in the available metadata.
A demonstration dataset that likely contains exact and numerical solutions for ordinary differential equations. The data appears to be generated from a simulation tool where users can vary initial values, step counts, and numerical methods. The source is the paperswithcode platform, but the original author and update date are unknown.
Statistics Experiment 1 is a dataset authored by Wouter Willaert and published on PapersWithCode. The dataset likely contains statistical analysis results from an experiment. Metadata is minimal; the specific content, scale, and structure require verification after download.
Douglas M. Bates authored the minqa package, which implements derivative-free optimization algorithms based on quadratic approximation. The dataset likely contains algorithmic details, performance metrics, or test results related to these optimization methods. It is sourced from the paperswithcode platform, which aggregates research code and related resources.
simsem provides a framework for generating simulated data for Monte Carlo studies in structural equation modeling. The dataset is authored by Sunthud Pornprasertmanit and is hosted on the paperswithcode platform. Its specific size, row count, and last update date are not provided.
A dataset from the paperswithcode platform related to chemometrics, a field combining chemistry and statistics. It is associated with author Peter Filzmoser. The dataset's specific size, format, and variables are not detailed in the provided metadata.
VICIdial Optimization Results data contains performance metrics from a real-world VICIdial/Asterisk telephony system. The dataset's specific volume, creator, and update date are not provided in the input information.
A collection of verified computational data for eigenvalues, eigenvectors, and the approximation of the Gauss F_n function. It was authored by Isaia Nisoli and is hosted on Harvard Dataverse.
CO-Bench is a benchmark suite featuring 36 real-world Combinatorial Optimization problems drawn from a broad range of domains and complexity levels. The dataset contains the data for the paper 'CO-Bench: Benchmarking Language Model Agents in Algorithm Search for Combinatorial Optimization'. The dataset page was last updated on 2026-01-12.
Code to replicate the numerical results for the article 'Rational Inattention during an RCT'. It was authored by Bartosz Mackowiak and Mirko Wiederholt to generate Figure 1 from the paper.
NVIDIA released Nemotron-Math-Proofs-v1 in early 2025, providing a large-scale collection of approximately 580,000 natural language proof problems. The dataset includes 550,000 formalizations into Lean 4 theorem statements and 900,000 model-generated reasoning traces for mathematical distillation.
A Data Management and Sharing Plan outlining the strategy for managing and sharing scientific data generated for research on statistical issues in AIDS. The plan was authored by Danyu Lin and was last updated in February 2026.
Bavaria's lithogeochemical map at 1:25,000 scale delineates 184 geochemical rock units derived from geological formations. It provides statistical parameters like the 50th and 90th percentile for element concentrations measured by XRF and ICP-MS. The data represents the current state of knowledge from the Bavarian State Office for the Environment.
Weaving Patterns 7 is a dataset of weaving patterns, which are n×(n−1) matrices with entries in {1, 2, …, n}. These patterns were introduced to study the number of reduced decompositions of the longest permutation up to commutation equivalence. The dataset is hosted by ACDRepo on Hugging Face and was last updated on 2026-01-15.
Telephony data from a VICIdial or Asterisk call center system, focused on the initial two-week operational period. The dataset's origin, size, and specific update date are unknown. It appears to be raw data exported from the platform for analysis.