DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Mathematics & Statistics Datasets | DataSalon

Mathematics & Statistics

Mathematical datasets, statistical benchmarks, probability, optimization, operations research

2,485 datasets

MESS: A Collection of Statistical Functions for R

MESS is a mixed collection of statistical functions for R, some of which are referenced in Claus Thorn Ekstrøm's book, The R Primer. The collection includes useful and semi-useful scripts for statistical analysis. It is authored by Claus Thorn Ekstrøm and hosted on the Papers with Code platform.

Tabular, History, Programming Language, Scripting Language, Computer Science, Script Collection, R Programming, Statistics, Art, +1

Mixture and Flexible Discriminant Analysis from Elements of Statistical Learning

Mixture and flexible discriminant analysis data, likely used to illustrate methods from the seminal textbook 'Elements of Statistical Learning'. The dataset is associated with authors Trevor Hastie, Robert Tibshirani, and Jerome Friedman. It is referenced in the context of multivariate adaptive regression splines (MARS), BRUTO, and vector-response smoothing splines.

Tabular, Discriminant, Computer Science, Mathematics, Pattern Recognition, Pattern Recognition Psychology, Statistical Learning, Discriminant Analysis, Artificial Intelligence, Multivariate Analysis, Linear discriminant analysis, +1

Shinystan: Interactive Diagnostics and Analysis for Bayesian Model Output

Shinystan provides a graphical user interface for Markov chain Monte Carlo diagnostics and posterior sample analysis. The tool, created by Jonah Gabry, is powered by the Shiny framework and works with output from MCMC programs in any language, with extended support for Stan models via rstan and rstanarm packages.

Tabular, Visual Analytics, Bayesian Probability, Bayesian Statistics, Computer Science, Posterior Probability, Posterior Analysis, Interactive Visual Analysis, Visualization, Artificial Intelligence, Mcmc Diagnostics, +1

pwrss: Statistical Power and Sample Size Calculation Tools for Hypothesis Tests

The 'pwrss' R package provides functions for statistical power and minimum required sample size calculations. It is authored by Metin Buluş and is designed for a wide range of commonly used hypothesis tests in psychological, biomedical, and social sciences. The dataset's size, row count, and last update date are unknown.

Tabular, Hypothesis Testing, Statistical Power, Sample Material, Econometrics, Computer Science, Data Science, Mathematics, Psychology, Biomedical Sciences, Chemistry, Power Physics, Physics, Statistics, Chromatography, Sample Size, Sample Size Determination, +1

ISLR2: Datasets for Statistical Learning Textbook Examples

ISLR2 is a collection of datasets used in the textbook 'An Introduction to Statistical Learning with Applications in R, Second Edition'. The collection includes datasets from the first edition, some with minor changes, and some new datasets. The data was compiled by author Gareth James for educational purposes.

Tabular, Machine Learning, R Language, Computer Science, Psychology, Education, Statistical Learning, Artificial Intelligence, Mathematics Education, +1

easystats: R Framework for Statistical Modeling, Visualization, and Reporting

Daniel Lüdecke's easystats is a meta-package providing a unifying framework for statistical analysis in R. It bundles multiple packages to offer consistent modeling, visualization, and reporting workflows. The collection includes teaching articles for instructors and a dashboard for new users to access summaries and visualizations with minimal programming.

Tabular, Computer Science, Data Visualization, Data Science, Visualization, R Packages, Statistical Modeling, Teaching Resources, Data Mining, +1

Patchwork: Plot Composition Operators for R

The 'patchwork' package by Thomas Lin Pedersen extends the 'ggplot2' API to compose multiple plots. It provides mathematical operators for combining plots, addressing a need also targeted by packages like 'gridExtra' and 'cowplot'. The dataset likely contains examples or metadata related to this plot composition functionality.

Tabular, R Package, Data Visualization, Art, Ggplot2, Plot Composition, +1

Explanatory Combinatorial Dictionary: A Formalized Semantically-Based Lexicon

Explanatory Combinatorial Dictionary is a formalized, semantically-based lexicon designed as part of a linguistic model of natural language. The paper describes its main properties, the structure of a lexical entry, groupings of entries, and principles for compilation, illustrated with a series of entries for an English ECD. The platform indicates it is related to lexicography, combinatorics, and natural language processing.

Text, Lexicography, Lexicographical Order, Combinatorics, Epistemology, Computer Science, Mathematics, Meaning Existential, Artificial Intelligence, Philosophy, Lexicon, Natural Language Processing, Character Mathematics, Linguistics, Rule Based System, +1

DRP For Perovskite Solar Cells: A Simulation-Based Dataset for ML Optimization

A simulation-based dataset for machine learning-driven optimization of perovskite solar cells. The dataset is described as large-scale, suggesting it contains a substantial number of simulated material or device configurations. It was sourced from Kaggle, but specific authorship, creation date, and exact size are not provided.

Tabular, Simulation Data, Perovskite Solar Cells, Machine Learning Optimization, Large Scale, Materials Science, +1

U.S. Metropolitan Area County Classifications for FY 2009

Metropolitan Area Look-Up is a system from the U.S. Department of Housing and Urban Development that allows users to determine if a selected county is part of an OMB-defined Core Based Statistical Area. It provides a mapping of state and county combinations to their FY 2009 CBSA status.

Metropolitan Area County, Cbsa, Geography Assistant, Core Based Statistical Area, +1

Latin American Economic History in the 1930s: Country Case Studies

A collection of academic papers analyzing the economic impact of the 1930s Great Depression across Latin America. The work, edited by Rosemary Thorp, includes case studies on Argentina, Brazil, Chile, Colombia, Mexico, Peru, and Central America. The analysis covers topics such as the shift from export-led to import-substituting economies, the role of state policy, and international economic pressures.

Text, History, Index Typography, Political Economy, Depression Economics, Capital Architecture, Industrialisation, Archaeology, State Computer Science, Law, Latin America, Great Depression, Economics, Latin Americans, Economy, Geography, Finance, Economic History, Keynesian Economics, Political Science, World Economy, Restructuring, +1

Student Performance Statistics for Educational Analysis

Statistical data on student performance, likely containing metrics related to academic outcomes. The dataset is hosted on Kaggle, a platform for data science projects. Details regarding its specific source, size, and creation date are not provided in the available metadata.

Tabular, Education Statistics, Student Performance, Academic Analysis, +1

Arsenal: An R Package for Large-Scale Statistical Summary Tables

An R package by Ethan Heinzen providing functions for generating large-scale statistical summaries. The toolkit includes functions for creating Table-1-like summaries, frequency tables, model summaries, and data frame comparisons, designed to integrate with R and RStudio reporting tools. Its primary functions are tableby(), paired(), modelsum(), freqlist(), comparedf(), and write2().

Tabular, Cartography, R Package, Computer Science, Data Science, Research Tools, Scale Ratio, Geography, Large Scale, Data Analysis, Statistical Summaries, +1

asbio: Statistical Tools for Biologists

A collection of statistical tools for biologists, authored by Ken Aho. The description mentions parameters for parent distributions including normal, t, exponential, and uniform. The dataset likely contains statistical functions or parameters for biological analysis.

Tabular, Computer Science, Mathematics, Biology, Statistics, +1

United States Harmonized Tariff Schedule for 2026

The 2026 Harmonized Tariff Schedule provides the official tariff rates and statistical categories for all merchandise imported into the United States, based on the international Harmonized System. It is maintained by the U.S. International Trade Commission and includes all revisions for the current year.

Product Classification, Hts, Tariff Schedule, +1

Vietnamese Elementary Math Problems with Images

A collection of elementary-level math problems presented in Vietnamese, likely containing both textual questions and illustrative images. The dataset is hosted on Kaggle, but details on its size, creation date, and authorship are currently unknown. Columns and specific content require verification after download.

Multimodal, Image Text Pairs, Elementary Math, Vietnamese Language, +1

bsts: Bayesian Structural Time Series for Regression

Bayesian Structural Time Series models for regression, fit using Markov Chain Monte Carlo methods. The methodology is described in the 2014 paper by Scott and Varian. The dataset's specific size, columns, and temporal coverage are not detailed in the provided metadata.

Time Series, Bayesian Probability, Bayesian Statistics, Computer Science, Geology, Mathematics, Structural Models, Regression, Artificial Intelligence, Series Stratigraphy, Statistics, Mcmc, +1

LaplacesDemon: Bayesian Inference Environment with Multiple Samplers

LaplacesDemon is a software environment for Bayesian inference created by Byron Hall. The description mentions it provides a variety of different statistical samplers for performing inference. The specific data format, size, and update frequency are not detailed in the provided metadata.

Tabular, Bayesian Inference, Inference, Bayesian Probability, Computer Science, Artificial Intelligence, Statistical Sampling, +1

astrochron: Computational Routines for Astrochronology and Paleoclimate Analysis

Routines for astrochronologic testing, astronomical time scale construction, and time series analysis, as described in a 2018 paper by Stephen Meyers. The tool also includes a range of statistical analysis and modeling routines relevant to time scale development and paleoclimate analysis.

Time Series, Computer Science, Time Series Analysis, Computational Geology, Astrochronology, Paleoclimate, +1

Values of the Links-Gould Polynomial: Supplement to arXiv:2509.16868

Matthew Harper published this mathematical dataset in 2026 to provide computed values of the Links-Gould polynomial. While the total record count is not specified, the data serves as a computational supplement to the research findings in arXiv:2509.16868 regarding quantum invariants in knot theory.

Mathematical Sciences, +1
