DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Mathematics & Statistics Datasets | DataSalon

Mathematics & Statistics

Mathematical datasets, statistical benchmarks, probability, optimization, operations research

2,489 datasets

New York State Personal Income Tax Statistics by Residence

Annual statistical information from timely filed New York State personal income tax returns, beginning with tax year 1999. The dataset provides major tax structure components like income, deductions, and tax liability, broken down by size of income and filer's permanent place of residence. It is produced by the New York State Department of Taxation and Finance.

Tags: Tabular, Time Series, CSV, XML, JSON, Income, County, Tf Personal Income Tax, Income Distribution, New York State, Tax, Taxation, Liability, Public Finance, +1 more

Compiler Autotuning Data via Bayesian Networks

COBAYN is a framework for automatically tuning compiler optimization parameters using Bayesian networks. The project was created by amirjamez and last updated in May 2022. It was developed as part of the European Union's Antarex project.

Tags: Tabular, Machine Learning, Bayesian Networks, Autotuning, Automatic Tuning, Compiler Optimizations, Performance Tuning, EU Project, Antarex, Compiler Optimization, Compilers, +1 more

Nièvre Watercourses Subject to French Environmental Code Classification

A geospatial layer of watercourses in the Nièvre department (58) of France that are subject to the Water Act, offered through a WFS direct download service by the Bureau de Recherches Géologiques et Minières (BRGM). It classifies the watercourses based on legal criteria from Articles L.214-1 to L.214-6 of the French Environmental Code and case law. The dataset was last updated on February 9, 2022.

Tags: Geospatial, 🇫🇷 France, Environmental Law, Hydrology, Watercourses, Synthetic, +1 more

Brazil Coastal Land Use Change Data for Spatial Planning Evaluation, 2005-2015

This dataset accompanies a 2021 scientific paper that evaluated the conformance of spatial planning goals with outcomes for urban compactness, services, and nature conservation in São Paulo State, Brazil. It includes rasterized land use and cover variables derived from Landsat 5 and Landsat 8 satellite imagery classified for 2005 and 2015. The study used Partial Least Squares Path Modelling to analyze the relationship between 2005-2006 planning strategies and land-use change ten years later.

Tags: Geospatial, Spatial Analysis, Land Use Change, Satellite Imagery, Computer Vision, Urban Planning, Finance, Spatial Planning, Urban Compactness, +1 more

Quarterly and Annual Children's Services Statistics from Ukraine

Quarterly and Annual Analytical and Statistical Reporting of Children's Services is a dataset from the States site of Ukraine. The dataset was last updated on February 3, 2022. It likely contains tabular data on children's services, available in XML, CSV, Word (DOC), and Excel (XLS) formats.

Tags: Tabular, CSV, XML, Annual Data, Ukraine, Statistical Reporting, Quarterly Data, Children Services, +1 more

Handwritten Mathematical Expressions for LaTeX Conversion

A collection of handwritten mathematical expression images paired with LaTeX code, created by Azu and hosted on Hugging Face. The dataset contains between 10,000 and 100,000 samples, as indicated by its size category, and was last updated in March 2022. It is designed for training models to convert visual mathematical notation into structured text markup.

Tags: Image, Multimodal, IMAGEFOLDER, Size Categories: 10K<n<100K, Modality: text, Library: mlcroissant, Modality: image, Symbol Recognition, Library: datasets, Optical Character Recognition, Handwritten Math, Region: us, LaTeX Conversion, +1 more

Snowpack Heat and Mass Transport Simulation Data via FEniCS

A dataset containing code and output data for simulating one-dimensional heat and mass transport in snow using the FEniCS finite element library. The data was created by ENVIDAT to reproduce a key figure from a 2022 publication in The Cryosphere journal. It includes the solver code and resulting simulation data.

Tags: Tabular, Time Series, Heat Transfer, Mass Transfer, Cryosphere, Snow Physics, Finite Element Modeling, +1 more

Ising2D

This dataset is titled 'Ising2D' and was authored by yonesuke. It was last updated on January 18, 2022. The dataset's size, row count, column structure, and specific content are unknown.

Tags: Text, Size Categories: n<1K, Modality: text, Library: mlcroissant, Library: datasets, Region: us, +1 more

Ukraine Ministry of Digital Information Requests and Reports for 2020

Statistical information on the receipt and handling of public information requests submitted to the Ministry of Digital Information of Ukraine. The dataset covers the year 2020 and was published on the EU Open Data portal in November 2021. It likely contains metrics on request volumes, processing outcomes, and satisfaction rates.

Tags: Tabular, Government Transparency, Information Requests, Ukraine, Statistics, +1 more

Document Summarization Dataset from AI Hub

Text data for document summarization tasks, sourced from AI Hub. The dataset's size category is between 10,000 and 100,000 entries, and it was last updated in December 2021.

Tags: Size Categories: 10K<n<100K, Modality: text, Library: mlcroissant, Library: datasets, Region: us, +1 more

Ukrainian Preschool, Secondary, and Vocational Schools List with Statistics

Published on September 20, 2021, this dataset lists educational institutions in Ukraine, including preschools, secondary schools, and vocational schools. It is provided by the States site of Ukraine and likely contains statistical information about these institutions. The data is available in tabular formats such as Excel and CSV.

Tags: Tabular, CSV, Ukraine, Education Institutions, Statistics, School Directory, +1 more

U.S. State and Equivalent Geographic Boundaries for 2017

2017 geographic boundaries for U.S. states and equivalent entities, extracted from the Census Bureau's MAF/TIGER Database. The dataset includes the fifty states, the District of Columbia, Puerto Rico, and U.S. Island Areas, designed as standalone shapefiles that can be combined for national coverage.

Tags: Polygon, United States, Nation, +1 more

Urban Green Areas in Sweden from Satellite Data, 2010 and 2015

Geospatial data on green areas within Swedish urban agglomerations, produced by Statistics Sweden. The dataset defines green areas as contiguous, publicly available spaces of at least 0.5 hectares, excluding arable land but including pasture. It covers two reference years, 2010 for the 37 largest agglomerations and 2015 for all agglomerations, with data last updated in December 2020.

Tags: Geospatial, Statistics Sweden, Urban Green Spaces, Satellite Data, Land Use, +1 more

Snow Saltation Simulation Data for Grain Size and Cohesion Effects

Simulation results and code support research on modeling snow saltation, specifically the effects of grain size and interparticle cohesion. The data originates from a study published in the Journal of Geophysical Research: Atmospheres and was contributed by ENVIDAT. The supporting code uses a Large Eddy Simulation flow solver coupled with a Lagrangian Stochastic Model.

Tags: Tabular, Large Eddy Simulation, Snow Saltation, Geophysical Modeling, Lagrangian Stochastic Model, +1 more

Ionospheric Electron Content at Jang Bogo Station in 2020

Records of total electron content in the ionosphere over the Jang Bogo Station in Antarctica for 2020. The dataset supports the study of statistical characteristics of the ionosphere at southern high latitudes. It was created by AMD_KOPRI and published via NASA EarthData in October 2020.

Tags: Time Series, Geospatial, Antarctica, Space Weather, Geophysics, Total Electron Content, Ionosphere, +1 more

Gestational Diabetes Trial Data from Morocco

A 2020 cluster randomized controlled trial in two Moroccan districts analyzed data from 210 recruited pregnant women to evaluate a primary care intervention for gestational diabetes. The study assessed outcomes including birthweight, maternal weight gain, glucose control, and pregnancy complications. It was conducted by Bettina Utz and registered under NCT02979756.

Tags: Maternal Health, Gestational Diabetes, +1 more

Conifer Molecular Phylogeny Analysis with Mass-Extinction Detection

A Bayesian method for detecting mass-extinction impacts on molecular phylogenies, developed by Michael R. May and published in 2020. The CoMET model analyzes lineage diversification rates using a compound Poisson process to distinguish background rate variation from extinction events. An empirical application identified a major mass-extinction event in conifers approximately 23 million years ago.

Tags: 28 Mass Extinction, Lineage Diversification Rates, Compound Poisson Process Model, Birth-Death Stochastic-Branching Process, +1 more

Genetic Paternity Analysis of 20 Spectacled Caiman Nests

Microsatellite genotype data from 15 female and 174 hatchling spectacled caimans (Caiman crocodilus) across 20 nests. The dataset was used to investigate mating systems, demonstrating a 95% frequency of multiple paternity over four reproductive seasons from 2007 to 2010 in the Brazilian Amazon.

Tags: System Of Mating, Multi Year Multiple Paternity Analysis, Reproductive strategies and kinship analysis, Purus river, Conservation genetics and biodiversity, Caiman crocodilus, Holocene, +1 more

Wheat Allele Introgression into Barbed Goatgrass Populations in Southern Spain

458 specimens of the wild wheat relative Aegilops triuncialis from 31 populations in a 60 km x 20 km area in Southern Spain were genotyped to estimate wheat admixture levels. The data, generated by Mila Pajkovic and published in 2020, includes results from Approximate Bayesian Computation modeling to estimate selfing rates and the magnitude and tempo of wheat allele introgression.

Tags: Crop to wild gene flow, Transgene Escape, Containment, Autogamy, +1 more

Body Motion Dyadic Modes in Scientific Conversations

Motion data from 12 dyads of scientists recorded during face-to-face conversations using depth-sensing cameras. It was created to analyze dyadic modes and motion motifs, such as synchronized parallel torso motion and still segments. The data supports the study of individuality in motion modes, which was maintained for at least 6 months in a subset of 5 dyads.

Tags: Joint Improvisation, Synchrony Motion, Nonverbal communication, Coordination, Joint Action, +1 more
