DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Machine Learning Datasets | DataSalon

Machine Learning

General ML benchmarks, tabular data, AutoML, recommendation systems, anomaly detection, evaluation suites

193,236 datasets

CORONA Satellite Image of Afon Tawe to Talgarth, Wales

A panchromatic CORONA satellite image covering an area from Afon (River) Tawe to Talgarth in Wales. The ground resolution is approximately 2.75 metres, and the dataset includes four processed image segments. The positional accuracy is noted as unreliable and may require re-georeferencing.

Image, Geospatial, Landscape analysis, Panchromatic, Satellite Imagery, Computer Vision, Geography, +1

Gypsy and Traveller Site Boundaries in Lambeth

This dataset shows the boundaries of sites designated as Gypsy and Traveller sites on the 2015 Lambeth planning proposals map for the London borough of Lambeth. The data likely contains polygon features representing these designated areas for planning purposes. It is available in multiple geospatial formats, including KML, GeoJSON, and ESRI REST services.

Geospatial, ZIP, CSV, Land Use Planning, Geospatial Boundaries, Lambeth, Gypsy Traveller Sites, +1

District Centre Boundaries for London's Lambeth Borough

Lambeth borough contains two major centres, Brixton and Streatham, whose official boundaries are defined in this dataset. The data originates from the London Plan and Lambeth's Core Strategy, providing a geospatial layer for urban planning. It is available on multiple open data platforms, indicating its established use for public policy.

Geospatial, ZIP, CSV, London, Administrative Boundaries, Lambeth, Urban Planning, +1

Seagrass Coverage Changes in Cockburn Sound, Western Australia, 1967-1999

Changes in seagrass coverage in Cockburn Sound, Western Australia, were assessed from 1967 to 1999. The dataset was created by analyzing aerial photographs using modern mapping methods to determine the magnitude of change in hectares. It is hosted by the Australian Ocean Data Network and was last updated on 2026-07-22.

Audio, Geospatial, ZIP, Seagrass Coverage, Environmental Change, Aerial Photography, Coastal Ecology, +1

OSNI Open Data: 1:1,000,000 Raster Map of Northern Ireland

OSNI Open Data provides a 1:1,000,000 scale raster map of Northern Ireland with place names. This static image product is published for OpenData and is suitable for use as background mapping. The dataset was last updated on 2026-07-08.

Image, Geospatial, JSON, Northern Ireland, Raster Maps, Computer Vision, Large Scale, Background Mapping, +1

OSNI Open Data: 1:1,000,000 Raster Map of Northern Ireland with Place Names

Northern Ireland is covered by a 1:1,000,000 scale raster map published by OpenDataNI. The dataset is a static image suitable for background mapping, providing an overview of the region. It was last updated on 2026-07-08 and is available under the OGL-UK-3.0 license.

Geospatial, JSON, Place Names, Raster Map, Northern Ireland, Computer Vision, Large Scale, +1

OSNI Open Data: 1:1,000,000 Raster Map of Northern Ireland

OSNI Open Data provides a 1:1,000,000 scale raster map of Northern Ireland with place names, published by OpenDataNI. This raster product is the smallest scale from OSNI, offering a broad overview of the region. The dataset was last updated on 2026-07-08 and is available under the OGL-UK-3.0 license.

Image, Geospatial, JSON, Place Names, Northern Ireland, Raster Maps, Computer Vision, Large Scale, Geospatial Data, +1

MOPITT Satellite Daily Gridded Carbon Monoxide Beta Data

MOP03N_109 provides daily mean-gridded carbon monoxide (CO) profile and total column retrievals from near-infrared radiances measured by the MOPITT instrument aboard NASA's Terra satellite. This is a non-validated beta product subject to recalibration, containing gridded averaging kernels alongside the retrievals. Data collection is ongoing, originating from an instrument launched in 1999 and funded by the Canadian Space Agency.

Time Series, Geospatial, ATMOSPHERIC CHEMISTRY, Carbon Monoxide, Air Quality, Satellite Data, Earth Observation, +1

Brisbane City Council Park Locations and Acquisition Details

Brisbane City Council maintains more than 2180 parks across the city, ranging from small pocket parks to large district parks and bushland reserves. This dataset identifies park locations and notes they are acquired through processes like resumptions, direct purchase, or as donated assets. The data is provided by Brisbane City Council on its open data platform.

Geospatial, Assets, Open Space, Recreation, Urban Planning, Park Number, Green Space, Environment, Parks, Reserve, +1

Information Requests to Land & Property Services: 2020/2021 Statistics

This dataset provides 2020/2021 statistics on information requests received by Land & Property Services (LPS). It likely contains counts and types of requests, including Freedom of Information (FOI), Environmental Information Regulations (EIR), and Subject Access Requests (SAR). It is published under an open government license, which supports its use for public accountability and transparency analysis.

Tabular, CSV, Environmental Regulations, Freedom Of Information, Government Services, Information Requests, Land Property, Public Sector, Subject Access, Environmental Regulations Request, Subject Access Request, +1

Benchmark of 80 Machine Learning Models for Cardiovascular Risk Prediction

Eighty machine learning models were benchmarked for cardiovascular risk prediction using the Kaggle Cardiovascular Disease dataset. The results, ordered by AUC-ROC, compare four base architectures across 20 distinct methodological experiments. The dataset was published by 'Predicción Cardiovascular' on figshare in June 2026.

Tabular, Time Series, CSV, Machine Learning Benchmark, Model Performance, Benchmark, Healthcare, Cardiovascular Risk, Healthcare Ai, +1

IND221: Australian Labour Supply and Automation Initiatives

IND221 tracks the presence of government programs aimed at mitigating labour shortages. The metric indicates whether initiatives exist to attract workers, improve workforce capacity, or introduce automation technologies. This dataset is provided by the Australian Ocean Data Network and was last updated in July 2026.

Tabular, Workforce Policy, Government Metrics, Automation Initiatives, Labour Supply, +1

Freshwater Atlas Rivers: Double-Line Polygons for British Columbia

This dataset contains river polygons representing double-line features for the province of British Columbia. It is published by the Government of British Columbia on the open_canada platform under an OGL-CA-2.0 license. It was last updated on 2026-07-22 18:33:30.944393.

Geospatial, 🇨🇦 Canada, Freshwater, Hydrology, River Polygons, +1

Freshwater Atlas Obstructions for British Columbia

This dataset contains geospatial data on water obstacles such as rapids and falls. It is published by the Government of British Columbia on the open_canada platform under an OGL-CA-2.0 license. It was last updated on 2026-07-22.

Geospatial, 🇨🇦 Canada, Freshwater, Hydrology, Obstructions, +1

Freshwater Atlas Wetlands: Polygons for British Columbia

This dataset contains all wetland polygons for the province of British Columbia, Canada. It is published by the Government of British Columbia on the open_canada platform. It was last updated on 2026-07-22 18:33:02.479726.

Geospatial, 🇨🇦 Canada, Freshwater, Environmental science, Wetlands, +1

Freshwater Atlas Named Watersheds of British Columbia

Freshwater Atlas - Named Watersheds is a geospatial dataset from the Government of British Columbia. It contains polygon data for all named watersheds in the province. The dataset was last updated on 2026-07-22 and is published under the OGL-CA-2.0 license.

Geospatial, 🇨🇦 Canada, Freshwater, Hydrology, Watersheds, +1

Freshwater Atlas Watershed Type Code Lookup Table

This dataset is a lookup table for watershed type codes, published by the Government of British Columbia. It is available in HTML and PDF formats under the OGL-CA-2.0 license. Its last update was recorded on 2026-07-22 18:32:02.589651.

Tabular, Freshwater, Canada Environment, Watersheds, Geospatial Codes, +1

Freshwater Atlas Lakes of British Columbia

British Columbia's Freshwater Atlas provides polygon data for all lakes in the province. The dataset is published by the Government of British Columbia on the open_canada platform. It was last updated on 2026-07-22.

Geospatial, 🇨🇦 Canada, Freshwater, Lakes, Environmental, +1

Chrysoprase Gemstone Colorimetric Data for Machine Learning Color Grading

CIE L*a*b* color values were measured for 51 chrysoprase gemstone samples and 676 synthetic green reference points using an X-Rite SP62 spectrophotometer. Author Yuansheng Jiang published this dataset on figshare under a CC-BY-4.0 license; it was last updated on 2026-05-15. The data was used to train and validate machine learning models, including logistic regression and neural networks, for objective gemstone color grading.

Tabular, Excel, Machine Learning, Colorimetry, Benchmark, Gemstone Color, Materials Science, Synthetic, +1

WikiArtPlus: Multi-Relational Art Dataset with Wikipedia Explanations

WikiArt+ is a multi-relational multimodal benchmark for art understanding, extending the original WikiArt dataset. It contains 22,028 images and 29,603 texts connected through 308,050 typed semantic edges. The dataset was curated by Antonio Purificato and is hosted on Hugging Face.

Image, Text, Tabular, Graph, Multimodal, Wikiart, Art Understanding, Benchmark, Natural Language Processing, Semantic Graph, Art, Cultural Heritage, Multimodal Benchmark, +1
