DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.


Speech & Audio Datasets | DataSalon


🎤

Speech & Audio

Speech recognition, text-to-speech, speaker identification, music classification, audio event detection

2,587 datasets


Ortho-rectified Aerial Mosaic of Coastal Maine from 2011

Maine's coastline from Cutts Island to Prouts Neck is covered by ortho-rectified mosaic tiles. The National Oceanic and Atmospheric Administration (NOAA) produced this data from imagery acquired June 5-7, 2011, using an Applanix Digital Sensor System (DSS). The final mosaic is derived from higher-resolution original aerial photographs.

Image, Geospatial, 🌎 North America, Aircraft, Aerial, NOAA, Maine, DOC/NOAA/NOS/NGS, National Geodetic Survey, Photograph, Infrared Wavelengths, Mosaic, Coastal Mapping, Aerial Imagery, Cameras, Coastal, National Ocean Service, Earth Science, NGS Imagery, Rectified Image, Infrared Imagery, Continent, Digital Orthophotography, Orthophoto (+1 more)


NOAA Ortho-rectified Aerial Mosaic of Cape Cod Ports from 2011

NOAA's Integrated Ocean and Coastal Mapping initiative produced this ortho-rectified image mosaic. The source aerial photographs were captured with an Applanix Digital Sensor System between June and September 2011. The final mosaic covers ports in the Cape Cod region of Massachusetts.

Image, Geospatial, 🌎 North America, Aircraft, Massachusetts, Aerial, NOAA, DOC/NOAA/NOS/NGS, National Geodetic Survey, Photograph, Infrared Wavelengths, Mosaic, Coastal Mapping, Aerial Imagery, Cameras, Coastal, National Ocean Service, Earth Science, NGS Imagery, Rectified Image, Infrared Imagery, Continent, Digital Orthophotography, Orthophoto (+1 more)


NOAA Ortho-Rectified Aerial Mosaic of Coastal Maine, 2011

Coastal Maine from Cutts Island to Prouts Neck is covered by ortho-rectified mosaic tiles from the NOAA Integrated Ocean and Coastal Mapping initiative. The source imagery was acquired on June 7, 2011, using an Applanix Digital Sensor System (DSS). The final ortho-rectified product is derived from higher-resolution original images.


NOAA Ortho-rectified Aerial Mosaic of Massachusetts Coastal Ports, 2011

NOAA's 2011 ortho-rectified mosaic tiles were created under the Integrated Ocean and Coastal Mapping initiative. The source imagery was acquired from June to September 2011 using an Applanix Digital Sensor System. The original aerial photographs were captured at a higher resolution than the final mosaic product.

Image, Geospatial, 🌎 North America, Aircraft, Massachusetts, Aerial, NOAA, Coastal Imagery, DOC/NOAA/NOS/NGS, National Geodetic Survey, Orthophoto mosaic, Photograph, Infrared Wavelengths, Mosaic, Aerial Photography, Cameras, Coastal, National Ocean Service, Earth Science, NGS Imagery, Rectified Image, Infrared Imagery, Continent, Digital Orthophotography, Orthophoto (+1 more)


Ortho-Rectified Aerial Mosaic of Buzzards Bay, Massachusetts from 2009

NOAA NGS ortho-rectified mosaic tiles created from imagery acquired between August 10 and October 21, 2009. The National Oceanic and Atmospheric Administration produced this data through its Integrated Ocean and Coastal Mapping initiative using an Applanix Digital Sensor System. The source imagery was acquired at a higher resolution than the final mosaic product.

Image, Geospatial, 🌎 North America, Aircraft, Massachusetts, Aerial, NOAA, Coastal Imagery, DOC/NOAA/NOS/NGS, National Geodetic Survey, Photograph, Infrared Wavelengths, Mosaic, Aerial Photography, Cameras, Coastal, National Ocean Service, Earth Science, NGS Imagery, Rectified Image, Infrared Imagery, Continent, Digital Orthophotography, Orthophoto (+1 more)


NOAA Ortho-rectified Aerial Mosaic of Coastal Maine, 2011

Ortho-rectified mosaic tiles were created from aerial imagery acquired on June 7, 2011, using an Applanix Digital Sensor System (DSS). This data product is part of the NOAA Integrated Ocean and Coastal Mapping initiative, covering the Maine coastline from Cutts Island to Prouts Neck. The source images were acquired at a higher resolution than the final ortho-rectified mosaic.

Image, Geospatial, 🌎 North America, Aircraft, Aerial, NOAA, Coastal Imagery, Maine, DOC/NOAA/NOS/NGS, National Geodetic Survey, Photograph, Infrared Wavelengths, Mosaic, Aerial Photography, Cameras, Coastal, National Ocean Service, Earth Science, NGS Imagery, Rectified Image, Infrared Imagery, Continent, Digital Orthophotography, Orthophoto (+1 more)


NOAA Near-Infrared Aerial Mosaic of Coastal Maine from 2011

Ortho-rectified mosaic tiles created by NOAA's Integrated Ocean and Coastal Mapping initiative. The source aerial imagery was captured from June 5 to June 7, 2011, using an Applanix Digital Sensor System. The final product is derived from higher-resolution original images.


Ortho-rectified Aerial Mosaic of New Bedford, Massachusetts from 2011

New Bedford, Massachusetts, is covered by ortho-rectified mosaic tiles produced by the NOAA Integrated Ocean and Coastal Mapping initiative. The source imagery was acquired on October 5, 2011, using an Applanix Digital Sensor System. The final mosaic is derived from higher-resolution original images.

Image, Geospatial, 🌎 North America, Aircraft, Massachusetts, Aerial, NOAA, DOC/NOAA/NOS/NGS, National Geodetic Survey, Orthophoto mosaic, Photograph, Infrared Wavelengths, New Bedford, Mosaic, Coastal Mapping, Aerial Imagery, Cameras, Coastal, National Ocean Service, Earth Science, NGS Imagery, Rectified Image, Infrared Imagery, Continent, Digital Orthophotography, Orthophoto (+1 more)


Danish ASR Unified: 3.5 Million Speech Samples from 7 Sources

A unified Danish speech recognition dataset combines approximately 3.5 million audio samples from seven distinct sources, totaling roughly 16,000 hours of speech. The collection includes European and Danish Parliament recordings, read-aloud and conversational speech, broadcast media, and crowd-sourced samples. It was created by syvai and last updated on the Hugging Face platform in April 2026.

Audio, Audio Dataset, Parliament Speech, Broadcast Media, Danish Language, Speech Recognition (+1 more)


Google Waxal ASR Challenge: Speech Recognition Competition Data

Google Waxal ASR Challenge data likely contains audio recordings and transcriptions for automatic speech recognition benchmarking. The dataset is hosted on Kaggle, a platform for data science competitions. Its specific size, collection method, and time range are not detailed in the available metadata.

Audio, Machine Learning, Challenge Dataset, Audio Processing, Speech Recognition (+1 more)


DMSP OLS Global Cloud and Nighttime Imagery Archive

DMSP OLS satellite data provides visible and infrared imagery for monitoring global cloud distribution and cloud top temperatures twice daily. The archive includes low-resolution global and high-resolution regional imagery from a 3,000 km scan, alongside satellite ephemeris and solar/lunar information. Data is sourced from the DMSP Operational Linescan System instruments and archived by NOAA NCEI.

Image, Geospatial, Nighttime Lights, Satellite Imagery, Earth Observation, Climate Data, Cloud Cover (+1 more)


Canada and Saint Kitts and Nevis Social Security Agreement

A bilateral social security agreement and administrative arrangement between Canada and the Federation of Saint Kitts and Nevis. The agreement coordinates the two countries' social security systems for individuals who have lived or worked in both jurisdictions. It was published by Global Affairs Canada and is archived as of February 2026, indicating it is out of date and for research purposes only.

Text, 🇨🇦 Canada, Social Security, Saint Kitts Nevis, International Coordination, Bilateral Agreements (+1 more)


LJspeech: Speech Audio Dataset for Text-to-Speech

A speech dataset likely intended for text-to-speech research, hosted on Kaggle. The dataset's author, organization, and specific content details are not provided in the metadata. Its original creation date and update history are unknown.

Audio, Text To Speech, Audio Dataset, Speech Synthesis (+1 more)


Visual Wake Words - Corrupted (VWW-C): Images with Synthetic Perturbations

Visual Wake Words - Corrupted (VWW-C) is a dataset derived from the Visual Wake Words benchmark, likely containing images with synthetic corruptions. Published on Kaggle, its specific size, license, and authorship details are not provided. Columns and sample data are unknown, suggesting metadata is minimal.

Image, Computer Vision, Image Classification, Model Robustness, Corrupted Data (+1 more)


Taiwanese Speech Utterances for Elder-Care Intent Classification

TaigiSpeech is a spoken language understanding dataset containing over 3,000 Taiwanese speech utterances from 21 speakers. Each utterance is labeled with one of 8 intent classes, designed for elder-care and smart-home voice command scenarios to support research in a low-resource language.

Format: audiofolder, Size Categories: 1K<n<10K, Intent Classification, Spoken Language Understanding, Modality: text, Taiwanese, Library: mlcroissant, Task Categories: audio classification, Library: datasets, License: CC BY 4.0, Region: us, arXiv: 2603.21478, Taigi (+1 more)


Tritia Gibbosula Shell Surface Preservation Counts from El Mnasra Cave

A 5.5 KB tabular dataset documents the surface preservation condition of Tritia cf. gibbosula mollusk shells from archaeological unit US 8 at El Mnasra cave. The dataset, authored by Emilie Campmas and shared under CC BY 4.0, provides a taphonomic record for paleontological and archaeological analysis.

Tabular, Excel, Mollusks, Taphonomy, Archaeology, Paleontology (+1 more)


Tunisian Arabic Speech Recognition Processed Audio Files

Processed audio files for Tunisian Arabic Automatic Speech Recognition (ASR). The dataset is hosted on Kaggle, but its size, creation date, and author are unknown. The title suggests it contains audio data that has been processed for use in speech recognition tasks.

Audio, Tunisian Arabic, Audio Processing, Speech Recognition (+1 more)


ASR_uaspeechdata: Speech Audio Data for Automatic Speech Recognition

ASR_uaspeechdata is a dataset published on Kaggle. The title suggests it contains audio data likely intended for training or evaluating automatic speech recognition systems. The dataset's specific content, size, and origin are not detailed in the available metadata.

Audio, Audio Data, Uaspeech, Speech Recognition (+1 more)


Slakh2100: 2,100 Synthesized Multi-Track Music Pieces with Stems

Slakh2100 is a large-scale dataset containing 2,100 automatically mixed music tracks with isolated instrument stems and aligned MIDI data. Created by Manilow et al. in 2019 at Northwestern University, it is designed for music information retrieval and source separation research. The dataset is hosted by schism-audio on Hugging Face.

Audio, Multimodal, Synthesized Music, Multi Track Audio, MIDI, Large Scale (+1 more)


Synthetic Duplex Speech Conversations For Real-Time Dialogue Training

104,478 fully synthetic duplex conversations provide 2,133 hours of 16kHz audio for training real-time spoken dialogue models. The dataset was created by author mailong225 for the RelayS2S hybrid architecture, converting text dialogues to speech. It was last updated on March 25, 2026.

arXiv: 2603.23346, Region: us (+1 more)
