DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Speech & Audio Datasets | DataSalon

Speech & Audio

Speech recognition, text-to-speech, speaker identification, music classification, audio event detection

2,602 datasets

REMOTS Sediment Imagery for Buzzards Bay 2000 Survey

November 2000 REMOTS survey data for Buzzards Bay, Massachusetts, collected for the Massachusetts Coastal Zone Management Agency's Dredged Material Management Plan. The dataset represents the complete analyzed set of imagery from that survey. It was produced by the organization SCIOPS.

Geospatial, Coastal management, Buzzards Bay, Marine Sediment, Remots Imagery, +1 more

Scotts Bluff National Monument Water Quality GIS Layers

National Park Service GIS layers compiled for a Baseline Water Quality Data Inventory and Analysis Report for Scotts Bluff National Monument. The data includes locations of water quality monitoring stations, industrial discharges, drinking intakes, gages, and impoundments, sourced from six EPA databases. Base layers such as roads, hydrography, and political boundaries are included, generally at a scale of 1:100,000.

Geospatial, Hydrology, Water, Benchmark, Hydrography, Gering, Water Quality, Ne, National Park, Scotts Bluff National Monument, +1 more

EchoTTS Omnivoice En 20K: English Speech Synthesis Dataset

EchoTTS Omnivoice En 20K is a speech synthesis dataset authored by SynDataLab and hosted on Hugging Face. The dataset was last updated on April 15, 2026. Its specific content and scale are not detailed in the available metadata.

Audio, Text To Speech, Speech Synthesis, Voice Cloning, Audio Generation, +1 more

Seattle Curb Space Categories with Daily Updates

This dataset contains the curb space categories maintained by the Seattle Department of Transportation. It is refreshed daily and includes a feature class labeled SDOT.CURB_SPACES. An update on April 14, 2025 added a new category called 'MVZ' (Music Venue Zone).

Audio, Geospatial, Curb Space, Seattle Gis Open Data, Transportation, Urban Planning, Parking, Curb Space Categories, Sdot, Seattle, Common Data Layers, +1 more

Primary Spoken Languages for California Insurance Affordability Applicants

California data tracks the primary spoken language of applicants for Insurance Affordability Programs, sourced from the CalHEERS system. The dataset covers 13 specific languages including English, Spanish, Vietnamese, and Cantonese. It supports public reporting requirements under the California Welfare and Institutions Code.

Tabular, Multilingual, ZIP, CSV, Health Insurance, Public Assistance, California, United States, Demographics, +1 more

Mental Health Survey for US Elementary Music Educators

A survey dataset examines mental health challenges within elementary music education in the United States. It was created by Hamidreza Niknampour and uploaded to figshare in March 2026. The dataset is 25.0 KB in size, indicating a limited scope.

Tabular, Audio, English, Excel, Mental Health, Survey Data, Survey, Healthcare, United States, Elementary School, Music Education, +1 more

Bel Canto and Chinese Folk Song Singing Technique Recordings

This dataset, created by the authors at ccmusic-database, covers two distinct singing styles: bel canto and Chinese folk singing. Bel canto is a vocal technique from Western classical music, while Chinese folk singing represents a separate vocal tradition. The dataset was last updated on 2026-02-27.

Audio, Chinese Folk Song, Singing Technique, Bel Canto, Audio Analysis, +1 more

WebAtlasRP: Geospatial Web Map Service Layers from Germany

RP_WebAtlasRP is a collection of geospatial data layers published as a Web Map Service (WMS) by the Bundesamt für Kartographie und Geodäsie. The dataset was last updated on 2026-03-24. Its specific thematic content is not detailed in the available metadata.

Geospatial, 🇩🇪 Germany, Web Map Service, Cartography, +1 more

Ghana English Speech: 600 Hours of News Media Audio for ASR

A 600-hour speech dataset of Ghanaian English extracted from Ghanaian news media broadcasts, designed for training Automatic Speech Recognition models on West African accents. The dataset was created by the ghananlpcommunity and was last updated on March 6, 2026. It contains audio segments with verbatim transcriptions and duration metadata.

Audio, West Africa, News Media, English Language, Speech Recognition, Accent Dataset, +1 more

Pantheon 1.0: Historical Popularity Index of Figures from Wikipedia

Pantheon 1.0 measures the global popularity of historical characters using two metrics derived from Wikipedia. The simpler metric (L) counts the number of language editions with an article about a figure, while the Historical Popularity Index (HPI) adjusts for age, page view concentration, and cross-language views. The dataset was developed by the Macro Connections group at the MIT Media Lab.

Tabular, Biographical Data, Historical Popularity, Digital Humanities, Wikipedia, Cultural Impact, +1 more

GigaSpeech: 10,000 Hours of Multi-Domain English Transcribed Audio

GigaSpeech is a multi-domain English speech recognition corpus containing 10,000 hours of high-quality labeled audio released by SpeechColab in 2021. The data is aggregated from audiobooks, podcasts, and YouTube, capturing a mix of read and spontaneous speaking styles across topics like arts, science, and sports.

Parquet, task_categories:text-to-speech, library:polars, library:dask, language:en, size_categories:10M<n<100M, doi:10.57967/hf/6261, modality:text, library:mlcroissant, task_categories:text-to-audio, arxiv:2106.06909, library:datasets, region:us, task_categories:automatic-speech-recognition, multilinguality:monolingual, license:apache-2.0, +1 more

GigaSpeech ASR Clean: Filtered Speech Recognition Samples

A filtered dataset for automatic speech recognition (ASR) created by OpenSpeechHub. The dataset has been cleaned by removing samples with fewer than three words, repetitive tokens, or chat token leaks. It was last updated on March 31, 2026.

Audio, Filtered Data, Audio Processing, Speech Recognition, +1 more

Espeech Balalaika: 100K-1M Annotated Russian Audiobook Speech Records

MTUCI's lab260 team released this Russian speech corpus in early 2026, containing between 100,000 and 1,000,000 records. The dataset consists of audiobook recordings filtered and annotated using the BALALAIKA pipeline to support advanced generative speech tasks.

Parquet, task_categories:text-to-speech, library:polars, library:dask, modality:text, size_categories:100K<n<1M, arxiv:2507.13563, modality:tabular, library:mlcroissant, library:datasets, region:us, task_categories:automatic-speech-recognition, language:ru, license:apache-2.0, +1 more

SleepViz V12 Batch 10-3: Sleep Audio with Music and Silence Augmentation

Weak class augmentation part 3 focuses on music genres and silence noise. The dataset appears to be part of a larger series for sleep visualization or analysis. Its specific scale and creation details are not provided.

Audio, Audio Augmentation, Noise Generation, Sleep Analysis, Music Genres, +1 more

SleepViz V12 Batch 10-6: Environmental Sounds and Fan Noise

SleepViz V12 Batch 10-6 is an audio dataset focused on environmental sounds and fan noise. The dataset appears to be part of a series for weak class augmentation, suggesting its use in machine learning tasks. Its author, organization, and specific scale are unknown.

Audio, Augmentation, Environmental Sounds, Fan Noise, Sleep Audio, +1 more

MOMA: Broadband Seismometer Array Data from Missouri to Massachusetts

The Missouri to Massachusetts Broadband Seismometer Experiment deployed 18 broadband seismometers between two permanent stations, forming a linear 20-station array spanning 1740 km. The experiment, conducted by SCIOPS, aimed to study the core-mantle boundary, crust, mantle, and subducting slabs beneath the eastern U.S. The dataset was last updated in April 1996.

Time Series, Geospatial, Seismic Array, Geophysics, Seismology, Earth Structure, +1 more

Uzbek YouTube Speech: 100K+ Transcribed Audio Segments via Gemini 2.0

This collection contains between 100,000 and 1,000,000 Uzbek-language audio segments with transcriptions, sourced from YouTube by openbank-uz in early 2026. It uses vocal isolation to separate speakers and Google's Gemini 2.0 Flash model for automated transcription.

Audio, Arrow, Gemini, Text To Speech, task_categories:text-to-speech, Youtube, language:uz, modality:audio, modality:text, size_categories:100K<n<1M, library:mlcroissant, library:datasets, license:cc-by-nc-4.0, region:us, task_categories:automatic-speech-recognition, Speech Recognition, Uzbek, +1 more

Massachusetts Tree-Ring Chronology from 373 to 181 BP

Tree-ring width measurements provide a proxy climate record for a site in Massachusetts, USA. The chronology covers a 192-year period from 373 to 181 calendar years before present. Data is archived by NOAA's National Centers for Environmental Information World Data Service for Paleoclimatology.

Time Series, Geospatial, Tree Ring, Paleoclimatology, Dendrochronology, Climate Reconstruction, +1 more

Massachusetts Tree Ring Chronology from 455 to 279 BP

Tree ring width measurements from a site in Ipswich, Massachusetts, provide a 176-year climate record from 455 to 279 calendar years before present. The data is archived by the NOAA National Centers for Environmental Information under its World Data Service for Paleoclimatology. This specific chronology is part of the International Tree-Ring Data Bank.

Time Series, Geospatial, Tree Ring, Paleoclimatology, North America, Climate Reconstruction, +1 more

Dorchester Tree Ring Chronology from 588 to 268 Years Before Present

Tree ring width measurements from the Pierce House site in Dorchester, Massachusetts, used for paleoclimate reconstruction. The chronology covers a 321-year period from 588 to 268 calendar years before present. Data is archived and provided by the NOAA National Centers for Environmental Information World Data Service for Paleoclimatology.

Time Series, Geospatial, Massachusetts, Tree Ring, Paleoclimatology, Climate Reconstruction, +1 more
