DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Speech & Audio Datasets | DataSalon

Speech & Audio

Speech recognition, text-to-speech, speaker identification, music classification, audio event detection

2,602 datasets

Protocol for Audio Classification

A protocol for audio classification tasks, published on Kaggle. The dataset's specific content, size, and features are not detailed in the provided metadata. Its author, organization, and last update date are unknown.

Audio, Machine Learning, Protocol, Audio Classification, +1

UK Live Music Booking Rates for Weddings, Pubs, and Corporate Events in April 2026

2,381 observations of live-music booking rates in the United Kingdom from April 2026. The data covers bookings for weddings, pubs, and corporate events. The original source is the GX Index, and it was published on Kaggle.

Tabular, Audio, 🇬🇧 United Kingdom, Live Music, Booking Rates, Entertainment Industry, +1

UK Musician Earnings by Genre, Region, and Venue for 2026

GigXchange provides annual income data for working musicians in the United Kingdom for the year 2026. The dataset likely contains breakdowns by musical genre, geographic region, and primary performance venue type. It was sourced from the Kaggle platform, but the original author and specific collection methodology are unknown.

Tabular, Music Industry, Income Data, UK Economy, Musician Earnings, +1

HiEn: Parallel Speech Corpus for Translation Tasks

A parallel speech dataset designed for speech translation. It appears to contain aligned audio in multiple languages. Its source, size, and creation details are not provided in the available metadata.

Audio, Natural Language Processing, Speech Translation, Audio Corpus, +1

TTS_PLShrimp: Text-to-Speech Dataset

A dataset titled 'TTS_PLShrimp' published on Kaggle. The name suggests it contains data for text-to-speech synthesis tasks. Metadata is minimal; the specific content, size, and origin require verification after download.

Audio, Text To Speech, Speech Synthesis, Audio Generation, +1

Deepspeech Balalaika: Russian Speech Corpus for Generative Tasks

A curated Russian speech dataset for advanced speech generative tasks. The corpus was filtered and annotated by the lab260 team at MTUCI using the BALALAIKA pipeline. It includes genres such as podcasts, public speech, YouTube content, audiobooks, phone calls, and TTS.

Audio, Russian Language, Speech Synthesis, Natural Language Processing, Audio Corpus, Speech Recognition, +1

Sediment Toxicity Measurements from Massachusetts Bay and U.S. Coastal Waters, 1991-1994

Sediment toxicity data collected from small vessels in Massachusetts Bay and other U.S. coastal waters. The dataset was published by NOAA_NCEI and covers a period from March 1991 to May 1994. Its specific content and structure require verification after download.

Tabular, Time Series, Environmental Monitoring, Marine Science, Coastal Data, Sediment Toxicity, +1

Slovak Female Speech Recordings for Text-to-Speech

Slovak speech recordings from a female voice, suitable for text-to-speech (TTS) model training. The dataset provides Slovak-language transcripts and audio at a 48,000 Hz sample rate.

License: cc-by-sa-4.0, Region: us, +1

LLM System Ops Production Telemetry SFT Data: Synthetic LLM Operations Metrics

A synthetic, production-style telemetry dataset for LLM operations analytics. It was created by author tarekmasryo and last updated on 2026-02-09. The data is designed for monitoring cost, latency, token usage, failures, safety flags, tool usage, and user feedback across interaction, session, and user levels.

Tabular, Telemetry, LLM Ops, Production Analytics, SFT Data, Synthetic Data, Synthetic, +1

Sentinel-2 ACOLITE-DSF Aquatic Reflectance for the Conterminous United States

A dataset of unitless water-leaving radiance reflectance values covering the conterminous United States. The data was produced by the United States Geological Survey using the ACOLITE software's dark spectrum fitting algorithm for atmospheric correction.

Geospatial, Aquatic Reflectance, Water, Satellite Imagery, Natural Resource, Earth Observation, Cog, +1

Somali ASR Subset 68H: A Speech Dataset for Automatic Speech Recognition

Somali ASR Subset 68H is a speech dataset published on the Hugging Face platform by DDD-Kenya. The title suggests it contains Somali-language audio, likely intended for automatic speech recognition (ASR) tasks. The record was last updated on March 19, 2026, but detailed metadata about its size, format, and contents is unavailable.

Audio, Audio Dataset, African Languages, Somali Language, Speech Recognition, +1

Nigerian Accent Speech Data for English and Nigerian Pidgin

A speech dataset containing recordings of Nigerian-accented English and Nigerian Pidgin, intended for research and development in speech technology. The dataset includes audio files paired with transcriptions. It was authored by AlaminI and last updated on February 8, 2026.

Audio, Text To Speech, African English, Pidgin, Accent Identification, Speech Recognition, +1

VocalSet: 10.1 Hours of Professional Singing Recordings

VocalSet is a singing voice dataset containing 10.1 hours of monophonic audio recordings. It features 20 professional singers (9 male, 11 female) performing standard and extended vocal techniques on five vowels. The dataset was created by Julia Wilkins to support singing voice research.

Audio, Acoustics, Computer Science, Singing Voice, Physics, Singing, Speech Recognition, +1

Cold War Crisis Analysis of Military and Civilian Influence

A book analyzing military and civilian influence on decisions about the use of force in U.S. foreign affairs. The work examines twenty intervention decisions and ten escalation decisions during crises, including cases in Korea, Berlin, Cuba, and Vietnam. An updated edition includes a preface and epilogue discussing recent cases and declassified information.

Text, History, Engineering, Computer Science, Military History, Psychology, Cold War, Intervention Counseling, Operations Research, Law, Variety Cybernetics, Artificial Intelligence, Communism, Crisis Analysis, Economic History, Political Science, Politics, +1

Innovation in Accompanist Music for Wayang Kulit Performance

This dataset examines changes in accompanist music for Wayang Kulit (leather puppet) performances from a performing art management perspective. The research focuses on innovations driven by a need to engage younger audiences in Java and Bali, Indonesia. Specific data dimensions such as row count, column count, and sample data are not provided.

Marketing, Music, Innovation, Otherart Management, +1

Innovation in Wayang Kulit Accompanist Music for Performing Art Marketing

This dataset examines changes in accompanist music for Wayang Kulit (leather puppet) performances in Java and Bali, Indonesia, from a performing art management perspective. It was authored by Setyabudhi R. Situmorang and last updated in February 2026.

Marketing, Music, Innovation, Otherart Management, +1

Isolated Guitar Chord Recordings For Real-World Audio Classification

Isolated guitar chord recordings are designed for audio classification tasks like chord recognition and real-time music analysis. The dataset was recorded on a Fender FA-15 3/4 acoustic guitar in realistic acoustic conditions, including minor background sounds, to improve inference robustness. The author is rodriler, with a last recorded update in February 2026.

Audio, Chords, Music Analysis, Acoustic Instrument, Chord Recognition, Guitar, +1

Emotion TTS: Text-to-Speech Data with Emotional Labels

A dataset published on Hugging Face by skit-ai and last updated on 2026-03-25. Titled 'Emotion TTS', it likely contains audio samples and associated metadata for text-to-speech synthesis. Its specific content, scale, and structure require verification after download.

Audio, Text To Speech, Speech Synthesis, Emotion, Audio Generation, +1

Deepfake Audio Dataset for Real vs Fake Speech Detection

A speech dataset designed for deepfake audio detection, containing both real and fake audio samples. The dataset was sourced from Kaggle, but the author, organization, and specific collection details are unknown. The total size, number of samples, and last update date are not provided.

Audio, Audio Forensics, Fake News, Speech Analysis, Deepfake Detection, +1

Nirantar: 22-Language Indian Speech Dataset

Speech subsets for 22 Indian languages, provided in Parquet format for the Hugging Face ecosystem. The collection includes language-specific configurations for modular access to audio data and transcriptions sourced from the AI4Bharat Nirantar project.

Languages: doi, mai, bn, ml, ne, sa, hi, brx, sd, kn, or, pa, kok, mr, mni, ks, sat, gu; Task Categories: text-to-speech, automatic-speech-recognition; +1
