DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Speech & Audio Datasets | DataSalon

🎤

Speech & Audio

Speech recognition, text-to-speech, speaker identification, music classification, audio event detection

2,587 datasets

Massachusetts State Employee Payroll with Position and Department Details

Commonwealth Of Massachusetts Payrollv3 is a production dataset from the Statewide Payroll site. It contains payroll information for state employees, as indicated by columns such as NAME_FIRST, NAME_LAST, POSITION_TITLE, DEPARTMENT_DIVISION, and various pay-related fields. The data is hosted by cthru.data.socrata.com and was last updated on 2026-03-17.

Tabular, CSV, XML, JSON, Employee Compensation, Massachusetts, Prod, Public Sector, Government Payroll, +1 more

Mooré Language Audio-Text Corpus from Religious Sources

Mooré Speech Bible is a curated collection of aligned audio and text data in the Mooré language (ISO 639-3: mos), gathered from publicly available religious sources. The dataset is intended for research in low-resource speech and language processing. It was created by goaicorp and was last updated in April 2026.

Audio, Multimodal, Speech Synthesis, Mooré Language, Natural Language Processing, Audio Text Alignment, Low Resource Language, Speech Recognition, +1 more

VozBR-BrandVoice: Brand-Voice Compliance Pairs for Brazilian Portuguese

Brand-voice compliance pairs for Brazilian Portuguese institutions. The dataset appears to contain text pairs, likely for training or evaluating models on brand voice consistency. The author, organization, size, and last update date are unknown.

Text, Audio, Text Pairs, Compliance, Brand Voice, Brazilian Portuguese, +1 more

Supplementary materials for the article "Perceiving musical interaction between digital and acoustic instruments: A case study with the Karlax"

Supplementary materials for the article "Perceiving musical interaction between digital and acoustic instruments: A case study with the Karlax". The repository contains the raw results of a free categorization experiment and the video stimuli used in that experiment. Linglan Zhu and collaborators prepared the data, which was last updated on April 25 (year not listed).

Tabular, Audio, Video, Acoustic Instruments, Music Perception, Digital Instruments, Experimental Data, Human Computer Interaction, +1 more

Vaja-Thai: Combined Thai Speech Dataset for TTS Research, 554.6 Hours

Vaja-Thai is a unified Thai speech dataset containing 289,916 audio samples totaling 554.6 hours for Text-to-Speech research. The dataset was created by dubbing-ai and last updated in April 2026. All audio is resampled to 24 kHz WAV format and combines multiple quality-filtered sources.

Audio, Text To Speech, Audio Dataset, Speech Synthesis, Thai Language, +1 more

Massachusetts Shellfish Landings by Reporting Area, 1990-2001

25 distinct statistical reporting areas cover a large portion of the Gulf of Maine and south, including Massachusetts territorial waters. These data represent commercial shellfish landings in bushels, recorded by species, harvest location, and year from 1990 to 2001. Records originate from the Massachusetts Division of Marine Fisheries' Commercial Shellfish Landings Database and are used for mapping and annual publications.

Tabular, Time Series, Geospatial, Fisheries, Shellfish, Finance, Marine Biology, +1 more

Nuer Language Text-to-Speech Data

A speech synthesis dataset for the Nuer language, published on Kaggle. The dataset likely contains audio recordings paired with corresponding text transcripts. Specific details on size, collection method, and origin are not provided in the available metadata.

Text, Audio, Text To Speech, Nuer Language, Speech Synthesis, Low Resource Language, +1 more

TTS API Test Workspace: Audio Samples for Speech Synthesis

A workspace for testing Text-to-Speech APIs, likely containing audio samples or related metadata. The dataset is hosted on Kaggle and is associated with platform tags for speech synthesis and API testing. Specific details on the data volume, creation date, and author are not provided in the available metadata.

Audio, Text To Speech, Speech Synthesis, API Testing, +1 more

1DamnAudio: 221,929 Urdu Audio-Text Pairs for Speech Synthesis

1DamnAudio is a large-scale Urdu text-to-speech dataset created by mahwizzzz and last updated on 2026-03-18. It contains 221,929 audio-text pairs totaling 208.5 hours of speech. The audio is stored as 16 kHz WAV files with an average duration of 3.38 seconds.

Audio, Text To Speech, Urdu Speech, Speech Corpus, Large Scale, Audio Synthesis, +1 more

AI-Powered Music Recommendation System for Emotion Detection

A dataset for personalized music suggestions based on emotion detection. The description indicates it is designed for an AI-powered recommendation system. The dataset's specific size, origin, and update history are not provided.

Tabular, Audio, Emotion Detection, Music Recommendation, Personalization, +1 more

CaptionAI Indonesian ASR Dataset

An audio dataset for Indonesian Automatic Speech Recognition (ASR) tasks, published on Kaggle. The dataset likely contains speech recordings and corresponding transcriptions. Specific details on size, collection method, and origin are not provided in the available metadata.

Audio, Speech Corpus, Indonesian Language, Automatic Speech Recognition, +1 more

Mooré Spoken Riddles Dataset for Speech Technology

A spoken collection of traditional Mooré riddles created for academic and research use in low-resource speech and language technologies. The dataset was created by goaicorp and was last updated in April 2026. It is designed to support work in text-to-speech, automatic speech recognition, and oral tradition modeling for the Mooré language.

Text, Audio, Oral Tradition, African Languages, Low Resource Language, Speech Recognition, +1 more

Mooré and French Proverbs Parallel Audio-Text Corpus

Goai Moore Speech Proverbs is a bilingual audio-text corpus of traditional proverbs in Mooré and French. Created by goaicorp, it is designed for research in low-resource speech and language processing. The dataset was last updated in April 2026.

Text, Audio, Speech Processing, Natural Language Processing, African Languages, Bilingual Corpus, Low Resource Language, +1 more

Mooré and French Proverbs Parallel Audio-Text Corpus

Goai Moore Speech Proverbs is a bilingual audio-text corpus containing traditional proverbs in the Mooré and French languages. It was created by goaicorp for research in low-resource speech and language processing, with the dataset page last updated in April 2026. The dataset is specifically designed for academic purposes in text-to-speech and automatic speech recognition for Mooré.

Text, Audio, Proverbs, Speech Corpus, Natural Language Processing, African Languages, Bilingual Text, Low Resource Language, +1 more

Zero-Shot Text-to-Speech Robustness Benchmark Across Acoustic Regimes

A 2026 benchmark from KRAFTON provides 6,000 prompt–text pairs for evaluating zero-shot text-to-speech models. It covers four acoustic regimes: Clean, Noisy, Wild, and Emotional, using prompts from 12 different datasets. This framework aims to assess model robustness in realistic and challenging recording scenarios.

Tabular, Audio, English, CSV, Size Categories: 1K<n<10K, Text To Speech, Task Categories: text-to-speech, Library: polars, Language: en, License: CC BY-NC-ND 4.0, Speech Synthesis, Modality: text, Zero Shot Learning, Modality: tabular, Library: mlcroissant, Evaluation, Library: datasets, Benchmark, Library: pandas, Zero Shot TTS, Region: us, Robustness, Robustness Evaluation, +1 more

Traditional Mooré Folk Stories Speech Corpus

A spoken corpus of traditional Mooré folk stories (contes) designed for low-resource language research. The dataset was created by goaicorp for academic purposes and was last updated in April 2026. Access to the data is gated and requires a request.

Text, Audio, Speech Corpus, Mooré Language, Natural Language Processing, Folk Stories, Low Resource Language, +1 more

Traditional Mooré Folk Stories Speech Corpus

A spoken corpus of traditional Mooré folk stories (contes) designed for research in low-resource speech and language processing. The dataset was created by goaicorp for academic purposes and was last updated in April 2026.

Text, Audio, Speech Corpus, Mooré Language, Natural Language Processing, Folk Stories, Low Resource Language, +1 more

TTS Payload Topic: The Science of Cold Exposure

A dataset concerning the science of cold exposure, likely containing structured information on physiological responses or experimental parameters. It originates from Kaggle, a platform for sharing datasets. The specific content, size, and creation details are not provided in the metadata.

Tabular, Cold Exposure, Health Science, Physiology, +1 more

Ytmusics: Audio Data from YouTube Music

Ytmusics is a dataset hosted on HuggingFace by NathMen12. The dataset's specific content and structure are not described in the available metadata. It was last updated on 2026-05-18 18:04:45.

Audio, Music Recommendation, YouTube Music, +1 more

Arabic Saudi Multi-Speaker TTS Dataset in LJSpeech Format

A dataset for training Text-to-Speech models, including XTTS_v2, YourTTS, and Tacotron. It contains audio in the LJSpeech format, featuring multiple speakers of the Saudi dialect of Arabic. The dataset was created by Abdelrahman2922 and was last updated on March 30, 2026.

Audio, Text To Speech, Saudi Dialect, Arabic Language, Multi Speaker, Audio Synthesis, +1 more
