Speech recognition, text-to-speech, speaker identification, music classification, audio event detection
1,907 datasets
10,000 Hindi utterances across six Vistaar-derived parts provide a benchmark for speech-to-text systems. The dataset contains about 15.5 hours of 16 kHz mono WAV audio, with each utterance paired with a reference transcript and outputs from four ASR services. It was published by RinggAI and last updated in April 2026.
100-meter resolution gridded population estimates for Saint Kitts and Nevis, created using a Random Forest-based dasymetric redistribution method. The dataset provides annual estimates of the total number of people per pixel from 2015 to 2030 in GeoTIFF format. WorldPop produced this 2025 Alpha release version in September 2025.
Bibliothèque et Archives nationales du Québec (BAnQ) provides a dataset of all ISBN-registered documents published in Quebec since 2010, acquired through legal deposit, purchase, or donation. The collection includes musical scores, official publications, books, and show programs.
Advanced Placement exam participation counts for all Massachusetts public school students from 2007 onward. Data is disaggregated by test subject, student demographic group, and school district. The Massachusetts Department of Elementary and Secondary Education publishes this dataset.
128.6 hours of long-form conversational speech data designed as a benchmark for automatic speech recognition. The dataset features diverse English accents across 16 service domains, with conversations lasting 5–15 minutes. It was created by apptek-com and last updated on the Hugging Face platform in April 2026.
Financial records from the KZSMO "Music School No. 3" in the KMR region of Ukraine. The dataset contains acts and invoices from 2019 to the present, sourced from the States site of Ukraine. It was last updated on May 6, 2026.
Daily-updated information on music events in Brisbane, sourced from the Trumba Calendar API. The dataset includes details on event dates, costs, booking requirements, venues, and locations, with the external feed limited to the next 1,000 upcoming events.
1,000 upcoming events are listed in this daily-updated schedule for the Brisbane Festival, a major annual Australian arts festival held each September. The dataset is sourced from the Trumba Calendar API and maintained by Brisbane City Council, with the most recent update in March 2026.
A 9.5 KB Excel spreadsheet summarizing Rickettsia typhi strains, including their origin and references for macrolide target gene analysis. The dataset was authored by Weerawat Phuklia and last updated on April 27, 2026. Its small size suggests a focused collection of bacterial strain metadata.
WildElder is a speech dataset focused on elderly scenarios, containing raw audio and corresponding text annotations. The data was collected and cleaned from real-world environments to preserve diversity. The dataset is authored by Hui519 and was last updated on 2026-05-02.
The Global Internal Displacement Database (GIDD) from the Internal Displacement Monitoring Centre (IDMC) provides validated annual estimates of internal displacement. This dataset for Saint Kitts and Nevis includes figures for people living in displacement at year-end and counts of new displacement incidents. The data is licensed under CC-BY-3.0-IGO and was last updated on 2026-03-18.
An Egyptian Arabic dataset combining text and audio, annotated with emotions and speaker diarization. Created by OmarAhmedSobhy, it is designed for training Text-to-Speech and Automatic Speech Recognition models. The dataset was last updated on May 7, 2026.
HanMATE proteins are associated with various ASR MATE proteins in other plants based on more than 75% sequence similarity. The dataset was authored by Mohammad Nazmol Hasan and last updated on April 13, 2026. It is a 13.5 KB Excel file available under a CC-BY-4.0 license.
A benchmark dataset created by SkunkWorkLabs, last updated in May 2026, for evaluating Hindi automatic speech recognition (ASR) systems. It compares the performance of the SkunkWorks model against commercial providers like ElevenLabs, Deepgram, and Sarvam. The evaluation is conducted across six distinct subsets sourced from projects like AI4Bharat Kathbath, Mozilla Common Voice, and Google FLEURS.
CEAEval-D is a Mandarin speech dataset annotated for context-rich expressive appropriateness. It was released by TianRW in association with an ACL paper and is hosted on Hugging Face. The dataset was last updated on May 10, 2026.
ViVoice-34 is a Vietnamese speech dataset featuring audio recordings from speakers across 34 provinces of Vietnam. Each audio sample includes full transcripts and metadata about the speaker and content. The dataset was created by anonymous-vivoice34 and was last updated on Hugging Face in May 2026.
This dataset tracks humanitarian funding and disaster response actions in Saint Kitts and Nevis, managed by the International Federation of Red Cross and Red Crescent Societies (IFRC). It documents Emergency Appeals for large-scale disasters and Disaster Response Emergency Fund (DREF) allocations for smaller crises. The data is provided in CSV format and was last updated in March 2026.
51,021 pre-computed latent representations for Urdu utterances, designed to bypass audio decoding during TTS model training. The latents are derived from the Humair332/Urdu-munch-1 audio source using the Aratako/Semantic-DACVAE-Japanese-32dim codec at a 25 Hz frame rate. Author zuhri025 uploaded this dataset to Hugging Face in April 2026.
Climate TRACE provides annual and monthly greenhouse gas and air pollutant emission estimates for Saint Kitts and Nevis starting from 2015. The inventory covers country-level aggregates by sub-sector and gas, alongside source-level monthly data and confidence scores beginning in 2021.
Version 3.1 geo-located Delay Doppler Maps from the CYGNSS satellite constellation provide calibrated ocean surface scattering measurements. At most 8 netCDF files are generated daily, typically from 6 to 8 spacecraft, with a latency of approximately 6 days from measurement. This dataset, produced by NASA, supersedes Version 3.0 with improved antenna gain patterns and corrections for radio frequency interference.