Speech recognition, text-to-speech, speaker identification, music classification, audio event detection
1,907 datasets
A dataset from the Colombian government's Land Restitution Unit (URT) shows the number of judicial sentences issued per municipality under the Ethnic Route. It includes data on resolved requests and covers municipalities designated as PDET (Territorial Development Programs). The data was last updated on May 18, 2026, and is provided by www.datos.gov.co.
Maleo Short 1.5H is a manually curated and rigorously annotated dataset designed to benchmark state-of-the-art speaker diarization models. It focuses on complex, 'in-the-wild' media domains where models typically struggle, such as content with overlapping speech and sound effects. The dataset was created by maleo-ai and was last updated on Hugging Face in May 2026.
U–Pb zircon dating results from middle Permian tuffs in the Canning Basin of Western Australia, revealing an apparent conflict with established spore-pollen zonation. The dataset includes ages such as 267.04 ± 0.14 Ma from the Pittston SD-1 drillhole and comparative data from other core holes. It was published by Mory et al. in 2017 and is hosted by Geoscience Australia.
CT-GateNet, a hybrid neural network architecture, achieved classification accuracies of 98.72%, 89.42%, and 69.07% on the GTZAN, FMA-SMALL, and FMA-Medium music genre datasets, respectively. The 5.5 KB Excel file contains experimental datasets from this research, authored by Yunyan Ma and last updated in April 2026. The data is shared under a CC-BY-4.0 license on figshare.
A Malayalam-language subset of the Shrutilipi ASR corpus, originally curated by AI4Bharat. The dataset is a lightweight, language-specific version for researchers and developers focusing on Malayalam speech technology. It was uploaded by the author 'trysem' to Hugging Face.
HeiGIT generated this geospatial dataset for Saint Kitts and Nevis by applying deep learning to PlanetScope satellite imagery from 2020 and 2024. It maps surface types, width classes, and passability for approximately 100 km of arterial roads, including motorway, trunk, primary, and secondary classifications. The data supplements OpenStreetMap (OSM) attributes with AI-derived predictions to fill gaps in surface and width tagging.
A 59.3 KB CSV file from figshare, last updated April 2026. It contains gene expression data for the bacterial parasite Candidatus Aquirickettsia rohweri within the critically endangered coral Acropora cervicornis under ambient and nutrient-enriched conditions. The dataset was authored by Lauren Speare to investigate how nutrient enrichment influences parasite physiology and disease susceptibility.
Katie Schofield's MSc Psychology thesis data comprises interview transcripts from a research project on music performance. The dataset is 1.1 MB in size and was last updated on 2026-05-28. It is shared under a CC-BY-4.0 license on figshare.
Path analysis data on genetic polymorphisms (CaSR, CLDN14, VDR, ALPL) associated with kidney stone occurrence. The 59.0 KB XLSX file, authored by Widi Atmoko and last updated in May 2026, likely contains clinical variables and demographic characteristics for analysis. It is shared under a CC0-1.0 public domain license.
Data from a stepped wedge cluster randomised trial evaluating the FitSkills community-based physical activity intervention. The dataset supports the findings of a 2025 publication in the British Journal of Sports Medicine. It was authored by Nora Shields and colleagues and is available under a CC-BY-4.0 license.
RobotsMali's dataset is a collection of read Bambara text from educational children's books. It is designed to support the training and benchmarking of Automatic Speech Recognition models, with a focus on child speech and regional acoustics. The dataset is structured into two separate subsets for specialized training and evaluation.
18,000 query–response dialogue pairs across 12 emotion categories, intended for empathetic speech synthesis research. The dataset includes synthesized audio and text transcripts, with subsets for emotional and neutral queries. It was created by susameddin and last updated on Hugging Face in May 2026.
Oceanographic measurements collected in Massachusetts Bay and the surrounding area. The dataset covers a multi-year period from 2002 to 2005 and is provided by the National Aeronautics and Space Administration. Data includes parameters related to ocean chemistry, optics, temperature, and salinity.
NASA collected in-situ oceanographic data along the coastal regions of New Hampshire and Massachusetts during 2009. The dataset includes measurements related to ocean chemistry, optics, temperature, and salinity. It is available in BIN and ISO file formats.
SpeakerCard-1M is a speaker-centric corpus built on the VoxCeleb1 and VoxCeleb2 datasets. It was created by JYP2024 using a tool-first, LLM-last pipeline in which ten acoustic probes extract evidence for a structured schema. The dataset was last updated on June 3, 2026.
Tajik ASR Corpus v0 is a deduplicated collection for automatic speech recognition assembled from multiple sources. The dataset, created by Peacockery, includes data from FLEURS-derived speech, Mozilla Common Voice 25 Tajik, and augmented data from Muhtasham Tajik ASR. Each data split is provided in TSV format with an audio directory, and a SQLite version includes additional normalized fields.
KasaSpeech is a large-scale speech dataset, sourced primarily from Ghana, that features natural switching between English and Twi. It contains 49,878 transcribed audio samples, split into training, validation, and test sets. The dataset was created by Kennethdot and last updated on Hugging Face in May 2026.
A domain-specific Pashto automatic speech recognition dataset covering agriculture, general topics, food services, health, and services. The dataset is structured by domain, with audio files and corresponding transcript CSV files. It was created by Sabtain-Dev and last updated on June 5, 2026.
Mory et al. (2017) published zircon U–Pb ages from middle Permian tuffs in Western Australia's Canning Basin. The data reveals an apparent conflict between CA-IDTIMS ages and established spore-pollen zonation, with a specific age of 267.04 ± 0.14 Ma reported from the Pittston SD-1 drillhole. The dataset is hosted by the Australian Ocean Data Network and was last updated in April 2026.
MERIT is a dataset of audio triplets designed for training a framework that learns three independent music similarity spaces: melody, rhythm, and timbre. It was created by the AMAAI-Lab and is hosted on Hugging Face. The dataset page was last updated on 2026-05-26.