Speech recognition, text-to-speech, speaker identification, music classification, audio event detection
1,962 datasets
168,385 MIDI music files are paired with descriptive text captions and a set of extracted musical features. The captions were generated by the amaai-lab using a pipeline that extracts MIR features and employs the Claude 3 LLM for in-context learning captioning. The dataset was last updated in March 2025.
A 2004 prefectural decree defines acoustic zones for land transport infrastructure in Finistère, France, including a section straddling Morbihan. The Bureau de Recherches Géologiques et Minières provides these data as the latest available, segmented by municipality. The zones do not prohibit construction but require facade sound insulation calculations for new buildings.
1.2k hours of high-quality Kazakh speech audio form the KSC2 corpus. The dataset, created by author 'issai', consolidates and supplements previous Kazakh speech and text-to-speech corpora with data from sources like TV programs, radio, senate sessions, and podcasts. It was last updated on the Hugging Face platform on February 28, 2025.
A collection of 3,992 audio clips of Kinyarwanda text-to-speech recordings made by a single voice actress in a studio setting. It was collected as part of the Mbaza project and includes a CSV file linking audio file names to their corresponding written text.
Transcriptions and metadata for approximately 398 hours of Ukrainian speech audio from the VOA Ukrainian dataset. It was created by Yehor and updated in February 2025 for automatic speech recognition tasks.
Image data categorized into over 34 indoor scene classes including specialized environments like 'studiomusic', 'hospitalroom', and 'inside_bus'. It provides labeled examples for computer vision tasks focused on identifying specific architectural and functional interior spaces.
The FMA (Free Music Archive) Large dataset is a collection of audio files for music analysis. It is associated with a research paper published on arXiv in 2016 and is hosted on GitHub. The dataset's integrity can be verified using provided SHA1 checksums.
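As a minimal sketch of the checksum verification mentioned above: the snippet below hashes a downloaded file with SHA1 and compares the result to a published digest. The function names `sha1_of_file` and `matches_checksum` are hypothetical, and the exact layout of the dataset's checksum listing is not described here, so the expected digest is simply passed in as a string.

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_sha1):
    """Compare a file's SHA1 digest to a published hex digest (assumed format)."""
    return sha1_of_file(path) == expected_sha1.strip().lower()
```

Reading in chunks keeps memory use flat regardless of archive size, which matters for multi-gigabyte audio downloads.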
An aggregation of Type B strategic noise maps for national roads, created pursuant to European Directive 2002/49/EC and a French decree from 2006. The data represents sectors affected by noise as defined in prefectural sound classification boards and was aggregated using the QGIS MIZOGEO plugin from CEREMA. The dataset was last updated on May 3, 2021.
European Directive 2002/49/EC mandates the harmonised assessment of environmental noise exposure. This dataset is an aggregation of Type B strategic noise maps for national roads, created using the QGIS MIZOGEO plugin provided by CEREMA. The data was last updated on May 3, 2021, and originates from the French geological and mining research bureau (BRGM).
Approximately 1,000 hours of read English speech audio, prepared by Vassil Panayotov with assistance from Daniel Povey. The audio is derived from LibriVox audiobooks, segmented and aligned, with a 16 kHz sampling rate.
Turkish Neural Voice consists of 100,000 to 1,000,000 synthetic audio samples and transcriptions generated by erenfazlioglu using Microsoft Text to Speech services. Updated in November 2024, this collection provides paired audio-text data specifically for the Turkish language. The dataset is structured for speech synthesis and recognition tasks using neural voice technology.
35.9 hours of Vietnamese audio generated for text-to-speech applications. The text source is a collection of public domain novels and short stories by author Vu Trong Phung. The audio was synthesized using the Google Text-to-Speech offline engine on Android.
Tree ring width measurements from Seagull Lake, Minnesota, covering 325 to -21 calendar years before present (BP). This paleoclimatology dataset, archived by NOAA NCEI's World Data Service, is a Tree Ring study used for reconstructing past climate conditions. The data was last updated in the NOAA system in 1971.
The Missouri to Massachusetts Broadband Seismometer Experiment deployed 18 broadband seismometers between two permanent stations, forming a linear 20-station array spanning 1740 km. The experiment, conducted by SCIOPS, aimed to study the core-mantle boundary, crust, mantle, and subducting slabs beneath the eastern U.S. The dataset was last updated in April 1996.
104 soil, wood, hay, and feed samples were collected from Hut Point, Cape Evans, and Cape Royds in Antarctica. Molecular DNA probes were used to detect anthrax DNA at a level of approximately 3-4 spores per gram dry weight. The samples were catalogued for RNA and DNA extraction by SCIOPS, with data last updated in 2004.
111 years of tree ring data from the Harvard Forest in Petersham, Massachusetts, covering the period from 53 to -58 calendar years before present. The dataset was archived by NOAA's World Data Service for Paleoclimatology and provides parameters for climate reconstruction. It was last updated in 2008.
Digitized internal wave packets from 2003, provided by the National Oceanic and Atmospheric Administration's National Centers for Environmental Information (NOAA NCEI). These features were extracted from Synthetic Aperture Radar (SAR) imagery at a 1:350,000 scale. The dataset captures groups of waves occurring at density interfaces in the ocean, forced by tides over underwater topography.
Tree ring width measurements from a site in Colorado, United States, covering a period from 270 to -14 calendar years before present. The dataset is archived by the NOAA National Centers for Environmental Information (NCEI) World Data Service for Paleoclimatology. The associated study type is Tree Ring and the data was last updated in 1964.
Tree ring data from British Columbia, Canada, spans 505 calendar years from 490 to -15 years before present. This archived paleoclimatology study is maintained by NOAA's National Centers for Environmental Information under its World Data Service. The data was archived in 1965.
Tree ring width measurements from Nicollet Lake in Minnesota, USA, used for paleoclimate reconstruction. The chronology covers 244 years, from 223 to -21 calendar years before present. Data is archived by the NOAA National Centers for Environmental Information (NCEI) World Data Service for Paleoclimatology.
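The tree ring entries above give their coverage in calendar years before present (BP), including negative values. As a rough sketch, assuming the standard convention that "present" is 1950 CE, the helpers below convert a BP range to calendar years and compute its span. The names `bp_to_ce` and `bp_span` are hypothetical, and the conversion ignores the missing year zero, which does not matter for these ranges.

```python
BP_REFERENCE_YEAR = 1950  # assumed convention: "present" means 1950 CE


def bp_to_ce(years_bp):
    """Convert calendar years BP to a CE year; negative BP is after 1950."""
    return BP_REFERENCE_YEAR - years_bp


def bp_span(oldest_bp, youngest_bp):
    """Return (first CE year, last CE year, span in years) for a BP range."""
    return bp_to_ce(oldest_bp), bp_to_ce(youngest_bp), oldest_bp - youngest_bp


# Nicollet Lake chronology: 223 to -21 calendar years BP.
print(bp_span(223, -21))  # (1727, 1971, 244)
```

Under this convention the Harvard Forest range of 53 to -58 BP maps to 1897 through 2008, a span of 111 years.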