Speech recognition, text-to-speech, speaker identification, music classification, audio event detection
2,021 datasets
Data on the consumption of communal resources by Dnipro Children's Music School No. 10, a communal institution of culture in Dnipro, Ukraine. The dataset likely contains utility usage metrics, such as water or electricity consumption, for the school. It was published on the States site of Ukraine and last updated on December 3, 2021.
Gujarati speech recordings and transcriptions categorized for Automatic Speech Recognition (ASR). This dataset provides audio-text pairs sourced from the OpenSLR repository to facilitate public access to Gujarati language resources.
Audio files for automatic speech recognition (ASR). The dataset is categorized as containing under 1,000 samples and is associated with the US region. It was last updated in January 2022.
84 hours of Sanskrit audio data for training automatic speech recognition models, uploaded by user 'addy88' to Hugging Face in December 2021. The dataset is categorized as containing 10K to 100K samples and includes text transcriptions.
Petersham, Massachusetts hosts ground-based soil moisture, soil temperature, and air temperature measurements from twenty-five temporary stations. The stations were installed across an area of approximately 23 km by 36 km in May 2019 and operated through 2022. The dataset is produced by NSIDC_CPRD and was last updated in October 2021.
An evaluation dataset for Automatic Speech Recognition (ASR) systems in the Sanskrit language. The dataset was created by user 'addy88' and published on the Hugging Face platform in December 2021. Its specific size and structure are not detailed in the provided metadata.
3,000 semi-natural Persian speech utterances totaling 3 hours and 25 minutes of audio extracted from online radio plays. The collection features 87 native speakers expressing five primary emotional states including anger, fear, happiness, and sadness.
A Danish speech dataset from Sprakbanken featuring audio recordings sampled at 16 kHz. The collection provides acoustic data specifically for the Danish language to support speech recognition and linguistic research.
Audio snippets paired with text transcriptions, sourced from book audio recordings. The dataset was created by JesseParvess and uploaded to Hugging Face in December 2021. Platform tags indicate it contains text and audio modalities for speech recognition tasks.
The LibriSpeech corpus contains approximately 1000 hours of read English speech, sampled at 16 kHz. It was prepared by Vassil Panayotov with assistance from Daniel Povey, derived from audiobooks in the LibriVox project.
The dataset contains the financial statements of the City Municipal Institution of Culture 'Dnipro Children's Music School No. 10'. It was published on the States site of Ukraine and last updated on October 20, 2021. The data is provided in an Excel (.XLSX) file format.
A dataset for Semantic Role Labeling (SRL) constructed from the Multi-News summarization corpus. It was created by the author 'rubenwol' and uploaded to the Hugging Face platform in November 2021. The dataset applies the Question Answer driven Semantic Role Labeling (QA-SRL) framework to news articles.
A 2021 update of a geospatial dataset mapping the sound classification of railway and tramway infrastructure in the Hérault department of France. The classification, established by prefectural decrees in 2014 and 2007, categorizes land transport infrastructure into five noise levels and defines affected areas on either side of the tracks. The data is provided by the Bureau de Recherches Géologiques et Minières (BRGM) as a Web Map Service (WMS).
A French departmental map service identifies land sectors impacted by noise from major transport infrastructure, as mandated by national law. The dataset is based on a prefectural classification of roads with over 5,000 vehicles per day, intercity rail lines with over 50 trains daily, and public transport lines with over 100 buses. It was last updated by the Bureau de Recherches Géologiques et Minières on September 3, 2021.
The Bureau de Recherches Géologiques et Minières provides a dataset mapping the sound classification of land transport infrastructure in the Maine-et-Loire department, France. The classification is mandated by French law (Law No. 92-1444 and the Environmental Code) and identifies sectors affected by noise based on traffic characteristics. The dataset was last updated on 2021-09-03.
A financial report details the consumption of communal resources for the City Municipal Institution of Culture 'Dnipro Children's Music School No. 14'. The data covers the period from January to August 2021 and was published on the States site of Ukraine. The dataset was last updated on September 9, 2021.
Multi-speaker, high-quality transcribed audio data for the Sinhala language, consisting of wave files and a TSV file. The data was manually quality checked; it was collected by Google in Sri Lanka and contributed by the Path to Nirvana organization.
Librispeech Local Dummy is an audio dataset for English speech recognition, hosted on Hugging Face by patrickvonplaten. The dataset was last updated on September 28, 2021. Specific details on size, row count, and recording methodology are not provided in the available metadata.
The LibriSpeech corpus contains approximately 1000 hours of read English speech audio, sampled at 16 kHz. It was prepared by Vassil Panayotov with assistance from Daniel Povey, derived from audiobooks in the LibriVox project.
Financial report of the City Municipal Institution of Culture 'Dnipro Children's Music School No. 14' on communal expenditures for the first half of 2021. The dataset was published on the States site of Ukraine open data platform on 2021-07-04. It likely contains detailed records of utility payments for a municipal cultural institution.