Speech recognition, text-to-speech, speaker identification, music classification, audio event detection
1,909 datasets
Between 100,000 and 1,000,000 Spanish audio segments and transcriptions derived from LibriVox audiobooks. The dataset, created by Cnam-LMSSC and updated in March 2026, extends the Multilingual LibriSpeech (MLS) corpus with machine-generated phonetic transcriptions.
UK Live Music Booking Rates 2026 (May) contains 3,847 booking rates for live music in the UK. The data is organized by city, band size, and event type. The dataset was sourced from Kaggle, but the author, organization, and specific collection method are unknown.
32,267 audio samples totaling 103.18 hours of Vietnamese speech, curated for automatic speech recognition. The dataset, created by thanhnew2001, was last updated in February 2026. It is structured into 29,041 training and 3,226 development samples.
Tts Male 70H is a text-to-speech dataset published on Hugging Face by user vfdanil. The title suggests it contains audio samples of a male voice, likely for speech synthesis tasks. The dataset was last updated on April 22, 2026.
Kaggle hosts a dataset titled 'Noise reduction'. The dataset's content, size, and specific source are not detailed in the provided metadata. Its last update date and licensing information are also unknown.
Ayf3 published the Numberblocks One Voice Dataset on Hugging Face in April 2026. The dataset likely contains audio recordings related to the Numberblocks children's media franchise. Its specific content, size, and structure require verification after download.
Sarah Lyons Watts's book presents a psychological portrait of the 26th U.S. president, Theodore Roosevelt. The work analyzes his personal obsession with masculinity and its influence on national politics, as noted by contemporary figures like Woodrow Wilson. It is a textual analysis of Roosevelt's legacy, sourced from the paperswithcode platform.
Coastal Massachusetts dive sites include reefs, wrecks, jetties, and breakwaters popular for SCUBA diving. Data points were compiled by the Massachusetts Office of Coastal Zone Management from the Board of Underwater Archaeological Resources and dive club listings. The Massachusetts Office of Coastal Zone Management updated this layer on July 2, 2007.
Point locations document federal dredge projects by the US Army Corps of Engineers along the Massachusetts coastline. The data is historical, with records up to December 16, 1998. The dataset was compiled by the organization SCIOPS.
Polygonal extents document federal dredging projects by the US Army Corps of Engineers along the Massachusetts marine coastline. The dataset includes navigational channels, anchorages, harbors, beaches, and dikes, with historical records up to December 16, 1998. It was compiled by the organization SCIOPS.
Geospatial arcs represent commercial, charter, and recreational boating uses within the Massachusetts Coastal Zone. The data delineates three distinct activity subtypes. It was compiled by SCIOPS from expert workshops held in Boston and Waquoit in June 2005.
November 2000 REMOTS survey data for Buzzards Bay, Massachusetts, collected for the Massachusetts Coastal Zone Management Agency's Dredged Material Management Plan. The dataset represents the complete analyzed set of imagery from that survey. It was produced by the organization SCIOPS.
National Park Service GIS layers compiled for a Baseline Water Quality Data Inventory and Analysis Report for Scotts Bluff National Monument. The data includes locations of water quality monitoring stations, industrial discharges, drinking intakes, gages, and impoundments, sourced from six EPA databases. Base layers such as roads, hydrography, and political boundaries are included, generally at a scale of 1:100,000.
EchoTTS Omnivoice En 20K is a speech synthesis dataset authored by SynDataLab and hosted on Hugging Face. The dataset was last updated on April 15, 2026. Its specific content and scale are not detailed in the available metadata.
Curb Space Categories is maintained by the Seattle Department of Transportation. The dataset is refreshed daily and includes a feature class labeled SDOT.CURB_SPACES. An update on April 14, 2025 added a new category called 'MVZ' (Music Venue Zone).
California data tracks the primary spoken language of applicants for Insurance Affordability Programs, sourced from the CalHEERS system. The dataset covers 13 specific languages including English, Spanish, Vietnamese, and Cantonese. It supports public reporting requirements under the California Welfare and Institutions Code.
A survey dataset examines mental health challenges within elementary music education in the United States. It was created by Hamidreza Niknampour and uploaded to figshare in March 2026. The dataset is 25.0 KB in size, indicating a limited scope.
A 600-hour speech dataset of Ghanaian English extracted from Ghanaian news media broadcasts, designed for training Automatic Speech Recognition models on West African accents. The dataset was created by the ghananlpcommunity and was last updated on March 6, 2026. It contains audio segments with verbatim transcriptions and duration metadata.
Pantheon 1.0 measures the global popularity of historical characters using two metrics derived from Wikipedia. The simpler metric (L) counts the number of language editions with an article about a figure, while the Historical Popularity Index (HPI) adjusts for age, page view concentration, and cross-language views. The dataset was developed by the Macro Connections group at the MIT Media Lab.
GigaSpeech is a multi-domain English speech recognition corpus containing 10,000 hours of high-quality labeled audio released by SpeechColab in 2021. The data is aggregated from audiobooks, podcasts, and YouTube, capturing a mix of read and spontaneous speaking styles across topics like arts, science, and sports.