Speech recognition, text-to-speech, speaker identification, music classification, audio event detection
1,909 datasets
A digital geologic-GIS map of the Wetherill Mesa Quadrangle in Colorado, adapted from a 1999 National Park Service geologic map. The dataset includes GIS data layers and tables available in file geodatabase and geopackage formats, along with ancillary PDF documents containing unit descriptions and metadata. It was produced by the National Park Service's Geologic Resources Inventory program.
A National Park Service Geologic Resources Inventory digital map of the Cortez Quadrangle, Colorado, derived from a 1999 source map. The dataset includes GIS data layers, tables, and ancillary documents such as unit descriptions. Based on a source map scale of 1:24,000, features have a stated horizontal locational accuracy within 12.2 meters (40 feet).
A digital geologic-GIS dataset for Mesa Verde National Park and vicinity, Colorado, composed of GIS data layers and tables. The data were completed as a component of the National Park Service's Geologic Resources Inventory program, adapted from source maps by Griffitts (1999). It is available in multiple GIS formats including a file geodatabase, OGC geopackage, and KMZ/KML for Google Earth.
June 2006 GIS data showing potential or developing locations for desalination plants along the Massachusetts coast. The data are preliminary and speculative, compiled from public media reports, state/federal regulatory filings, local meeting proceedings, and private contractor studies. The dataset was aggregated by SCIOPS and is hosted on the nasa_earthdata platform.
Point-based GIS data showing the locations of marinas, yacht clubs, boat yards, and related facilities along the Massachusetts coast. SCIOPS compiled the data in 2007 from public lists, databases, and visual inspection of orthoimagery. Each facility is represented as a point with associated attribute data; included facilities are those defined as catering to recreational yachtspersons.
A geospatial data layer detailing a proposed 16-mile, 24-inch diameter natural gas pipeline lateral in Massachusetts Bay. The dataset, created by SCIOPS, represents the project layout as of September 27, 2005, based on surveys using DGPS, multibeam, sidescan, and diver inspections. The surveys followed US Army Corps of Engineers standards (EM 1110-2-1003).
A geospatial line feature representing the western portion of a 46 kV electric supply cable from Hyannis, Cape Cod to Nantucket Harbor. The data were created by SCIOPS using geographic coordinates from a National Grid drawing dated February 2005, incorporating marine survey data from 1994.
Geospatial data from 1994 and 1996 showing the location of a 46 kV submarine electric supply cable between Harwich Port, Cape Cod and Nantucket Harbor, Massachusetts. The line feature was created by SCIOPS using coordinates from a 1996 National Grid drawing and incorporates data from a marine survey conducted in 1994.
A collection of characteristic FTIR peaks (cm⁻¹) for cigarette butts and beach sand, categorized by beach use zones. The data is provided in an XLSX file of 16.8 KB, authored by Claudia Díaz-Mendoza and last updated in March 2026.
Voicedesign3 is a Vietnamese text-to-speech dataset created by ShiniChien. The dataset is synthesized, meaning the audio was likely generated by a TTS model rather than recorded from human speakers. It was last updated on HuggingFace on April 24, 2026.
Mono Segments contains over 310,000 multi-instrumental MIDI files selected from the Discover MIDI Dataset. The dataset is enriched with lead monophonic melodies and high-precision structural segment labels, created by author asigalov61.
STOMA is a multi-speaker Greek speech corpus containing approximately 23 hours of studio-recorded read speech. It features audio from six native speakers (three male and three female), captured under controlled studio conditions to ensure high signal quality.
CoversBR is a large audio database focused predominantly on Brazilian music for cover and live song identification tasks. It comprises metadata and extracted features from 102,298 songs, organized into 26,366 cover groups, totaling approximately 7,070 hours of audio. The dataset is provided by Dirceu G Silva via AWS Open Data, but the original audio files are not included due to copyright restrictions.
Esri shapefiles from the National Park Service Land Resources Division detailing property ownership and interests. The data are intended for displaying NPS-owned lands and areas with scenic easements or rights of way. The dataset was last updated on March 4, 2026.
A conversational speech dataset of simulated mental health counselling sessions in Luganda, recorded in Uganda. It features dialogues between a Helper (counsellor) and a Seeker (client) discussing mental health topics. The dataset is designed for research in automatic speech recognition, speaker diarization, and speaker role classification for a low-resource African language.
Interview data from 405 adults aged 25 to 40 living in Newton, Massachusetts, gathered for a study on relative deprivation. The data, collected by Abt Associates of Cambridge, include demographic information, job details, domestic arrangements, attitudes toward women's work, and depression scale scores. The study was designed by Faye J. Crosby to compare housewives with employed men and women in high- and low-prestige occupations.
A corpus of orthographically transcribed broadband speech for Sesotho, one of South Africa's eleven official languages. It was created by researcher Febe de Wet and the NCHLT project, with transcriptions provided in XML format. The dataset was last updated in March 2026.
Orthographically transcribed broadband speech for Afrikaans, one of South Africa's eleven official languages. Transcriptions are provided in XML format. The corpus was authored by Febe de Wet and was last updated in March 2026.
Orthographically transcribed broadband speech is provided for each of South Africa's eleven official languages. Transcriptions are available in XML format. The corpus was authored by Laura Martinus and last updated in March 2026.
An emotional speech dataset hosted on Kaggle, containing acoustic, speaker, and emotion-based features for adaptive speech emotion analysis. The author, organization, and dataset size are not specified.