Description
The dataset contains 556,667 audio files totaling 1,024.71 hours of speech, with an average clip length of 6.63 seconds. It includes a per-speaker breakdown of clips; the top contributor, 'Despina', accounts for 60,150 clips and 11.5% of the total duration. It was uploaded to Hugging Face by 'setfunctionenvironment' and last updated on July 18, 2025.
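The headline figures can be cross-checked against each other with simple arithmetic. The sketch below uses only the numbers quoted in the description; the constant and function names are illustrative and not part of the dataset. Because the published figures are rounded, the derived values are approximate.

```python
# Figures quoted in the description (rounded as published).
NUM_CLIPS = 556_667
TOTAL_HOURS = 1_024.71
TOP_SPEAKER_CLIPS = 60_150
TOP_SPEAKER_DURATION_SHARE = 0.115  # 11.5% of total duration


def mean_clip_seconds(total_hours: float, num_clips: int) -> float:
    """Average clip length in seconds, from total hours and clip count."""
    return total_hours * 3600 / num_clips


# Average clip length across the whole dataset.
mean_seconds = mean_clip_seconds(TOTAL_HOURS, NUM_CLIPS)

# Share of clips (not duration) contributed by the top speaker.
top_clip_share = TOP_SPEAKER_CLIPS / NUM_CLIPS

# Approximate average clip length for the top speaker alone,
# derived from the rounded 11.5% duration share.
top_mean_seconds = mean_clip_seconds(
    TOTAL_HOURS * TOP_SPEAKER_DURATION_SHARE, TOP_SPEAKER_CLIPS
)

print(f"mean clip length: {mean_seconds:.2f} s")
print(f"top speaker share of clips: {top_clip_share:.1%}")
print(f"top speaker mean clip length: {top_mean_seconds:.2f} s")
```

The computed average matches the stated 6.63 seconds. The top speaker's share of clips comes out slightly below the 11.5% duration share, which suggests that speaker's clips run somewhat longer than the dataset average.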
Use Cases
Training automatic speech recognition (ASR) models on a large volume of short audio clips.
Developing speaker diarization or identification systems using the per-speaker breakdown and clip counts.
Benchmarking audio preprocessing pipelines against varied clip durations, from 0.41 to 44.97 seconds.
Analyzing speaker distribution and potential biases in speech data using the top-speaker statistics.
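As a rough illustration of the speaker-distribution analysis, the sketch below computes per-speaker clip counts and duration shares from (speaker, duration) pairs. The record layout, the function name speaker_shares, and the toy values are assumptions for illustration only; the dataset's actual column names are undocumented and would have to be mapped onto this shape after download.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def speaker_shares(
    records: Iterable[Tuple[str, float]],
) -> Dict[str, Tuple[int, float]]:
    """Map each speaker to (clip_count, share_of_total_duration).

    `records` is an iterable of (speaker, duration_seconds) pairs.
    This layout is hypothetical, not the dataset's documented schema.
    """
    counts: Dict[str, int] = defaultdict(int)
    seconds: Dict[str, float] = defaultdict(float)
    for speaker, duration in records:
        counts[speaker] += 1
        seconds[speaker] += duration

    total = sum(seconds.values())
    if total == 0:
        return {}  # no audio, so no meaningful shares
    return {s: (counts[s], seconds[s] / total) for s in counts}


# Toy records for demonstration; these are not values from the dataset.
toy_records = [
    ("speaker_a", 6.0),
    ("speaker_a", 4.0),
    ("speaker_b", 10.0),
]
shares = speaker_shares(toy_records)

# List speakers from largest to smallest duration share.
for speaker, (count, share) in sorted(
    shares.items(), key=lambda item: item[1][1], reverse=True
):
    print(f"{speaker}: {count} clips, {share:.1%} of duration")
```

Reporting clip count and duration share side by side makes it easier to spot speakers who dominate the data through a few long clips rather than through many short ones.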
Strengths
Large scale with over 556,000 audio files.
Substantial total duration of 1,024.71 hours.
Provides detailed speaker-level statistics, including clip counts and duration percentages.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
The source, collection method, and license are unknown, limiting reproducibility and use-case assessment.
Provenance
Source
Hugging Face
Freshness
Last updated 2025-07-18 00:27:04; freshness should be verified.
License is unknown, which may restrict commercial or research use.