Description
Chilean Spanish audio data consisting of 7 hours of transcribed, high-quality sentences recorded by 31 volunteers. The dataset was created by ylacombe and restructured from original OpenSLR archives for easier streaming. It was last updated on November 27, 2023.
Use Cases
Train a Text-To-Speech (TTS) model based on the high-quality Chilean Spanish audio recordings.
Develop Automatic Speech Recognition (ASR) systems based on the transcribed sentence data.
Benchmark speech synthesis quality for Chilean Spanish dialects based on the volunteer-recorded samples.
Study phonetic or prosodic features of Chilean Spanish based on the sentence-level audio.
Strengths
7 hours of transcribed audio provides a substantial base for model training.
The recordings are described as high quality, which suggests good signal fidelity.
Data from 31 volunteers may offer some speaker diversity.
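As a rough sanity check on how much audio each voice contributes, the figures above (7 hours, 31 volunteers) can be turned into a per-speaker average. This is only a sketch: it assumes the audio is split evenly across speakers, which the description does not state, and the function name is illustrative.

```python
# Figures taken from the dataset description.
TOTAL_HOURS = 7
NUM_SPEAKERS = 31


def average_minutes_per_speaker(total_hours, num_speakers):
    """Average audio per speaker in minutes, assuming an even split (an assumption)."""
    return total_hours * 60 / num_speakers


avg = average_minutes_per_speaker(TOTAL_HOURS, NUM_SPEAKERS)
print(f"{avg:.1f} minutes per speaker on average")  # prints "13.5 minutes per speaker on average"
```

Roughly 13.5 minutes per speaker is a small amount for single-speaker TTS, so the actual per-speaker distribution is worth checking after download.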
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Beyond the 7-hour duration, the dataset's size is unspecified, as are the demographic details of the 31 volunteers, which may obscure potential biases.
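Because column documentation and row count are missing, both have to be recovered from the data itself once it is downloaded. The sketch below shows one way to do that for records already loaded as Python dictionaries. The sample rows and their field names (text, speaker_id, duration_s) are hypothetical placeholders, not the dataset's actual schema.

```python
from collections import defaultdict


def summarize_records(records):
    """Return (row_count, {column: sorted list of value type names seen})."""
    types_seen = defaultdict(set)
    row_count = 0
    for row in records:
        row_count += 1
        for column, value in row.items():
            types_seen[column].add(type(value).__name__)
    return row_count, {col: sorted(names) for col, names in types_seen.items()}


# Hypothetical rows standing in for the downloaded data.
sample = [
    {"text": "hola", "speaker_id": 1, "duration_s": 2.5},
    {"text": "buenos días", "speaker_id": 2, "duration_s": 3.0},
]

rows, schema = summarize_records(sample)
print(rows)
for column, type_names in schema.items():
    print(column, type_names)
```

A column that reports more than one type name (for example both str and NoneType) is a hint that the field has missing values and needs closer inspection.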
Provenance
Source
OpenSLR, restructured by ylacombe.
Collection Method
Sentences recorded by 31 volunteers.
Time Range
Not specified.
Freshness
Last updated 2023-11-27 11:42:55; freshness should be verified.
Geography
Chile (Chilean Spanish)
License
Unknown; restrictions should be verified before use.