arg-spanish-tts is a unified, deduplicated speech corpus for Argentine Spanish (es-AR) containing 10,747 audio rows. The dataset was created by Kukedlc, who merged three public datasets and stripped cross-source duplicates. All audio is resampled to 24 kHz mono and totals 12.18 hours across 65 unique speakers.
Use Cases
- Pre-training multi-speaker TTS models on the unified 65-speaker corpus.
- Fine-tuning TTS models toward a target voice with the deduplicated Argentine Spanish audio.
- Benchmarking speech synthesis quality for the es-AR dialect against the corpus audio.
- Studying speaker characteristics and variation within Argentine Spanish across the merged sources.
Strengths
- Contains 10,747 audio rows, averaging roughly 4.1 seconds each.
- Totals 12.18 hours of audio, or roughly 11 minutes per speaker on average.
- Includes audio from 65 unique speakers, enabling multi-speaker modeling.
- Audio is uniformly resampled to 24 kHz mono, so no per-source sample-rate or channel handling is needed.
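To put the headline figures in perspective, the short sketch below derives per-row and per-speaker averages from the three totals quoted in this card. The function name `corpus_averages` is illustrative only. The results are plain means; the card does not say how audio is distributed across speakers, so individual speakers may have far more or far less material.

```python
# Totals quoted in this card.
ROWS = 10_747
HOURS = 12.18
SPEAKERS = 65


def corpus_averages(rows, hours, speakers):
    """Return simple means derived from the corpus totals."""
    total_seconds = hours * 3600
    return {
        "seconds_per_row": total_seconds / rows,
        "rows_per_speaker": rows / speakers,
        "minutes_per_speaker": hours * 60 / speakers,
    }


averages = corpus_averages(ROWS, HOURS, SPEAKERS)
for name, value in averages.items():
    print(f"{name}: {value:.2f}")
# seconds_per_row: 4.08
# rows_per_speaker: 165.34
# minutes_per_speaker: 11.24
```

Short clips of about four seconds are typical of sentence-level TTS data, but the actual duration distribution should be measured after download.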
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Description metadata is limited; actual data quality requires manual inspection after download.
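Because the columns are undocumented, a first step after download is to list the fields and the value types they hold. The sketch below is one possible way to do that, not a documented procedure for this dataset. `describe_columns` and `load_sample` are hypothetical helpers, the column names in `example_rows` are invented stand-ins, and the `train` split name is an assumption. `load_sample` relies on the Hugging Face `datasets` library in streaming mode and is defined but not called here, so the block runs offline.

```python
from collections import defaultdict


def describe_columns(rows):
    """Map each column name to the sorted Python type names seen in `rows`."""
    seen = defaultdict(set)
    for row in rows:
        for name, value in row.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


def load_sample(repo_id, n=5, split="train"):
    """Stream the first `n` rows without downloading the whole corpus.

    Requires `pip install datasets`. `repo_id` and `split` are assumptions
    to confirm on the dataset page.
    """
    from datasets import load_dataset

    stream = load_dataset(repo_id, split=split, streaming=True)
    return list(stream.take(n))


# Stand-in rows with invented column names, so the sketch runs offline.
example_rows = [
    {"audio": b"\x00\x01", "text": "Hola, ¿cómo andás?", "speaker_id": 3},
    {"audio": b"\x02\x03", "text": "Che, vení para acá.", "speaker_id": None},
]
schema = describe_columns(example_rows)
print(schema)
```

With real data, the same summary would come from `describe_columns(load_sample(repo_id))`, where `repo_id` is the dataset's Hugging Face identifier. A column that reports more than one type, such as the `None` mixed with integers above, is a cue to check for missing values.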
Provenance
- Source: Kukedlc on Hugging Face.
- Collection Method: Built by merging three public datasets and stripping cross-source duplicates.
- Freshness: Last updated 2026-05-28 04:16:58; freshness should be verified.
- Geography: Argentina (es-AR dialect).
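The card does not say how cross-source duplicates were detected. As a minimal sketch of the general merge-and-deduplicate idea, the example below keeps the first row for each distinct audio payload, using a SHA-256 digest of the raw bytes as the key. The matching criterion, the row layout, and the names `merge_deduplicated`, `set_a`, `set_b`, and `set_c` are all assumptions made for illustration, not the author's method.

```python
import hashlib


def merge_deduplicated(sources):
    """Concatenate sources, keeping the first row per distinct audio payload.

    `sources` maps a source name to a list of rows. Each row is a dict whose
    "audio" entry holds raw bytes (hypothetical layout).
    """
    seen = set()
    merged = []
    for source_name, rows in sources.items():
        for row in rows:
            digest = hashlib.sha256(row["audio"]).hexdigest()
            if digest in seen:
                continue  # identical bytes were already kept earlier
            seen.add(digest)
            merged.append({**row, "source": source_name})
    return merged


# Toy input: "clip-1" appears in two sources and should survive only once.
sources = {
    "set_a": [{"audio": b"clip-1", "text": "Buen día"}],
    "set_b": [
        {"audio": b"clip-1", "text": "Buen día"},
        {"audio": b"clip-2", "text": "¿Qué hacés?"},
    ],
    "set_c": [{"audio": b"clip-3", "text": "Dale, nos vemos"}],
}
merged = merge_deduplicated(sources)
print(len(merged), [row["source"] for row in merged])
# 3 ['set_a', 'set_b', 'set_c']
```

Byte-level hashing only catches exact copies. The same recording re-encoded or resampled differently would pass through, and this sketch also drops duplicates within a single source, which goes beyond the cross-source stripping the card describes.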