This dataset contains over 1,100 hours of Vietnamese speech collected from various social resources. It is published by NhutP and was last updated on April 25, 2025. The recordings cover accents from northern, central, and southern Vietnam, along with a range of dialects and speaking styles. This diversity is intended to support the training and evaluation of automatic speech recognition (ASR) models.
Use Cases
- Training Vietnamese ASR models on data with diverse accents and dialects.
- Evaluating ASR model robustness across the variety of speaking styles present.
- Benchmarking speech recognition accuracy across different Vietnamese regional accents.
- Fine-tuning pre-trained speech models for Vietnamese on the social voice data.
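The accent benchmarking use case can be sketched as a per-group word error rate (WER) computation. This is a minimal illustration, not a method prescribed by the dataset: the group labels ("north", "south") and the sample sentences are hypothetical, and the dataset is not confirmed to ship per-sample accent labels, so the grouping would have to come from whatever metadata is found after download.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Compute WER per group.

    samples: iterable of (group, reference_text, hypothesis_text).
    The group value is an assumed accent label, not a documented field.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, ref, hyp in samples:
        ref_tokens = ref.lower().split()
        hyp_tokens = hyp.lower().split()
        errors[group] += edit_distance(ref_tokens, hyp_tokens)
        words[group] += len(ref_tokens)
    # Skip groups with no reference words to avoid division by zero.
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical transcripts, for illustration only.
samples = [
    ("north", "xin chào các bạn", "xin chào các bạn"),
    ("south", "tôi đi học", "tôi đi"),
]
scores = wer_by_group(samples)
```

Whitespace tokenization treats each Vietnamese syllable as a word, which is a common but not universal convention for Vietnamese WER; a different tokenization would change the scores.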
Strengths
- Over 1,100 hours of speech data provides substantial volume for model training.
- Explicitly includes diverse accents (north, central, south), dialects, and speaking styles.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count and file formats are not documented, which makes it harder to assess suitability before download.
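Since column semantics have to be inferred after download, a small schema summary over a handful of records can be a reasonable first step. The sketch below assumes the records have already been loaded as Python dictionaries by some loader; the field names `audio` and `text` are hypothetical placeholders, not the dataset's documented columns.

```python
def summarize_fields(records):
    """Collect observed value types, one example, and a missing count per field."""
    summary = {}
    for record in records:
        for name, value in record.items():
            entry = summary.setdefault(
                name, {"types": set(), "example": None, "missing": 0}
            )
            if value is None:
                entry["missing"] += 1
                continue
            entry["types"].add(type(value).__name__)
            if entry["example"] is None:
                entry["example"] = value
    return summary


# Hypothetical records, for illustration only.
records = [
    {"audio": b"\x00\x01", "text": "xin chào"},
    {"audio": b"\x02\x03", "text": None},
]
summary = summarize_fields(records)
for name, info in summary.items():
    print(name, sorted(info["types"]), "missing:", info["missing"])
```

Running this over the first few hundred rows, rather than the full download, is usually enough to see which fields hold audio, transcripts, or other metadata.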
Provenance
- Source: NhutP on Hugging Face.
- Collection Method: Collected from a variety of social resources.
- Time Range: Not specified.
- Freshness: Last updated 2025-04-25 08:24:55; freshness should be verified.
- Geography: Vietnam, with representation from northern, central, and southern regions.