VietSpeech: Over 1,100 Hours of Vietnamese Social Voice Data | DataSalon

Speech & Audio

Name: VietSpeech: Over 1,100 Hours of Vietnamese Social Voice Data
Creator: NhutP
Published: 2024-08-28T23:27:31
Keywords: Audio Data, Accent Diversity, Vietnamese, Audio, Speech Recognition

Available on 1 platform

Description

VietSpeech contains over 1,100 hours of Vietnamese speech collected from various social resources by its author, NhutP. It was last updated on April 25, 2025. The dataset covers accents from northern, central, and southern Vietnam, along with different dialects and speaking styles. This diversity is intended to improve the training and evaluation of automatic speech recognition (ASR) models.

Use Cases

Training Vietnamese ASR models on speech that covers diverse accents and dialects.
Evaluating ASR model robustness across the variety of speaking styles present.
Benchmarking speech recognition accuracy across different Vietnamese regional accents.
Fine-tuning pre-trained speech models for Vietnamese on the social voice data.
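
The accent benchmarking use case can be made concrete with a per-group word error rate (WER). The following is a minimal sketch, not part of the dataset or its tooling: the accent labels, the transcripts, and the function names `edit_distance` and `wer_by_group` are all hypothetical, and the listing does not state whether VietSpeech ships accent labels. Because written Vietnamese separates syllables with spaces, whitespace tokenization here effectively measures a syllable-level error rate.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Compute WER per group.

    samples: iterable of (group, reference, hypothesis) strings.
    Returns {group: total_errors / total_reference_tokens}.
    """
    errors = defaultdict(int)
    tokens = defaultdict(int)
    for group, ref, hyp in samples:
        ref_toks = ref.lower().split()
        hyp_toks = hyp.lower().split()
        errors[group] += edit_distance(ref_toks, hyp_toks)
        tokens[group] += len(ref_toks)
    # Skip groups with no reference tokens to avoid dividing by zero.
    return {g: errors[g] / tokens[g] for g in tokens if tokens[g] > 0}


# Hypothetical transcripts, made up for illustration only.
samples = [
    ("north", "xin chào các bạn", "xin chào các bạn"),
    ("south", "hôm nay trời đẹp", "hôm nay trời đep"),
    ("south", "cảm ơn bạn", "cảm ơn"),
]
print(wer_by_group(samples))
```

Pooling errors and reference tokens per group before dividing, rather than averaging per-utterance rates, keeps short utterances from dominating a group's score.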

Strengths

Over 1,100 hours of speech data provides substantial volume for model training.
Explicitly includes diverse accents (north, central, south), dialects, and speaking styles.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count and file formats are unknown, which may limit suitability assessment.
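
Since field semantics have to be inferred after download, a quick first step is to tabulate which fields appear and what value types they hold. The sketch below assumes the records are already available as an iterable of dictionaries (for example, from whatever loader is used after download); the helper name `summarize_fields` and the sample field names `audio` and `text` are hypothetical and are not taken from the dataset.

```python
from collections import defaultdict
from itertools import islice


def summarize_fields(records, limit=100):
    """Map each field name to the sorted type names seen in the first `limit` records."""
    seen = defaultdict(set)
    for record in islice(records, limit):
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical records; the real field names are undocumented.
records = [
    {"audio": b"\x00\x01", "text": "xin chào"},
    {"audio": b"\x02\x03", "text": None},
]
print(summarize_fields(records))
```

A field that reports more than one type, such as a mix of `str` and `NoneType`, is a hint that some rows lack a value and may need filtering before training.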

Provenance

Source: NhutP on Hugging Face.
Collection Method: Collected from a variety of social resources.
Time Range: Not specified.
Freshness: Last updated 2025-04-25 08:24:55; freshness should be verified.
Geography: Vietnam, with representation from northern, central, and southern regions.

Quality Score

C43

Community

521 downloads

30 likes

0 views

Dataset Info

Author: NhutP
Created: Aug 28, 2024
Updated: Apr 25, 2025
Last synced: Jun 7, 2026
