Dolly-Audio: 1,000 Hours of Multi-Speaker Vietnamese Speech

Name: Dolly-Audio: 1,000 Hours of Multi-Speaker Vietnamese Speech
Creator: vuhoanhuy
Published: 2025-12-24T15:31:09
Keywords: Text To Speech, library:polars, library:dask, modality:audio, OPTIMIZED-PARQUET, modality:text, size_categories:100K<n<1M, library:mlcroissant, Vietnamese, library:datasets, Parquet, Audio, region:us, language:vi, Synthetic

Description

Dolly-Audio contains 1,000 hours of professionally cleaned Vietnamese speech audio featuring 152 speakers from various regions. Created by the Dolly AI Team and updated in December 2024, the corpus is designed to support speech synthesis and recognition research. It includes both audio recordings and corresponding text transcripts across multiple Vietnamese dialects.

Use Cases

Multi-speaker Text-to-Speech (TTS) synthesis using the audio modality and Vietnamese text transcripts
Automatic Speech Recognition (ASR) training across 152 unique speaker voices
Regional dialect classification using audio samples from different Vietnamese regions
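
For the multi-speaker use cases above, evaluation tends to be more meaningful when no speaker appears in both the training and the test data. The following is a minimal sketch of a speaker-disjoint split, assuming each row carries some speaker identifier. The `spk_000`-style IDs and the `assign_split` helper are hypothetical, since this page does not document the speaker column.

```python
import hashlib


def assign_split(speaker_id, val_pct=5, test_pct=5):
    """Map a speaker ID to 'train', 'validation', or 'test' deterministically."""
    digest = hashlib.sha256(speaker_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # stable bucket in 0..99
    if bucket < test_pct:
        return "test"
    if bucket < test_pct + val_pct:
        return "validation"
    return "train"


# Hypothetical IDs; the real identifiers are whatever the speaker column holds.
speakers = [f"spk_{i:03d}" for i in range(152)]
splits = {speaker: assign_split(speaker) for speaker in speakers}
```

Hashing the ID rather than shuffling keeps the assignment stable across runs and machines, so every clip from a given speaker lands in the same split. With only 152 speakers the split sizes will deviate from the nominal percentages, so it is worth checking the resulting counts before training.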

Strengths

1,000 hours of audio recordings
152 unique speakers
Coverage of multiple Vietnamese regional dialects
Professionally cleaned audio quality
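
The headline figures allow a rough sanity check. The sketch below derives the average hours per speaker and the range of mean clip lengths. It rests on two assumptions not confirmed by this page: that the size-category tag means between 100,000 and 1,000,000 rows, and that each row is one audio clip. The per-speaker figure is only an average; the actual distribution across speakers is not documented here.

```python
TOTAL_HOURS = 1_000
SPEAKERS = 152
ROWS_MIN, ROWS_MAX = 100_000, 1_000_000  # assumed reading of the size-category tag

total_seconds = TOTAL_HOURS * 3600
hours_per_speaker = TOTAL_HOURS / SPEAKERS

# Fewer rows means longer clips on average, and vice versa.
longest_mean_clip = total_seconds / ROWS_MIN
shortest_mean_clip = total_seconds / ROWS_MAX

print(f"{hours_per_speaker:.1f} h per speaker on average")
print(f"mean clip length between {shortest_mean_clip:.1f} s and {longest_mean_clip:.1f} s")
```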

Limitations

Metadata tags indicate the presence of synthetic data, which may affect naturalness in some applications
Lack of detailed column-level documentation for speaker metadata

Provenance

Source: Dolly AI Team (vuhoanhuy)
Collection Method: synthetic
Freshness: Last updated December 2024.
Geography: Vietnam

The dataset is distributed in optimized Parquet format and can be loaded and processed efficiently with libraries such as Hugging Face Datasets, Polars, or Dask.
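
Because the column layout is not documented here, a reasonable first step is to inspect a shard's schema before loading any audio. The following is a minimal sketch using Polars, assuming the Parquet files have already been downloaded to a local directory. The directory name `dolly_audio` and both helper functions are hypothetical.

```python
from pathlib import Path


def list_shards(root):
    """Return every .parquet file under root, sorted for a stable order."""
    return sorted(Path(root).rglob("*.parquet"))


def inspect_schema(shard):
    """Read only the Parquet metadata to get column names and dtypes."""
    import polars as pl  # deferred so list_shards works without Polars installed

    return pl.read_parquet_schema(shard)


if __name__ == "__main__":
    shards = list_shards("dolly_audio")  # hypothetical local download directory
    print(f"{len(shards)} shard(s) found")
    if shards:
        for name, dtype in inspect_schema(shards[0]).items():
            print(name, dtype)
```

Reading the schema touches only file metadata, so it stays fast even when the audio columns are large. Once the column names are known, the same shards can be scanned lazily and filtered before any audio bytes are read.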

Quality Score

Overall: D (40)
Description: 48
Source: 36
Reputation: 44
Access: 22

Community

404 downloads
1 like
0 views

Dataset Info

Author: vuhoanhuy
Created: Dec 24, 2025
Updated: Dec 24, 2025
Last synced: Apr 20, 2026
