Description

This dataset provides high-quality conversational audio samples for Automatic Speech Recognition (ASR) tasks in Vietnamese, Korean, Arabic, and Filipino. It includes paired audio and transcripts of natural, non-scripted speech, covering both single-speaker and dual-speaker interactions. The audio is sampled at 16 kHz to 24 kHz with a 16-bit bit depth.

Use Cases

Train an Automatic Speech Recognition model on paired audio and transcripts for the four supported languages.
Fine-tune a speech model on non-scripted conversational speech to improve performance on natural dialogue.
Analyze acoustic features and transcription accuracy differences between single-speaker and dual-speaker interactions.
Benchmark ASR system performance across the Vietnamese, Korean, Arabic, and Filipino languages using high-fidelity audio.
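
For the benchmarking use case, a common way to compare ASR output against reference transcripts is word error rate (WER) or character error rate (CER). The following is a minimal sketch in plain Python, not a method prescribed by the dataset. The function names are illustrative, and the whitespace tokenization is an assumption that may suit Korean or Vietnamese less well than a language-specific tokenizer, which is why a character-level option is included.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences of tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Return WER (unit="word") or CER (unit="char") for one utterance."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    elif unit == "char":
        # Spaces are dropped so that spacing differences are not counted.
        ref = list(reference.replace(" ", ""))
        hyp = list(hypothesis.replace(" ", ""))
    else:
        raise ValueError("unit must be 'word' or 'char'")
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)
```

Reporting the rate separately for each language, and separately for single-speaker and dual-speaker recordings, keeps the comparison aligned with the use cases listed above.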

Strengths

Includes audio samples in four distinct languages: Vietnamese, Korean, Arabic, and Filipino.
Audio is high-fidelity with a sampling rate of 16 kHz to 24 kHz and a 16-bit bit depth.
Contains both single-speaker and dual-speaker conversational interactions.
Features natural, non-scripted conversational speech paired with transcripts.
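
Because the sampling rate varies between 16 kHz and 24 kHz, it can be worth checking each recording before training, since many speech models expect a single fixed rate. The sketch below uses Python's standard `wave` module and assumes the audio can be obtained as WAV data, which this description does not confirm. The helper names are hypothetical, and `make_silent_wav` exists only to produce a synthetic file for demonstration.

```python
import io
import wave


def read_wav_specs(fileobj):
    """Read basic header fields from a WAV file path or file-like object."""
    with wave.open(fileobj, "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / rate,
        }


def matches_card_specs(specs):
    """Check the specs stated in the description: 16-24 kHz, 16-bit."""
    return 16_000 <= specs["sample_rate_hz"] <= 24_000 and specs["bit_depth"] == 16


def make_silent_wav(sample_rate_hz, seconds=1):
    """Build an in-memory mono 16-bit WAV of silence (demo data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * sample_rate_hz * seconds)
    buffer.seek(0)
    return buffer


specs = read_wav_specs(make_silent_wav(24_000))
print(specs, matches_card_specs(specs))
```

Files that fall outside the stated range, or that differ in rate from the rest of a training batch, would need resampling with an audio library before use.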

Limitations

The total number of audio samples, dataset size, and specific row count are unknown.
Specific audio file formats, column structure, and detailed metadata are not provided.
Geographic origin and temporal coverage of the speech samples are unspecified.

Provenance

Source: humyn-labs on Hugging Face.
Collection Method: Curated collection of conversational audio samples.
Freshness: Last updated on 2026-03-13.

The full description and detailed specifications are available only on the linked Hugging Face dataset page. The page's tags list the license as CC BY 4.0; no further license terms are stated here.

Tags: optimized-parquet, parquet, library:polars, language:ar, library:dask, size_categories:n<1K, modality:text, multi-speaker, library:mlcroissant, task_categories:audio-classification, library:datasets, license:cc-by-4.0, region:us, single-speaker, task_categories:automatic-speech-recognition, natural-speech, speech-recognition, language:vi

High-Fidelity Conversational Speech in Four Asian Languages
