DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

EmalonSpeech V0.1: High-Fidelity Speech Data for Low-Resource Languages | DataSalon

Speech & Audio

Name: EmalonSpeech V0.1: High-Fidelity Speech Data for Low-Resource Languages
Creator: DayanandaThokchom
Published: 2026-01-10T23:49:18
Keywords: Text To Speech, Audio, Single Speaker, Low Resource Language

Available on 1 platform

Description

EmalonSpeech V0.1 is a high-fidelity, single-speaker speech dataset designed for low-resource languages. It was created by Dayananda Thokchom of YAAI DYNAMICS, with speaker Helly Maisnam, and was released on Hugging Face in January 2026. The dataset aims to address the gap in TTS resources for languages underrepresented in current research.

Use Cases

Train text-to-speech models based on high-fidelity audio recordings.
Benchmark TTS model performance for underrepresented languages.
Develop speech synthesis tools for specific low-resource language communities.
Create educational or accessibility applications using synthesized speech.

Strengths

Dataset is explicitly designed for high-fidelity audio.
Focuses on low-resource languages, a stated research gap.
Provides a single-speaker corpus, which can simplify model training.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Description metadata is limited; actual data quality requires manual inspection after download.
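
Because the column names and row count are undocumented, a first pass after download could simply count rows and tally the value types seen in each field. The following is a minimal sketch under stated assumptions: `summarize_fields` is a hypothetical helper, and the sample rows (`text`, `audio_path`, `duration_s`) are invented placeholders, not the dataset's actual schema.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Iterable, Mapping


def summarize_fields(records: Iterable[Mapping[str, Any]]) -> Dict[str, Any]:
    """Count rows and tally the Python types observed in each field."""
    row_count = 0
    type_counts: Dict[str, Counter] = defaultdict(Counter)
    for record in records:
        row_count += 1
        for field, value in record.items():
            type_counts[field][type(value).__name__] += 1
    return {
        "rows": row_count,
        "fields": {name: dict(counts) for name, counts in type_counts.items()},
    }


# Hypothetical sample rows; the real column names are not documented.
sample = [
    {"text": "example sentence", "audio_path": "clip_0001.wav", "duration_s": 3.2},
    {"text": "another sentence", "audio_path": "clip_0002.wav", "duration_s": None},
]

summary = summarize_fields(sample)
print(summary["rows"])
print(summary["fields"])
```

Fields that show more than one type, or a `NoneType` count, are the ones worth inspecting by hand before training.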

Provenance

Source: Hugging Face, uploaded by DayanandaThokchom.
Freshness: Last updated 2026-01-11 00:21:30; check the Hugging Face source for later revisions.

Quality Score

D37

Community

24 downloads

1 like

0 views

Dataset Info

Author: DayanandaThokchom
Created: Jan 10, 2026
Updated: Jan 11, 2026
Last synced: May 13, 2026
