Bolbosh: Kashmiri Text-to-Speech Corpus

Name: Bolbosh: Kashmiri Text-to-Speech Corpus
Creator: GAASH-Lab
Published: 2026-03-10T14:51:51
Keywords: Text To Speech, Indic Voices, Speech Corpus, Kashmiri Language, Multilingual, Audio, Natural Language Processing

Available on 1 platform

Description

A Text-to-Speech corpus for the Kashmiri language, derived from the IndicVoices-R and RASA speech datasets. It was created by GAASH-Lab and used to develop the Bolbosh neural TTS system, as documented in a 2026 paper.

Use Cases

Training Kashmiri speech synthesis models on the corpus
Benchmarking TTS systems for languages with specific orthographic challenges
Studying how speech data from multiple sources can be integrated, using the combination of IndicVoices-R and RASA as a case
Developing open-source neural TTS systems for low-resource languages

Strengths

Derived from two established speech data sources: IndicVoices-R and RASA
Specifically curated for the Kashmiri language, a low-resource language
Used to develop a documented, open-source neural TTS system (Bolbosh)

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is not reported, so corpus size and suitability cannot be assessed before download
Description metadata is limited; actual data quality requires manual inspection after download
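
Because neither the columns nor the row count are documented, a first inspection after download might look like the following minimal sketch. It assumes the corpus has been downloaded to a local folder and that any transcript or metadata table is a delimited text file; the actual layout and file formats of this dataset are not documented here, and the function names are hypothetical helpers, not part of any dataset tooling.

```python
import csv
from collections import Counter
from pathlib import Path


def count_files_by_extension(root):
    """Count files under `root`, grouped by lowercase file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "<none>".
            counts[path.suffix.lower() or "<none>"] += 1
    return dict(counts)


def describe_table(path, delimiter=","):
    """Return (column names, number of data rows) for a delimited text file.

    Assumes the first line is a header row, which may not hold for this corpus.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return header, row_count
```

For example, `count_files_by_extension("bolbosh/")` would show which audio and metadata formats are present, and `describe_table("bolbosh/metadata.csv")` would list the column names and row count; both paths are placeholders for wherever the files actually land.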

Provenance

Source: GAASH-Lab
Collection Method: Curated combination of Kashmiri speech data from IndicVoices-R and RASA datasets
Freshness: Last updated 2026-04-03 17:43:37; verify against the source before use
Geography: Kashmiri language region

Quality Score

Overall: D38
Description: 39
Source: 39
Reputation: 43
Access: 26

Community

Downloads: 167
Likes: 1
Views: 0

Dataset Info

Author: GAASH-Lab
Created: Mar 10, 2026
Updated: Apr 3, 2026
Last synced: May 12, 2026
