Name: Pashto Speech Recognition Dataset with Domain-Specific Utterances
Creator: Sabtain-Dev
Published: 2026-06-05T14:03:40
Keywords: Domain Specific, Healthcare, Audio, Speech Recognition, Pashto Language, Audio Transcripts

Description

A domain-specific Pashto automatic speech recognition (ASR) dataset covering five domains: agriculture, general topics, food services, health, and services. The data is organized by domain, with audio files and corresponding transcript CSV files. Created by Sabtain-Dev and last updated on June 5, 2026.

Use Cases

Train domain-specific Pashto speech recognition models on the agriculture, health, and services audio domains.
Benchmark ASR model performance on Pashto across the dataset's five topic domains.
Use as seed material when creating synthetic Pashto speech data for underrepresented domains such as food services.
Study acoustic characteristics of Pashto speech in professional and general contexts.
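
The benchmarking use case above can be illustrated with a per-domain word error rate (WER) calculation. This is a minimal sketch, not a method documented by the dataset: the function names edit_distance and wer_by_domain are hypothetical, the input is assumed to be (domain, reference, hypothesis) triples that you assemble yourself, and tokenization is plain whitespace splitting, which may need adjusting for Pashto text normalization.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            )
        prev = curr
    return prev[-1]


def wer_by_domain(samples):
    """Corpus-level WER per domain.

    samples: iterable of (domain, reference_text, hypothesis_text).
    WER is total word edits divided by total reference words in the domain.
    """
    edits = defaultdict(int)
    words = defaultdict(int)
    for domain, reference, hypothesis in samples:
        ref, hyp = reference.split(), hypothesis.split()
        edits[domain] += edit_distance(ref, hyp)
        words[domain] += len(ref)
    # Domains without any reference words are skipped to avoid division by zero.
    return {d: edits[d] / words[d] for d in words if words[d] > 0}
```

Aggregating edits and reference words per domain before dividing weights long utterances more heavily than short ones; averaging per-utterance WER is an alternative if each utterance should count equally.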

Strengths

Covers five distinct spoken domains: agriculture, general, food services, health, and services.
Provides a clear mapping structure where audio files correspond to rows in a transcript CSV.
Reportedly includes multiple audio formats, although the description above does not name them.
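
The audio-to-transcript mapping mentioned above could be loaded roughly as follows. This is a sketch under stated assumptions: the CSV file name transcripts.csv and the column names file_name and transcript are hypothetical placeholders, since the actual file and column names are not documented here and must be checked after download.

```python
import csv
from pathlib import Path


def load_pairs(domain_dir, csv_name="transcripts.csv",
               file_col="file_name", text_col="transcript"):
    """Pair each transcript row with its audio file inside one domain folder.

    csv_name, file_col and text_col are assumed names, not confirmed ones.
    Returns (pairs, missing): pairs is a list of (audio_path, text) tuples,
    missing lists the file names referenced in the CSV but absent on disk.
    """
    domain_dir = Path(domain_dir)
    pairs, missing = [], []
    with open(domain_dir / csv_name, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            audio_path = domain_dir / row[file_col]
            if audio_path.is_file():
                pairs.append((audio_path, row[text_col]))
            else:
                missing.append(row[file_col])
    return pairs, missing
```

Returning the missing file names separately makes it easy to spot rows whose audio did not download or whose paths follow a different convention than assumed.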

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Row count, column details, and license information are unknown, which makes it difficult to assess suitability before download.
Column-level documentation is absent; field semantics must be inferred after download.
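
Because row counts, columns, and audio formats are undocumented, a quick inventory of the downloaded folder is a reasonable first step. The sketch below assumes only that the dataset has been downloaded to a local directory containing audio files and UTF-8 CSV files; the function name summarize is hypothetical, and nothing here reflects a confirmed layout.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize(root):
    """Count files by extension and report the header and row count of each CSV.

    Returns (ext_counts, csv_info), where csv_info maps each CSV's path
    relative to root onto a (header_columns, data_row_count) tuple.
    """
    root = Path(root)
    ext_counts = Counter(
        p.suffix.lower() for p in root.rglob("*") if p.is_file()
    )
    csv_info = {}
    for csv_path in sorted(root.rglob("*.csv")):
        # UTF-8 is assumed; switch to "utf-8-sig" if a byte-order mark appears.
        with open(csv_path, newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            header = next(reader, [])
            n_rows = sum(1 for _ in reader)
        csv_info[csv_path.relative_to(root).as_posix()] = (header, n_rows)
    return ext_counts, csv_info
```

The extension counts show which audio formats are actually present, and the CSV headers give a starting point for inferring field semantics.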

Provenance

Source: Hugging Face
Collection Method: Likely collected and curated for ASR model training.
Time Range: Not specified
Freshness: Last updated 2026-06-05 15:08:42; freshness should be verified.
Geography: Not specified

License is unknown; users must verify permissions before use.
