Agri STT Benchmarking Dataset: Multilingual Agricultural Speech for ASR

Name: Agri STT Benchmarking Dataset: Multilingual Agricultural Speech for ASR
Creator: DigiGreen
Published: 2026-02-05T09:33:23
Keywords: Indian Languages, Benchmarking, Benchmark, Multilingual, Agriculture, Audio, Speech Recognition

Description

A domain-specific, multilingual agricultural speech dataset created by DigiGreen, with a primary focus on Hindi, Telugu, and Odia. It features human-annotated transcriptions and is intended for benchmarking ASR model performance in real-world agricultural scenarios. The dataset page was last updated on 2026-04-15.

Use Cases

Benchmarking ASR model performance on real-world agricultural audio
Training domain-specific speech-to-text models on agricultural advisory content
Evaluating multilingual ASR systems on Hindi, Telugu, and Odia
Improving speech recognition accuracy in noisy, real-world field scenarios
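
The benchmarking and multilingual evaluation use cases above can be sketched with a word error rate (WER) computation. The page does not say which metric the benchmark uses, so WER is an assumption here, chosen because it is a common ASR metric. The function names and the `(language, reference, hypothesis)` row layout are hypothetical and not taken from the dataset.

```python
from collections import defaultdict


def edit_distance(ref_tokens, hyp_tokens):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, ref_tok in enumerate(ref_tokens, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp_tokens, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def corpus_wer(pairs):
    """Total word-level edits divided by total reference words.

    `pairs` is an iterable of (reference, hypothesis) strings.
    Tokenization is plain whitespace splitting, which is an assumption;
    real evaluations usually normalize punctuation and casing first.
    """
    errors = 0
    words = 0
    for reference, hypothesis in pairs:
        ref_tokens = reference.split()
        errors += edit_distance(ref_tokens, hypothesis.split())
        words += len(ref_tokens)
    if words == 0:
        raise ValueError("no reference words to score against")
    return errors / words


def wer_by_language(rows):
    """Group (language, reference, hypothesis) rows and score each language."""
    grouped = defaultdict(list)
    for language, reference, hypothesis in rows:
        grouped[language].append((reference, hypothesis))
    return {language: corpus_wer(pairs) for language, pairs in grouped.items()}
```

Reporting one score per language, rather than a single pooled number, keeps a strong result in one language from masking a weak result in another.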

Strengths

Focuses on three specific Indian languages: Hindi, Telugu, and Odia
Contains human-annotated transcriptions for accuracy
Designed for real-world agricultural advisory scenarios
Benchmarks 10 ASR models using 10,934 audio samples

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count and total dataset size are not listed in the page metadata, which may limit suitability assessment
Freshness should be verified against the source; the last recorded update is 2026-04-15
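
Because column documentation and row counts are missing, a first pass after download might simply profile the metadata records. The sketch below assumes the metadata can be loaded as a list of dictionaries (for example from a CSV or JSON Lines file); the field names in the usage example (`audio_path`, `transcript`, `language`) are hypothetical placeholders, not the dataset's actual columns.

```python
from collections import Counter


def summarize_columns(records, max_examples=2):
    """Report row count and, per column, value types, missing count, and examples."""
    records = list(records)
    # Collect every key seen, since records may not share an identical schema.
    all_keys = set()
    for record in records:
        all_keys.update(record)

    columns = {
        key: {"types": Counter(), "missing": 0, "examples": []}
        for key in sorted(all_keys)
    }
    for record in records:
        for key, column in columns.items():
            value = record.get(key)
            if value is None or value == "":
                column["missing"] += 1
                continue
            column["types"][type(value).__name__] += 1
            if len(column["examples"]) < max_examples and value not in column["examples"]:
                column["examples"].append(value)
    return {"rows": len(records), "columns": columns}


# Usage with made-up records standing in for the downloaded metadata.
sample = [
    {"audio_path": "a.wav", "transcript": "x", "language": "hi"},
    {"audio_path": "b.wav", "transcript": "", "language": "te"},
]
report = summarize_columns(sample)
```

The example values per column are usually enough to guess field semantics, and the missing counts flag rows that lack a transcription.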

Provenance

Source: DigiGreen
Collection Method: Likely collected from real-world agricultural advisory interactions.
Freshness: Last updated 2026-04-15 10:00:22
Geography: Primarily India, based on the focus on Hindi, Telugu, and Odia languages.

License is unknown; check the dataset page for usage restrictions.

Quality Score

D40
Description: 42
Source: 41
Reputation: 42
Access: 26

Community

81 downloads
1 like
0 views

Dataset Info

Author: DigiGreen
Created: Feb 5, 2026
Updated: Apr 15, 2026
Last synced: May 25, 2026
