Earnings22 Baseline 5 Gram: 119 Hours of Accented Earnings Calls | DataSalon

Speech & Audio

Name: Earnings22 Baseline 5 Gram: 119 Hours of Accented Earnings Calls
Creator: anton-l
Published: 2022-09-17T15:31:55
Keywords: Benchmark, Text, Earnings Calls, Audio, Natural Language Processing, Accented Speech, Speech Recognition

Available on 1 platform

Description

A 119-hour corpus of English-language earnings calls collected from global companies. It was published on Hugging Face by anton-l in September 2022 and last updated in October 2022. Its primary purpose is to serve as a benchmark for automatic speech recognition (ASR) models on real-world accented speech.

Use Cases

Benchmark ASR model performance on real-world accented speech.
Train speech recognition models on domain-specific financial audio.
Evaluate model robustness across the speaker accents represented in the corpus.
Study linguistic patterns and vocabulary in corporate earnings communications.
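
ASR benchmarking of the kind listed above is commonly scored with word error rate (WER): the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The listing does not say which metric or text normalization this dataset's users apply, so the following is only a minimal sketch under that assumption. The function name, the lowercase-and-split normalization, and the example sentences are illustrative and not taken from the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Normalization here is only lowercasing and whitespace splitting,
    which is an assumption; real benchmarks often normalize more.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far (before the current one) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into nothing takes i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up transcripts, for illustration only.
    print(word_error_rate("revenue grew ten percent", "revenue grew percent"))
```

Because the count is divided by the reference length, a hypothesis with many inserted words can score above 1.0.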

Strengths

Corpus size is explicitly stated as 119 hours.
Focus on real-world accented speech provides a specific challenge for ASR.
Source is clearly identified as earnings calls from global companies.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Last updated 2022-10-17 18:35:04; freshness should be verified.
Row count and file formats are unknown, which may limit suitability assessment.
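
Since the file formats are not documented, one way to check them before a full download is to list the repository's files and tally their extensions. This is a rough sketch, not a documented workflow for this dataset: it assumes the optional `huggingface_hub` package is installed, and the repository ID is left as a parameter because the listing does not state it. The helper names are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def summarize_formats(filenames):
    """Count files per lowercased extension; files without one go under '(none)'."""
    counts = Counter()
    for name in filenames:
        suffix = PurePosixPath(name).suffix.lower()
        counts[suffix or "(none)"] += 1
    return dict(counts)


def list_hub_files(repo_id, repo_type="dataset"):
    """Return the file paths in a Hugging Face Hub repository.

    repo_id is supplied by the caller (it is not given in the listing);
    repo_type may need to be "model" rather than "dataset" for some repos.
    """
    # Imported lazily so summarize_formats works without the dependency.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id, repo_type=repo_type)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(summarize_formats(["README.md", "data/a.parquet", ".gitattributes"]))
```

Passing the output of `list_hub_files(...)` to `summarize_formats` gives a quick view of which formats the repository holds. Row counts would still require opening the files themselves.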

Provenance

Source: Hugging Face
Collection Method: Earnings calls collected from global companies.
Geography: Global

Quality Score

D28

Community

154 downloads

2 likes

0 views

Dataset Info

Author: anton-l
Created: Sep 17, 2022
Updated: Oct 17, 2022
Last synced: May 22, 2026
