Name: EdAcc: 40 Hours of English Conversations with Diverse Accents
Creator: edinburghcstr
Published: 2024-02-21T14:25:13
Keywords: Accent Diversity, English Language, Audio, Natural Language Processing, Conversational Speech, Speech Recognition

Description

EdAcc (the Edinburgh International Accents of English Corpus) is an automatic speech recognition dataset comprising 40 hours of dyadic English conversations. It was created by edinburghcstr and includes speakers with a diverse set of first- and second-language English accents, along with linguistic background profiles. The dataset was last updated on February 22, 2024.

Use Cases

Benchmarking ASR model performance across diverse English accents in dyadic conversational speech.
Training accent-robust speech recognition systems on a wide range of first- and second-language English varieties.
Analyzing the impact of speaker background on ASR accuracy using the included linguistic profiles.
Studying conversational speech patterns across different English accents.
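
The benchmarking use case can be illustrated with a minimal sketch that computes word error rate (WER) separately for each accent group. The field names `accent`, `reference`, and `hypothesis` are assumptions made for this example, since the dataset's actual column names are not documented here, and the rows are toy data rather than EdAcc samples.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_group(rows, group_key="accent"):
    """Return {group: WER}, pooling errors and reference words per group."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for row in rows:
        # Hypothetical field names; adapt to the real schema after download.
        ref = row["reference"].lower().split()
        hyp = row["hypothesis"].lower().split()
        errors[row[group_key]] += edit_distance(ref, hyp)
        words[row[group_key]] += len(ref)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Toy rows invented for illustration only (not taken from EdAcc).
rows = [
    {"accent": "A", "reference": "the cat sat down", "hypothesis": "the cat sat"},
    {"accent": "A", "reference": "hello there", "hypothesis": "hello there"},
    {"accent": "B", "reference": "good morning", "hypothesis": "good mourning all"},
]
result = wer_by_group(rows)
for group, wer in sorted(result.items()):
    print(f"{group}: WER = {wer:.3f}")
```

Pooling errors and reference words within each group, rather than averaging per-utterance WER, keeps short utterances from dominating the group score. A real evaluation would also normalize punctuation and spelling before scoring.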

Strengths

40 hours of audio data provides a substantial corpus for analysis.
Includes a diverse set of English accents from both first- and second-language speakers.
Contains linguistic background profiles for each speaker, adding contextual metadata.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count and file formats are not stated, which makes suitability hard to assess before download.
Data may reflect geographic or demographic bias inherent to the collection method.
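
Because field semantics have to be inferred after download, a first pass over the metadata might look like the sketch below. It assumes the metadata has already been loaded into a list of dictionaries; the field names in the toy records are hypothetical and do not describe the real EdAcc schema.

```python
from collections import defaultdict


def summarize_fields(records):
    """Summarize observed types, missing counts, and one example per field."""
    all_keys = set()
    for rec in records:
        all_keys.update(rec)

    summary = defaultdict(lambda: {"types": set(), "missing": 0, "example": None})
    for rec in records:
        for key in all_keys:
            value = rec.get(key)
            entry = summary[key]
            if value is None:
                # Counts both absent keys and explicit None values.
                entry["missing"] += 1
            else:
                entry["types"].add(type(value).__name__)
                if entry["example"] is None:
                    entry["example"] = value
    return dict(summary)


# Toy records with hypothetical field names, for illustration only.
records = [
    {"speaker": "spk1", "first_language": "Spanish", "duration_s": 12.5},
    {"speaker": "spk2", "first_language": None, "duration_s": 8},
    {"speaker": "spk3", "duration_s": 3.2},
]
schema = summarize_fields(records)
for name in sorted(schema):
    info = schema[name]
    print(name, sorted(info["types"]), "missing:", info["missing"])
```

A summary like this shows which fields are sparsely populated or mix types, which is a useful check before relying on the linguistic background profiles for analysis.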

Provenance

Source: edinburghcstr
Collection Method: Likely recorded dyadic conversations between speakers.
Time Range: Not specified
Freshness: Last updated 2024-02-22 14:24:42.
Geography: Not specified
