CommonVoice: A Crowdsourced Speech Recognition Dataset

Available on 1 platform

Description

CommonVoice is a dataset hosted on Kaggle. Its title suggests a crowdsourced speech and audio dataset, likely containing voice recordings. The available metadata does not specify the content, size, or collection details.

Use Cases

Train an automatic speech recognition model (inferred from domain, verify after download)
Benchmark speaker identification algorithms (inferred from domain, verify after download)
Develop voice-activated applications (inferred from domain, verify after download)

Strengths

Published on Kaggle, a major platform for data science resources.

Limitations

Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license are unknown, which makes suitability hard to assess before download.
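
Because the row count, file formats, and column names are not documented, a first inspection after download might look like the sketch below. It is only an illustration under stated assumptions: the function name `summarize_dataset` and the folder name in the usage note are hypothetical, and the assumption that any tabular metadata ships as `.csv` or `.tsv` files is a guess, not something the listing confirms.

```python
import csv
from collections import Counter
from pathlib import Path


def summarize_dataset(root):
    """Count files by extension and report header and row count for CSV/TSV files.

    Hypothetical helper: the layout of the downloaded dataset is unknown,
    so this only walks whatever files exist under `root`.
    """
    root = Path(root)
    ext_counts = Counter()
    tables = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        ext = path.suffix.lower() or "(no extension)"
        ext_counts[ext] += 1
        if ext in {".csv", ".tsv"}:
            # Assumption: .tsv files are tab-delimited, .csv files comma-delimited.
            delimiter = "\t" if ext == ".tsv" else ","
            with path.open(newline="", encoding="utf-8", errors="replace") as handle:
                reader = csv.reader(handle, delimiter=delimiter)
                header = next(reader, [])          # first record, or [] if the file is empty
                row_count = sum(1 for _ in reader)  # remaining records
            tables[str(path.relative_to(root))] = (header, row_count)
    return ext_counts, tables
```

Calling `summarize_dataset("commonvoice_download")` (a placeholder folder name) returns the file-type counts and, for each table found, its header and number of data rows. This covers file formats, row counts, and column names; the license still has to be checked on the hosting page.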

Related Topics

Tabular, Audio, Audio Data, Voice Corpus, Speech Recognition

Related Datasets

Quality Score

D15

Description: 5
Source: 17
Reputation: 18
Access: 31

Community

0 views

Dataset Info

Last synced: Jun 11, 2026
