Name: ATCO2-ASR-ATCOSIM: Air Traffic Control Speech for Automatic Speech Recognition
Creator: jlvdoorn
Published: 2023-06-14T13:08:14
Keywords: Audio, Speech Recognition

Description

A dataset combining the ATCO2-ASR and ATCOSIM collections, likely containing air traffic control speech audio. It was created by jlvdoorn and last updated on July 7, 2023. The files are split into 80% training and 20% validation partitions, and some files come with additional metadata.

Use Cases

Train automatic speech recognition (ASR) models on air traffic control communication audio.
Validate ASR model performance on the held-out 20% validation partition.
Analyze the specialized vocabulary and speech patterns of aviation communication.
Develop noise-robust speech processing techniques, assuming the audio reflects real-world radio transmission conditions.
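
Validation of an ASR model is commonly reported as word error rate (WER). The card does not name a metric, so the following is only a minimal sketch of one reasonable choice. The function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions made for illustration, not part of the dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumption: lowercase, whitespace-separated tokens are good enough here.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Averaging this value over the validation partition (or, more commonly, dividing total errors by total reference words) gives a single score for comparing models. Transcript normalization, such as how callsigns and digits are spelled out, can change the result noticeably and should be fixed before comparing runs.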

Strengths

Provides a defined training and validation split (80%/20%), so models can be evaluated on a consistent held-out set.
Combines two established sources (ATCO2-ASR and ATCOSIM), potentially increasing data diversity.
Some files include supplementary metadata in an 'info' file, which may add context.

Limitations

Descriptive metadata is limited; data quality can only be assessed by manual inspection after download.
Column-level documentation is absent; field names and semantics must be inferred from the files.
Row count and total size are not stated, which makes it hard to judge suitability for large-scale projects before download.
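
Since size and structure are undocumented, a first inspection after download might simply count files by type and total bytes. This is a rough sketch under assumptions: the helper name `summarize_download` is hypothetical, and nothing here relies on the dataset's actual file layout, which is unknown.

```python
from collections import Counter
from pathlib import Path


def summarize_download(root):
    """Return (files per extension, total size in bytes) for a directory tree."""
    counts = Counter()
    total_bytes = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Group files by lowercase extension; extensionless files get a label.
            counts[path.suffix.lower() or "(no extension)"] += 1
            total_bytes += path.stat().st_size
    return dict(counts), total_bytes
```

Calling `summarize_download` on the local download directory gives a quick sense of how many files of each type exist before any column-level inspection is attempted.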

Provenance

Source: Combination of ATCO2-ASR and ATCOSIM datasets.
Collection Method: Files were selected randomly to create the 80/20 train/validation split.
Time Range: Not specified
Freshness: Last updated 2023-07-07 07:06:05; check for newer versions before use.
Geography: Not specified
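
The random 80/20 file selection described under Collection Method can be illustrated as follows. This is a sketch only: the creator's actual procedure, seed, and file naming are not documented, and `split_files`, `train_fraction`, and `seed` are hypothetical names chosen for the example.

```python
import random


def split_files(file_names, train_fraction=0.8, seed=0):
    """Randomly split file names into (train, validation) lists."""
    # Sort first so the result depends only on the seed, not on input order.
    files = sorted(file_names)
    random.Random(seed).shuffle(files)
    cut = round(len(files) * train_fraction)
    return files[:cut], files[cut:]
```

Fixing the seed makes the split reproducible. Note that a purely random file-level split does not prevent the same speaker or recording session from appearing in both partitions, which is worth checking if the files carry such information.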

License is unknown; users must verify permissions before use.
