Arabic Speech Corpus

Name: Arabic Speech Corpus
Creator: halabi2016
Published: 2022-03-02T23:29:22
Keywords: Source datasets: original, Size categories: 1K<n<10K, Language: ar, Language creators: crowdsourced, License: CC BY 4.0, Region: us, Task categories: automatic speech recognition, Multilinguality: monolingual, Annotations creators: expert-generated

Description

The corpus is assembled from high-quality audio recordings in the South Levantine Arabic dialect, spoken with a Damascian accent. It was recorded in a professional studio and is provided in .flac format, a lossless compression format that reduces storage size without degrading the audio.

Use Cases

Develop high-quality Text-to-Speech (TTS) systems for the Damascian accent using the audio files referenced in the 'file' column.
Train speech recognition models to better handle South Levantine dialectal variations by processing the .flac audio samples.
Perform phonetic and linguistic analysis of Damascian Arabic by converting the 'file' paths into float32 speech arrays for signal processing.
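
The conversion from 'file' paths to float32 speech arrays mentioned above can be sketched as follows. This is a minimal illustration, not the dataset author's method: it assumes the third-party soundfile package is installed, and the helper names (load_speech, add_speech, rms_energy) and the 'speech' and 'sampling_rate' keys are hypothetical choices made for this example.

```python
import numpy as np


def load_speech(path):
    """Decode one audio file into a mono float32 array plus its sample rate."""
    # Assumption: the third-party `soundfile` package is available.
    # It is imported lazily so the other helpers work without it.
    import soundfile as sf

    data, sample_rate = sf.read(path, dtype="float32")
    if data.ndim > 1:
        # Average the channels if the recording is not mono.
        data = data.mean(axis=1)
    return data, sample_rate


def add_speech(row):
    """Return a copy of a row with the decoded audio added.

    `row` is any mapping with a 'file' key; the 'speech' and
    'sampling_rate' keys are names chosen for this sketch.
    """
    speech, sample_rate = load_speech(row["file"])
    return {**row, "speech": speech, "sampling_rate": sample_rate}


def rms_energy(speech):
    """Root-mean-square energy of a float32 signal, a simple first feature."""
    speech = np.asarray(speech, dtype=np.float32)
    return float(np.sqrt(np.mean(speech ** 2)))
```

A function shaped like add_speech could be applied row by row to whatever table holds the 'file' column, after which helpers such as rms_energy operate on the decoded arrays directly.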

Strengths

Recorded in a professional studio environment to ensure high-quality, natural voice output.
Focuses specifically on the South Levantine Arabic dialect with a Damascian accent.
Audio data is stored in .flac format to reduce storage requirements without loss of quality.
Includes a 'file' column containing paths to audio recordings for batch processing.
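
As a rough illustration of batch processing over the 'file' column, the sketch below groups paths into fixed-size batches using only the Python standard library. The function name batched_paths and the example file names are hypothetical and do not come from the dataset.

```python
from itertools import islice


def batched_paths(paths, batch_size):
    """Yield successive lists of at most `batch_size` paths."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(paths)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical file names standing in for values of the 'file' column.
example_paths = ["a.flac", "b.flac", "c.flac"]
batches = list(batched_paths(example_paths, 2))
```

Each batch can then be decoded and processed together, which keeps memory use bounded when the full set of recordings does not fit in memory at once.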

Quality Score

Overall: D (35)
Description: 37
Source: 36
Reputation: 40
Access: 22

Community

416 downloads
38 likes
0 views

Dataset Info

Author: halabi2016
Created: Mar 2, 2022
Updated: Aug 14, 2024
Last synced: Jun 8, 2026
