Description
Professor's Fongbe Speech Dataset is a unified, high-quality collection of Fongbe speech data curated to preserve the linguistic integrity of this tonal language. It acts as a complete, unsegmented, and tone-accurate assembly of the Fongbe Continuous Speech Recognition corpora, merging the foundational ALFFA Project data from 2016 with an expanded Zenodo release from 2022. The dataset was last updated on the Hugging Face platform in February 2026.
Use Cases
Train automatic speech recognition models on Fongbe speech that is described as continuous and tone-accurate.
Conduct linguistic research on tonal languages, relying on the dataset's stated preservation of linguistic integrity.
Benchmark speech processing systems for low-resource languages against the unified collection of Fongbe data.
Develop language preservation tools from the curated Fongbe audio corpus, which is described as high-quality.
Strengths
Described as a unified, high-quality collection specifically curated for linguistic integrity.
Merges two established sources: the ALFFA Project (2016) and a Zenodo release (2022).
Explicitly designed to preserve tone accuracy for this tonal language.
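The tone-accuracy claim can be spot-checked on downloaded transcripts. The sketch below assumes tones are written as diacritics that decompose into Unicode combining marks (category Mn) under NFD normalization; that assumption is not confirmed by the dataset description. The function name and the sample strings are illustrative and are not taken from the dataset.

```python
import unicodedata

def count_combining_marks(text):
    """Count combining marks (Unicode category Mn) after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return sum(1 for ch in decomposed if unicodedata.category(ch) == "Mn")

# Illustrative strings only, not real dataset transcripts.
with_marks = "\u00e9 \u00e0"   # "é à": two precomposed accented letters
without_marks = "e a"          # same letters with the accents stripped

marks_kept = count_combining_marks(with_marks)
marks_lost = count_combining_marks(without_marks)
print(marks_kept, marks_lost)
```

A transcript column whose count is zero across many rows would suggest that tone marks were stripped somewhere in processing, although the check says nothing about whether the marks that remain are correct.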
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license information are unknown, which makes it hard to assess suitability before download.
Data may reflect geographic or source bias inherent to the original ALFFA and Zenodo collections.
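Because column-level documentation is absent, a first step after download is to list which fields exist and what value types they hold. The sketch below assumes the rows have already been loaded as a list of Python dictionaries. The helper `summarize_fields` and the field names in `sample_rows` are hypothetical, and the real columns may differ.

```python
from collections import defaultdict

def summarize_fields(records):
    """Map each field name to the sorted Python type names observed for it."""
    seen = defaultdict(set)
    for row in records:
        for key, value in row.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}

# Hypothetical rows; the actual column names are undocumented.
sample_rows = [
    {"audio_path": "clip_0001.wav", "text": "placeholder", "duration": 3.2},
    {"audio_path": "clip_0002.wav", "text": "placeholder", "duration": None},
]

summary = summarize_fields(sample_rows)
for field, types in summary.items():
    print(field, types)
```

A field that reports more than one type, such as a mix of `float` and `NoneType`, points to missing values that should be handled before training or benchmarking.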
Provenance
Source
Merges data from the ALFFA Project (2016) and a Zenodo release (2022), hosted by 'Professor' on Hugging Face.
Collection Method
Curated assembly of existing Fongbe Continuous Speech Recognition corpora.
Freshness
Last updated 2026-02-15 00:16:05; freshness should be verified.
License
License is unknown; users must verify permissions before use.