Available on 1 platform
A dataset of Egyptian Arabic text-audio pairs collected from YouTube videos by an automated pipeline. The dataset was created by OmarAhmedSobhy and was last updated on 2026-04-25. The pipeline uses forced alignment and automatic speech recognition (ASR) models to process the audio and text.
The license is unknown. Source code for the collection pipeline is referenced but not included in the dataset listing.
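
Since the pipeline code is not part of the listing, the following is only a rough sketch of how forced alignment and ASR output are commonly combined when building text-audio pairs: each aligned segment's text is compared with an ASR transcript of the same audio, and segments that disagree too much are dropped. The `Segment` structure, the `keep_consistent` helper, and the `max_wer` threshold are hypothetical names chosen for illustration and are not taken from the dataset or its pipeline.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One candidate text-audio pair (hypothetical structure, not from the dataset)."""
    start: float      # segment start time in seconds
    end: float        # segment end time in seconds
    reference: str    # text that was force-aligned to this audio span
    hypothesis: str   # text an ASR model produced for the same span


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        # Empty reference: perfect only if the hypothesis is empty too.
        return 0.0 if not hyp else 1.0
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def keep_consistent(segments, max_wer=0.3):
    """Keep segments whose aligned text and ASR output roughly agree.

    The 0.3 default is an arbitrary illustrative threshold.
    """
    return [s for s in segments
            if word_error_rate(s.reference, s.hypothesis) <= max_wer]
```

A filter of this kind trades dataset size for label quality: a lower `max_wer` keeps fewer but cleaner pairs. Whether the actual pipeline applies any such filter is not stated in the listing.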