An open-source collection of 1,093 fully diacritized Arabic speech recordings, crowd-sourced from native speakers via Nahw.ai. The dataset contains audio recordings resampled to 16 kHz paired with their fully diacritized transcriptions. It was created by NahwAI and last updated on 2026-04-21.
The license is listed as CC-BY-4.0 in the summary table but as 'unknown' in the input field; users should verify the license terms on the dataset page before use.
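As a rough illustration of what "fully diacritized" means for the transcriptions, the sketch below strips the common Arabic diacritic marks (tashkeel, U+064B to U+0652, plus the superscript alef U+0670) to recover the undiacritized form of a string. The function names `strip_diacritics` and `has_diacritics` are hypothetical helpers, not part of the dataset or of any Nahw.ai tooling, and the sample word is not taken from the dataset.

```python
# Common Arabic diacritics: fathatan..sukun (U+064B-U+0652) and superscript alef (U+0670).
# Assumption: this set is enough for illustration; the dataset may use other marks too.
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text: str) -> str:
    """Return text with the diacritic marks listed in TASHKEEL removed."""
    return "".join(ch for ch in text if ch not in TASHKEEL)


def has_diacritics(text: str) -> bool:
    """Return True if text contains at least one mark from TASHKEEL."""
    return any(ch in TASHKEEL for ch in text)


# Sample word "kataba" written with three fatha marks (illustrative, not from the dataset).
diacritized = "\u0643\u064E\u062A\u064E\u0628\u064E"
plain = strip_diacritics(diacritized)

print(has_diacritics(diacritized))  # True
print(has_diacritics(plain))        # False
```

A helper like this could be used to build undiacritized targets from the transcriptions, or to spot rows whose text carries no diacritics at all.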