CoRal provides between 100,000 and 1,000,000 Danish audio recordings and transcriptions for Automatic Speech Recognition (ASR) tasks. Created by the CoRal-project and updated in early 2025, the collection includes both conversational and read-aloud speech samples across various dialects and age groups.
The dataset is licensed under OpenRAIL and stored in Parquet format, so it can be processed efficiently with any Parquet-compatible library, such as Polars, Dask, or Hugging Face Datasets.
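As a rough illustration of reading the Parquet-backed data, the sketch below streams a handful of records with the Hugging Face Datasets library and summarizes their fields. The repository id, configuration name, and split are assumptions that do not come from this description and should be checked against the hosting platform. Access may also require signing in first. Because the column names are not listed here, the helper only reports whatever fields it finds.

```python
from itertools import islice
from typing import Any, Dict, Iterator

# Assumed identifiers: none of these are stated in the description above.
DATASET_ID = "CoRal-project/coral"  # hypothetical repository id
SUBSET = "read_aloud"               # hypothetical configuration name
SPLIT = "train"                     # hypothetical split name


def iter_samples(dataset_id: str, subset: str, split: str, limit: int = 5) -> Iterator[Dict[str, Any]]:
    """Yield up to `limit` records without downloading the full dataset."""
    # Imported lazily so the rest of the module works without the library installed.
    from datasets import load_dataset

    # streaming=True returns an iterable that fetches records on demand.
    stream = load_dataset(dataset_id, subset, split=split, streaming=True)
    yield from islice(stream, limit)


def describe_record(record: Dict[str, Any]) -> Dict[str, str]:
    """Map each field name in a record to the name of its Python type."""
    return {name: type(value).__name__ for name, value in record.items()}


def preview(limit: int = 3) -> None:
    """Print the field layout of the first few records (needs network access)."""
    for record in iter_samples(DATASET_ID, SUBSET, SPLIT, limit=limit):
        print(describe_record(record))
```

Calling `preview()` prints one field summary per record. Streaming avoids pulling every Parquet shard to disk, which helps when a collection holds hundreds of thousands of audio files.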