A collection of 5,000 hours of Bengali speech audio for automatic speech recognition (ASR), aggregated from nine public sources, including Common Voice and OpenSLR. The dataset, created by SKNahin and last updated in March 2024, includes a filtering column for identifying higher-quality audio segments based on word error rate (WER) and words-per-second metrics.
License information is unknown; users must verify permissible use and attribution requirements for the aggregated sources.
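The description above does not say how the filtering column is computed, so the following is only a minimal sketch of how a WER and words-per-second quality flag could work in principle. The function names, the idea of comparing the reference transcript against an ASR model's output, and all three thresholds (`max_wer`, `min_wps`, `max_wps`) are assumptions made for illustration, not the dataset's actual criteria.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def words_per_second(transcript: str, duration_s: float) -> float:
    """Speaking rate of a segment: transcript word count over audio duration."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return len(transcript.split()) / duration_s


def is_high_quality(
    reference: str,
    hypothesis: str,
    duration_s: float,
    max_wer: float = 0.3,   # hypothetical threshold
    min_wps: float = 0.5,   # hypothetical threshold
    max_wps: float = 4.0,   # hypothetical threshold
) -> bool:
    """Flag a segment whose transcript roughly matches an ASR hypothesis
    and whose speaking rate is plausible for the audio length."""
    wer = word_error_rate(reference, hypothesis)
    wps = words_per_second(reference, duration_s)
    return wer <= max_wer and min_wps <= wps <= max_wps
```

A high WER against a model's output can indicate a transcript that does not match the audio, while an implausible words-per-second value can indicate a misaligned or truncated segment. Whatever the real column is called, check its documented meaning before relying on it to subset the data.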