Sub Reverb Asr Dataset 0.4 contains 45 audio samples organized across three subsets. The subsets are 'original', 'pointsource_noises', and 'real_rirs_isotropic_noises', each with 15 samples in a 'train' split. The dataset was created by sujalappa and was last updated on HuggingFace in March 2026.
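The layout described above can be checked after download. The following is a minimal sketch, not the author's loading code: it assumes the three subsets are exposed as HuggingFace configurations, and the repository ID is a placeholder because this card does not state it. Only the subset names, the 'train' split, and the counts come from the card.

```python
# Subset names and per-subset sample counts as stated in this card.
EXPECTED_COUNTS = {
    "original": 15,
    "pointsource_noises": 15,
    "real_rirs_isotropic_noises": 15,
}

# Placeholder repository ID (assumption): replace with the ID shown on the dataset page.
REPO_ID = "sujalappa/<dataset-name>"


def load_subsets(repo_id=REPO_ID):
    """Load the 'train' split of every subset (needs network access)."""
    # Imported lazily so the rest of the module works without `datasets` installed.
    from datasets import load_dataset

    return {name: load_dataset(repo_id, name, split="train") for name in EXPECTED_COUNTS}


def count_mismatches(actual_counts):
    """Return one message per subset whose size differs from the card's figures."""
    problems = []
    for name, expected in EXPECTED_COUNTS.items():
        actual = actual_counts.get(name)
        if actual != expected:
            problems.append(f"{name}: expected {expected}, got {actual}")
    return problems
```

With a real repository ID, `count_mismatches({name: len(ds) for name, ds in load_subsets().items()})` should return an empty list if the card's figures are still accurate.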
Use Cases
- Training ASR models on audio with simulated reverberation, using the 'original' subset.
- Testing ASR robustness against point-source noise interference, using the 'pointsource_noises' subset.
- Evaluating ASR performance with real room impulse responses and isotropic noise, using the 'real_rirs_isotropic_noises' subset.
- Benchmarking speech recognition algorithms under different acoustic distortion conditions.
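For the benchmarking use case, one common metric is word error rate (WER) computed separately for each subset. The sketch below is an illustration under assumptions, not part of the dataset: it uses plain whitespace tokenization, averages per-utterance WER, and expects reference and hypothesis transcripts to be supplied by the caller, since this card does not document which field holds the transcript.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def wer_by_condition(results):
    """Average per-utterance WER for each subset.

    `results` maps a subset name to a non-empty list of
    (reference, hypothesis) transcript pairs.
    """
    return {
        name: sum(word_error_rate(ref, hyp) for ref, hyp in pairs) / len(pairs)
        for name, pairs in results.items()
    }
```

Averaging per utterance weights short and long utterances equally; a corpus-level WER (total errors over total reference words) is an equally valid choice and may differ slightly.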
Strengths
- The dataset is split into three distinct subsets, which supports controlled experiments across acoustic conditions.
- Each subset contains 15 samples, so comparisons between subsets are balanced.
Limitations
- The dataset totals only 45 samples, which is very small for training machine learning models.
- Column-level documentation is absent; field semantics must be inferred after download.
- Description metadata is limited; actual data quality requires manual inspection after download.
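Because the fields are undocumented, a first inspection step after download might simply list each field and the type of its value. This is a minimal sketch; the field names in the usage note are hypothetical, since the card does not list the actual columns.

```python
def describe_fields(record):
    """Map each field name of one record to the Python type name of its value."""
    return {field: type(value).__name__ for field, value in record.items()}


def common_fields(records_by_subset):
    """Return the field names present in the sample record of every subset.

    `records_by_subset` maps a subset name to one sample record (a dict).
    """
    field_sets = [set(record) for record in records_by_subset.values()]
    return set.intersection(*field_sets) if field_sets else set()
```

For example, a record shaped like `{"audio": {...}, "text": "..."}` (hypothetical names) would be described as `{"audio": "dict", "text": "str"}`. Comparing `common_fields` across the three subsets shows whether they share a schema.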
Provenance
- Source: sujalappa (via HuggingFace)
- Freshness: Last updated 2026-03-05 05:25:38; freshness should be verified.