Description
A resampled subset of the ICASSP 2022 DNS Challenge dataset containing clean speech, environmental noise, and room impulse responses. All audio has been resampled from 48 kHz to 16 kHz, stored as lossless FLAC, and packed into tar shards. The dataset was created by user 'richiejp' and was last updated on March 22, 2026.
Use Cases
Training speech denoising models on noisy mixtures built from the provided clean speech and environmental noise files.
Simulating realistic room acoustics by convolving audio with the included room impulse responses.
Benchmarking audio enhancement algorithms on data derived from the ICASSP 2022 DNS Challenge; because this is a resampled subset, results may not be directly comparable to official challenge results.
Preparing training data for machine learning models that require 16kHz audio input.
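The denoising and room-acoustics use cases above usually come down to two operations: convolving clean speech with an impulse response, and adding noise scaled to a chosen signal-to-noise ratio. The following is a minimal sketch of both, assuming the audio has already been decoded into lists of float samples at 16 kHz. The function names (rms, convolve, mix_at_snr) are hypothetical and are not part of the dataset or any library; a real pipeline would more likely use an array library than plain lists.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def convolve(signal, rir):
    """Direct-form convolution of signal with an impulse response.

    The output is truncated to len(signal) so it stays aligned with
    the dry signal (the reverb tail past the end is dropped).
    """
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        for k, h in enumerate(rir):
            if i + k >= len(signal):
                break
            out[i + k] += s * h
    return out


def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean speech at the requested SNR in dB.

    The noise is trimmed to the length of the clean signal and scaled
    so that 20 * log10(rms(clean) / rms(scaled_noise)) == snr_db.
    """
    if len(noise) < len(clean):
        raise ValueError("noise must be at least as long as clean")
    noise = noise[: len(clean)]
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent; cannot scale to a target SNR")
    scale = rms(clean) / (10 ** (snr_db / 20)) / noise_rms
    return [c + n * scale for c, n in zip(clean, noise)]
```

A training pair could then be formed as (mix_at_snr(convolve(clean, rir), noise, snr_db), clean), with snr_db drawn from whatever range the model is meant to handle.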
Strengths
Audio is stored as lossless FLAC, so encoding adds no loss beyond the 48 kHz to 16 kHz resampling, which itself removes content above 8 kHz.
Data is structured into separate shards for clean speech, noise, and impulse responses, facilitating organized access.
The dataset is derived from the established ICASSP 2022 DNS Challenge, so its content comes from a recognized speech enhancement benchmark source.
Limitations
The total number of audio files, their duration, and the dataset's total size are unknown from the provided metadata.
Column-level documentation and sample data are unavailable, requiring manual inspection after download to understand file attributes.
The license for use and redistribution is not specified in the provided metadata.
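Since the file count and total size are not documented, one way to recover them after download is to walk each tar shard and tally its members. The sketch below uses only the Python standard library; summarize_shard is a hypothetical helper, and the shard path passed to it is whatever file was actually downloaded (the real shard names are not given in the metadata). It reports sizes of the stored files, not audio duration; decoding the FLAC data to measure duration would need a third-party audio library, which is not shown here.

```python
import tarfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_shard(shard_path):
    """Return (file counts per extension, total bytes) for one tar shard."""
    counts = Counter()
    total_bytes = 0
    # Mode "r" lets tarfile detect compression (if any) automatically.
    with tarfile.open(shard_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other non-file entries
            suffix = PurePosixPath(member.name).suffix.lower()
            counts[suffix] += 1
            total_bytes += member.size
    return dict(counts), total_bytes
```

Summing the results over all shards gives the dataset-wide file count and on-disk size that the listing leaves unspecified.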
Provenance
Source
ICASSP 2022 DNS Challenge dataset
Collection Method
Resampled subset; audio files converted from 48 kHz to 16 kHz and packed into tar shards.
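Going from 48 kHz to 16 kHz is an integer factor of 3, so the conversion amounts to low-pass filtering below the new 8 kHz Nyquist limit and then keeping every third sample. The sketch below illustrates that idea with a Hamming-windowed sinc filter in plain Python. It is an illustration only: the metadata does not say which resampler the dataset author used, and the names lowpass_fir and decimate_by_3, the tap count, and the cutoff are all assumptions.

```python
import math


def lowpass_fir(num_taps, cutoff):
    """Hamming-windowed sinc low-pass filter with unity DC gain.

    cutoff is in cycles per sample (0 < cutoff < 0.5); num_taps must be odd.
    """
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]


def decimate_by_3(samples, num_taps=61):
    """Low-pass filter, then keep every third sample (48 kHz -> 16 kHz)."""
    # 0.15 cycles/sample is 7.2 kHz at 48 kHz, just under the 8 kHz
    # Nyquist limit of the 16 kHz output (assumed value for this sketch).
    taps = lowpass_fir(num_taps, cutoff=0.15)
    half = num_taps // 2
    out = []
    for i in range(0, len(samples), 3):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(samples):  # treat samples outside the input as zero
                acc += t * samples[j]
        out.append(acc)
    return out
```

The output has one third as many samples as the input, so clip durations are unchanged while anything above roughly 8 kHz in the original recordings is no longer present.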
Freshness
Last updated 2026-03-22 15:59:00
Data is packaged in multiple tar shards which must be extracted; the license terms are unknown and should be verified before use.