TAPS (Throat and Acoustic Paired Speech) is a standardized corpus for deep learning-based speech enhancement of throat microphone recordings. It provides paired throat and acoustic microphone recordings from 60 native Korean speakers and is designed to address the high-frequency attenuation that throat microphones exhibit because skin and tissue act as a low-pass filter.
Use Cases
- Train speech enhancement models using paired throat and acoustic microphone recordings to suppress background noise.
- Develop deep learning models to compensate for high-frequency attenuation in throat microphone signals.
- Research the low-pass filtering effect of skin and tissue on speech signals using the paired recording structure.
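The high-frequency attenuation named in these use cases can be illustrated with a small synthetic sketch. It does not use TAPS data and does not model real tissue: a first-order low-pass filter stands in, as an assumption, for the skin and tissue path, and the sample rate, cutoff, and tone frequencies are arbitrary illustrative values. The function names (`one_pole_lowpass`, `tone_power`, `attenuation_db`) are hypothetical helpers, not part of any dataset tooling.

```python
import math


def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """Apply a first-order (RC-style) low-pass filter.

    Assumption: a crude stand-in for the damping of skin and tissue.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = dt / (rc + dt)  # smoothing coefficient in (0, 1)
    out, prev = [], 0.0
    for x in signal:
        prev = prev + alpha * (x - prev)
        out.append(prev)
    return out


def tone_power(signal, freq_hz, sample_rate):
    """Estimate the power at one frequency by correlating with cos and sin."""
    n = len(signal)
    step = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(step * i) for i, x in enumerate(signal))
    im = sum(x * math.sin(step * i) for i, x in enumerate(signal))
    return (re * re + im * im) / (n * n)


def attenuation_db(reference, degraded, freq_hz, sample_rate):
    """Loss of `degraded` relative to `reference` at freq_hz, in dB."""
    ref = tone_power(reference, freq_hz, sample_rate)
    deg = tone_power(degraded, freq_hz, sample_rate)
    return 10.0 * math.log10(ref / deg)


# Illustrative values only (assumptions, not TAPS recording parameters).
SAMPLE_RATE = 16000
LOW_HZ, HIGH_HZ = 200.0, 4000.0
NUM_SAMPLES = 1600  # 0.1 s, a whole number of cycles for both tones

# "Acoustic" reference: one low and one high tone of equal amplitude.
acoustic = [
    math.sin(2.0 * math.pi * LOW_HZ * i / SAMPLE_RATE)
    + math.sin(2.0 * math.pi * HIGH_HZ * i / SAMPLE_RATE)
    for i in range(NUM_SAMPLES)
]

# "Throat-like" signal: the same content after the stand-in low-pass filter.
throat_like = one_pole_lowpass(acoustic, cutoff_hz=1000.0, sample_rate=SAMPLE_RATE)

low_loss = attenuation_db(acoustic, throat_like, LOW_HZ, SAMPLE_RATE)
high_loss = attenuation_db(acoustic, throat_like, HIGH_HZ, SAMPLE_RATE)
print(f"loss at {LOW_HZ:.0f} Hz: {low_loss:.2f} dB")
print(f"loss at {HIGH_HZ:.0f} Hz: {high_loss:.2f} dB")
```

With real paired recordings, the same per-frequency comparison could be run on time-aligned throat and acoustic signals in place of the synthetic ones, although a spectrogram-based analysis would likely be more informative than two isolated tones.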
Strengths
- Contains paired recordings from 60 native Korean speakers, giving multi-speaker coverage for model training.
- Purpose-built for throat microphone speech enhancement: throat microphone recordings are paired with acoustic microphone recordings.
Limitations
- Covers only 60 native Korean speakers, so models trained on it may not generalize to other languages or accents.
- The focus on throat microphones means the data is specialized and may not be directly applicable to other audio enhancement tasks.
Provenance
- Source: yskim3271 on Hugging Face
- Collection Method: not specified
- Time Range: not specified
- Freshness: not specified
- Geography: South Korea