Chinese multi-channel conversational speech data with expert human annotations, developed by ASLP@NPU and QualiaLabs. It is part of the SmoothConv–DuplexConv corpus family, constructed from the same underlying conversational sources as the companion DuplexConv dataset.
Use Cases
- Benchmarking speech recognition models against high-fidelity human annotations
- Training speech enhancement systems on multi-channel conversational audio
- Developing conversational AI agents with annotated dialogue speech
- Studying speaker diarization on multi-channel conversational recordings
Strengths
- High-quality annotations provided by expert humans
- Part of a corpus family; the companion dataset, DuplexConv, contains 2,000 hours
Limitations
- Description metadata is limited; data quality must be checked by manual inspection after download
- Column-level documentation is absent; field semantics must be inferred after download
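Because field semantics are undocumented, a first pass after download is usually to tabulate which fields appear, what types they hold, and how often they are missing. The following is a minimal sketch of that inspection using only the Python standard library. The function name `summarize_fields` and the sample field names (`audio_path`, `text`, `num_channels`) are hypothetical placeholders; the dataset's real columns are not documented here and may differ.

```python
from collections import defaultdict


def summarize_fields(records, max_examples=2):
    """Summarize field names, value types, and missing counts.

    `records` is any iterable of dicts (one dict per dataset row).
    Returns {field: {"types": [...], "missing": int, "examples": [...]}}.
    """
    records = list(records)  # materialize so we can iterate twice

    # First pass: collect every field name seen in any record.
    all_keys = set()
    for rec in records:
        all_keys.update(rec)

    # Second pass: record types, missing counts, and a few example values.
    summary = defaultdict(lambda: {"types": set(), "missing": 0, "examples": []})
    for rec in records:
        for key in all_keys:
            info = summary[key]
            value = rec.get(key)
            if value is None:  # absent key or explicit None both count as missing
                info["missing"] += 1
                continue
            info["types"].add(type(value).__name__)
            if len(info["examples"]) < max_examples:
                info["examples"].append(value)

    return {
        key: {
            "types": sorted(info["types"]),
            "missing": info["missing"],
            "examples": info["examples"],
        }
        for key, info in summary.items()
    }


if __name__ == "__main__":
    # Hypothetical rows; replace with rows read from the downloaded files.
    sample = [
        {"audio_path": "a.wav", "text": "你好", "num_channels": 2},
        {"audio_path": "b.wav", "text": None},
    ]
    for field, info in sorted(summarize_fields(sample).items()):
        print(field, info)
```

Running this over a small sample of rows gives a quick view of which fields are consistently populated before committing to a full inspection.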
Provenance
- Source: ASLP@NPU and QualiaLabs
- Collection Method: Constructed from the same underlying conversational sources as the companion DuplexConv dataset, with expert human annotations
- Freshness: Last updated 2026-06-12 04:48:12; freshness should be verified
- Geography: Chinese language