This dataset contains 25,706 simulated patient-physician conversations in English, focused on respiratory exams, with audio provided as 16 kHz WAV files. WhissleAI created it for training automatic speech recognition (ASR) models, and it was last updated on June 1, 2026. Annotations cover speaker changes, emotions, intents, and roles.
Use Cases
- Training ASR models to transcribe medical conversations from the audio and its annotations.
- Developing models that identify speaker roles (patient or physician) from the role annotations.
- Analyzing emotion and intent patterns in clinical dialogue using the speech annotations.
- Creating synthetic training data for healthcare NLP systems, modeled on the simulated interview structure.
Strengths
- 25,706 examples provide a substantial corpus for model training.
- Includes multiple annotation layers such as speaker changes, emotions, intents, and roles.
- Audio is provided in a standard 16 kHz WAV format.
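The 16 kHz claim can be spot-checked on any downloaded file before training. The following is a minimal sketch using only Python's standard `wave` module; the function names `wav_info` and `is_expected_rate` are illustrative rather than part of any dataset tooling, and the sketch assumes the files are plain PCM WAV, which the description does not confirm.

```python
import wave

# Sample rate stated in the dataset description.
EXPECTED_RATE_HZ = 16_000


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) read from a WAV header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        # Duration is the frame count divided by frames per second.
        duration = wav.getnframes() / rate if rate else 0.0
    return rate, channels, duration


def is_expected_rate(path, expected=EXPECTED_RATE_HZ):
    """True if the file's header reports the expected sample rate."""
    return wav_info(path)[0] == expected
```

Running `is_expected_rate` over a random sample of files, rather than the whole corpus, is usually enough to catch a mismatched export.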
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Data is simulated, which may not fully capture the nuances of real-world clinical interactions.
- The description metadata is limited; actual data quality requires manual inspection after download.
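Because column-level documentation is absent, a first pass after download might simply tabulate what each field contains. The sketch below assumes the records can be iterated as plain dictionaries; `summarize_fields` is a hypothetical helper, and the field names in the usage lines are invented for illustration, not taken from the dataset.

```python
from collections import defaultdict


def summarize_fields(records, max_examples=3):
    """Map each field name to its observed value types, missing count, and sample values."""
    records = list(records)
    # Collect every field name seen in any record, so absent keys count as missing.
    all_fields = set()
    for record in records:
        all_fields.update(record)

    summary = defaultdict(lambda: {"types": set(), "missing": 0, "examples": []})
    for record in records:
        for field in all_fields:
            value = record.get(field)
            info = summary[field]
            if value is None:
                info["missing"] += 1
                continue
            info["types"].add(type(value).__name__)
            # Store a truncated repr so large values (e.g. audio arrays) stay readable.
            example = repr(value)[:60]
            if len(info["examples"]) < max_examples and example not in info["examples"]:
                info["examples"].append(example)
    return dict(summary)


# Hypothetical rows; the real field names must be read from the downloaded data.
sample_rows = [
    {"text": "I have had a cough for a week.", "role": "patient"},
    {"text": "Any shortness of breath?", "role": None},
]
for name, info in sorted(summarize_fields(sample_rows).items()):
    print(name, sorted(info["types"]), "missing:", info["missing"])
```

A summary like this shows which fields carry the speaker, emotion, and intent annotations and how often they are empty, which is the information the dataset description leaves out.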
Provenance
- Source: WhissleAI via Hugging Face.
- Collection Method: Simulated medical interviews with a respiratory focus.
- Time Range: Not specified.
- Freshness: Last updated 2026-06-01 14:33:41; freshness should be verified.
- Geography: Not specified.