Description
A dataset of conversational speech audio paired with transcripts and prompts. It contains turn-based dialogue data with columns for conversation identifiers, speaker agents, text prompts, transcripts, and audio files. The dataset was uploaded by ShiniChien to Hugging Face and last updated on 2026-05-15.
Use Cases
Training text-to-speech models based on the audio and transcript columns.
Analyzing conversational dynamics based on turn indices and agent roles.
Evaluating speech synthesis quality using the specified TTS voice and audio duration metadata.
Building multimodal dialogue agents using the paired prompts, transcripts, and audio.
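As a rough illustration of the turn-based analysis mentioned above, the following sketch groups rows by conversation_id, orders each conversation by turn_index, and counts turns per agent. The rows are invented sample data, not taken from the dataset, and the `transcript` key and the helper names `group_conversations` and `turns_per_agent` are assumptions made for illustration; the real column names should be checked after download.

```python
from collections import Counter, defaultdict

# Hypothetical rows: the values are made up, and "transcript" is an assumed
# column name. conversation_id, turn_index, and agent come from the card.
rows = [
    {"conversation_id": "c1", "turn_index": 1, "agent": "B", "transcript": "Hi there."},
    {"conversation_id": "c1", "turn_index": 0, "agent": "A", "transcript": "Hello!"},
    {"conversation_id": "c2", "turn_index": 0, "agent": "A", "transcript": "Good morning."},
    {"conversation_id": "c1", "turn_index": 2, "agent": "A", "transcript": "How are you?"},
]


def group_conversations(rows):
    """Group rows by conversation_id and sort each group by turn_index."""
    conversations = defaultdict(list)
    for row in rows:
        conversations[row["conversation_id"]].append(row)
    for turns in conversations.values():
        turns.sort(key=lambda r: r["turn_index"])
    return dict(conversations)


def turns_per_agent(turns):
    """Count how many turns each agent contributes to one conversation."""
    return Counter(t["agent"] for t in turns)


conversations = group_conversations(rows)
agent_counts = {cid: turns_per_agent(turns) for cid, turns in conversations.items()}
```

The same grouping would apply to rows loaded from the actual files, provided the column names match.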
Strengths
Includes structured metadata such as conversation_id, turn_index, and agent for each dialogue turn.
Contains multimodal data with linked audio files (WAV format) and text transcripts.
Provides system prompts used to generate each speaker's turn.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Row count is not reported, which makes it hard to judge whether the dataset is large enough for a given task.
Column-level documentation is absent; field semantics must be inferred after download.
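One manual inspection step that may be useful after download is checking that each WAV file's actual length agrees with the reported audio duration. The sketch below uses only the Python standard library and a synthetic silent clip in place of real data. The helper names (`wav_duration_seconds`, `make_silent_wav`, `duration_matches`), the 16 kHz sample rate, and the 0.05 second tolerance are all assumptions for illustration; how the dataset stores its duration field is not documented.

```python
import io
import wave


def wav_duration_seconds(wav_bytes):
    """Return the duration of a WAV payload as frames / sample rate."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def make_silent_wav(seconds, sample_rate=16000):
    """Build a mono 16-bit silent WAV in memory (stand-in for a real clip)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wf.setframerate(sample_rate)
        wf.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buf.getvalue()


def duration_matches(wav_bytes, reported_seconds, tolerance=0.05):
    """Check a reported duration against the decoded one (assumed tolerance)."""
    return abs(wav_duration_seconds(wav_bytes) - reported_seconds) <= tolerance


# Synthetic example: a 1.5 second clip checked against two reported values.
sample = make_silent_wav(1.5)
ok = duration_matches(sample, 1.5)
mismatch = duration_matches(sample, 2.0)
```

Rows where the check fails would be candidates for closer manual review rather than proof of a defect, since the reported value may be rounded or measured differently.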
Provenance
Source
Hugging Face
Collection Method
Likely generated with the Gemini Live AI system, as inferred from the description of the prompt column.
Freshness
Last updated 2026-05-15 09:43:25; check the source page for newer revisions before use.
License
License is unknown; terms of use must be verified before application.