Description
DailyTalkEdit provides paired original and modified audio files from dialogues, with annotations for modified time ranges and semantic influence. The dataset, created by wsntxxn, was last updated on Hugging Face in February 2026. It includes separate audio segments for modified utterances and structured metadata files for training, validation, and testing splits.
Use Cases
Training speech editing models based on paired original and modified audio files.
Developing audio forensics tools to detect modified time ranges indicated by the 'fake_region' annotations.
Analyzing the semantic influence of audio edits using the provided text descriptions.
Benchmarking models for dialogue audio manipulation and synthesis.
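For the forensics use case, time-range annotations are often converted to frame-level labels before training or scoring a detector. The sketch below illustrates that idea under stated assumptions: it assumes the 'fake_region' annotation can be read as a list of (start, end) pairs in seconds, which the listing does not confirm, and the function names and the 20 ms frame hop are hypothetical choices for illustration.

```python
from typing import List, Sequence, Tuple


def regions_to_frame_labels(
    regions: Sequence[Tuple[float, float]],
    duration: float,
    frame_hop: float = 0.02,
) -> List[int]:
    """Label a frame 1 if its center lies inside any modified region, else 0.

    ASSUMPTION: regions are (start, end) pairs in seconds. The actual layout
    of the 'fake_region' field is undocumented and must be checked.
    """
    num_frames = int(round(duration / frame_hop))
    labels = [0] * num_frames
    for index in range(num_frames):
        center = (index + 0.5) * frame_hop
        if any(start <= center < end for start, end in regions):
            labels[index] = 1
    return labels


def frame_iou(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Intersection over union of two equal-length binary frame sequences."""
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    union = sum(1 for p, r in zip(predicted, reference) if p or r)
    # Two empty label sets agree perfectly, so report 1.0 in that case.
    return intersection / union if union else 1.0
```

A detector's predicted regions can be passed through the same conversion and compared with the annotated ones via `frame_iou`, which gives a simple localization score for the benchmarking use case.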
Strengths
Includes paired audio samples (original and modified) for direct comparison.
Provides semantic influence annotations describing the effect of edits.
Structured into standard train/val/test splits with JSONL metadata.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported, which makes it harder to assess suitability before download.
Description metadata is limited; actual data quality requires manual inspection after download.
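Since field semantics and row counts are not documented, a reasonable first step after download is to summarize one of the JSONL metadata files. The following is a minimal sketch using only the Python standard library. It assumes each non-empty line is a JSON object; the function name `summarize_jsonl` and the file name in the usage note are hypothetical.

```python
import json
from collections import Counter, defaultdict
from pathlib import Path


def summarize_jsonl(path):
    """Count rows and record which value types appear under each field.

    ASSUMPTION: every non-empty line holds one JSON object (a dict).
    """
    row_count = 0
    field_types = defaultdict(Counter)
    with Path(path).open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            row_count += 1
            for key, value in record.items():
                field_types[key][type(value).__name__] += 1
    return row_count, {key: dict(counts) for key, counts in field_types.items()}
```

Calling `summarize_jsonl("train.jsonl")` (a placeholder path; the real file names must be checked in the downloaded repository) returns the row count and a per-field tally of value types, which helps reveal which fields are always present and how annotations such as 'fake_region' are encoded.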
Provenance
Source
Hugging Face user wsntxxn
Freshness
Last updated 2026-02-04 14:24:22; check the current revision on Hugging Face before relying on this date.
License is unknown; verify the terms of use before using the dataset.