Name: TTS Pretrain Clones 3M: 3 Million Synthesized English Voice-Clone Utterances
Creator: SynDataLab
Published: 2026-04-24T14:51:53
Keywords: Text To Speech, Speech Synthesis, Voice Cloning, Audio, Audio Generation, Synthetic

Description

2,967,779 clone utterances across 2,971 English speakers, generated by the echo-tts synthesizer. The dataset was created by SynDataLab and last updated on 2026-04-25. It contains WAV audio at 44.1 kHz, stored in Parquet files. Each speaker is represented by 10 voice-clone latents, and each latent is paired with 100 synthesized texts.

Use Cases

Pre-training text-to-speech models on large-scale, speaker-diverse synthetic audio.
Developing voice cloning systems from the speaker latents and their generated utterances.
Benchmarking speech synthesis quality across the structured set of 100 texts per latent.
Analyzing speaker identity consistency in synthetic speech across the multiple latents per speaker.

Strengths

Contains 2,967,779 audio utterances, providing a large-scale resource.
Covers 2,971 distinct English speakers, offering significant speaker diversity.
Audio is high-fidelity with a 44.1 kHz sample rate.
Generation process is structured, with 10 latents per speaker and 100 texts per latent.
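
The stated 44.1 kHz sample rate can be spot-checked once audio bytes have been pulled out of a Parquet row. The sketch below uses only the Python standard library and assumes the audio is PCM-encoded WAV, which the `wave` module can parse; the source does not state the encoding. The helper names `wav_info` and `make_silent_wav` are illustrative, and the silent clip is stand-in data, not dataset content.

```python
import io
import wave

EXPECTED_RATE_HZ = 44_100  # sample rate stated in the dataset description


def wav_info(wav_bytes: bytes) -> dict:
    """Read basic header fields from in-memory WAV bytes (PCM assumed)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": frames / rate,
        }


def make_silent_wav(seconds: float, rate: int = EXPECTED_RATE_HZ) -> bytes:
    """Build a mono 16-bit silent WAV, used here only as stand-in data."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


# Stand-in for the bytes of one utterance read from a Parquet row.
info = wav_info(make_silent_wav(0.5))
assert info["sample_rate"] == EXPECTED_RATE_HZ
```

If the files turn out to use a non-PCM encoding, `wave.open` will raise an error and a third-party audio reader would be needed instead.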

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Data is entirely synthetic, which may not fully capture the nuances of natural human speech.
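Description metadata indicates partial coverage for some speaker IDs: the 2,967,779 utterances are 3,221 fewer than the 2,971,000 that a complete grid of 2,971 speakers, 10 latents per speaker, and 100 texts per latent would contain.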
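
Because the columns are undocumented, finding which speakers are incomplete has to be done after download. A minimal sketch, assuming a speaker-identifier column can be located and read into a sequence: the function name `incomplete_speakers` and the toy IDs are hypothetical, and the expected count of 1,000 rows per speaker follows from 10 latents with 100 texts each.

```python
from collections import Counter

LATENTS_PER_SPEAKER = 10
TEXTS_PER_LATENT = 100
EXPECTED_PER_SPEAKER = LATENTS_PER_SPEAKER * TEXTS_PER_LATENT  # 1,000 rows


def incomplete_speakers(speaker_ids, expected=EXPECTED_PER_SPEAKER):
    """Map each speaker with fewer rows than expected to its missing count."""
    counts = Counter(speaker_ids)
    return {spk: expected - n for spk, n in counts.items() if n < expected}


# Toy stand-in for a speaker-ID column: one complete speaker, one partial.
toy_ids = ["spk_a"] * 1000 + ["spk_b"] * 997
missing = incomplete_speakers(toy_ids)
print(missing)
```

Speakers that are absent from the data entirely would not appear in the result, so the number of distinct IDs should be compared against 2,971 separately.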

Provenance

Source: SynDataLab via Hugging Face.
Collection Method: Generated by echo-tts, which synthesized English text conditioned on speaker latents derived from Qwen3-TTS VoiceDesign base speakers.
Time Range: Not specified
Freshness: Last updated 2026-04-25 17:29:36; freshness should be verified.
Geography: Not specified

License is unknown, which may restrict usage. A companion 'refs set' contains the first utterance of each speaker.
