Magpie Speech Orpheus: 125K Synthetic Speech Samples

Name: Magpie Speech Orpheus: 125K Synthetic Speech Samples
Creator: Aratako
Published: 2025-08-12T15:58:05
Keywords: Text To Speech, Audio, Synthetic Speech, Audio Generation, Instruction Synthesis, Synthetic

Description

Magpie-Speech-Orpheus-125k is a synthetic speech dataset containing approximately 125,000 samples. It was created by applying the Magpie instruction-synthesis approach to the Orpheus-TTS text-to-speech model and decoding audio tokens with the SNAC 24 kHz codec. The dataset was authored by Aratako and last updated on August 26, 2025.
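
Because the audio is decoded with the SNAC 24 kHz codec, downstream code will usually want to treat waveforms as 24,000 samples per second. The following is a minimal sketch of that bookkeeping using only the Python standard library. It assumes decoded audio is available as mono float samples in [-1.0, 1.0], which this page does not confirm; the helper names `write_wav_mono16` and `wav_duration_seconds` are hypothetical, and a synthetic tone stands in for real speech.

```python
import io
import math
import struct
import wave

SAMPLE_RATE_HZ = 24_000  # output rate of the SNAC 24 kHz codec, per the description


def write_wav_mono16(samples, sample_rate=SAMPLE_RATE_HZ):
    """Encode float samples in [-1.0, 1.0] as 16-bit mono WAV bytes (hypothetical helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        # Clip to the valid range before scaling to signed 16-bit integers.
        pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
        wav.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return buf.getvalue()


def wav_duration_seconds(wav_bytes):
    """Read the WAV header back and return the clip duration in seconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Placeholder signal: half a second of a 440 Hz tone, NOT data from the dataset.
tone = [
    0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE_HZ)
    for n in range(SAMPLE_RATE_HZ // 2)
]
wav_bytes = write_wav_mono16(tone)
print(wav_duration_seconds(wav_bytes))
```

The same duration check can be run on real samples after download to confirm that the stored audio actually uses the 24 kHz rate.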

Use Cases

Training or fine-tuning text-to-speech models on the synthetic audio samples.
Benchmarking speech generation quality for audio produced with the Magpie instruction-synthesis method.
Researching audio token decoding with the SNAC 24 kHz codec.
Studying the characteristics of synthetic speech data generated by large language model-based TTS systems.

Strengths

Contains approximately 125,000 synthetic speech samples.
Audio was generated using a specified method involving the Orpheus-TTS model and SNAC codec.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The exact row count is not reported; only the approximate figure of 125,000 samples is given, which may limit suitability assessment.
The dataset is entirely synthetic, which may limit real-world applicability.
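
Since column-level documentation is absent, a first step after download is to inspect one record and note each field's type and size. The sketch below shows one way to do that in plain Python. The record and its field names (`text`, `audio`, `sampling_rate`) are invented for illustration and are not taken from the dataset, and `describe_record` is a hypothetical helper.

```python
def describe_record(record):
    """Map each field name to a short description of its Python type and size."""
    summary = {}
    for name, value in record.items():
        kind = type(value).__name__
        # Sized containers and strings also report their length.
        if isinstance(value, (str, bytes, list, tuple, dict)):
            summary[name] = f"{kind}(len={len(value)})"
        else:
            summary[name] = kind
    return summary


# Hypothetical record: the real column names must be read from the downloaded data.
sample = {"text": "Hello there.", "audio": b"\x00\x01" * 4, "sampling_rate": 24000}

for field, description in describe_record(sample).items():
    print(f"{field}: {description}")
```

Running this over a handful of records, rather than just one, helps catch fields that are sometimes empty or that change type.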

Provenance

Source: Aratako via Hugging Face.
Collection Method: Synthetic generation using the Magpie instruction-synthesis approach applied to the Orpheus-TTS model.
Time Range: not specified
Freshness: Last updated 2025-08-26 02:06:09; freshness should be verified.
Geography: not specified

Related Topics

Audio, Text To Speech, Synthetic Speech, Audio Generation, Instruction Synthesis, Synthetic

Quality Score

Overall: D (39)
Description: 42
Source: 36
Reputation: 50
Access: 26

Community

479 downloads
10 likes
0 views

Dataset Info

Author: Aratako
Created: Aug 12, 2025
Updated: Aug 26, 2025
Last synced: Apr 18, 2026
