Description

A 30-hour voice dataset recorded by an Irish speaker named Jenny. The recordings cover newspaper headlines, YouTube video transcripts, excerpts from the books '1984' and 'Little Women', Wikipedia articles, recipes, Reddit comments, song lyrics, and transcripts from the show 'Friends'. Audio files are 48 kHz, 16-bit PCM, and the dataset was last updated on HuggingFace in January 2024.
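
For storage planning, the stated figures (30 hours, 48 kHz, 16-bit PCM) allow a rough estimate of the uncompressed audio size. The sketch below assumes mono audio, which the description does not confirm; the function name `pcm_size_bytes` is illustrative only.

```python
def pcm_size_bytes(hours, sample_rate_hz=48_000, bit_depth=16, channels=1):
    """Estimate the size of uncompressed PCM audio in bytes.

    channels=1 (mono) is an assumption; the dataset description
    does not state the channel count.
    """
    seconds = hours * 3600
    bytes_per_sample = bit_depth // 8
    return int(seconds * sample_rate_hz * bytes_per_sample * channels)


size = pcm_size_bytes(30)
print(f"Approximate uncompressed size: {size / 1e9:.2f} GB")
```

If the recordings turn out to be stereo, pass `channels=2`, which doubles the estimate. Container headers and any compression are not accounted for.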

Use Cases

Train a single-speaker text-to-speech model on roughly 30 hours of high-quality, varied audio.
Fine-tune a voice synthesis model toward an Irish accent using the single-speaker recordings.
Generate synthetic speech for the content types the recordings cover, such as books, articles, and scripts.
Benchmark TTS model performance across linguistic styles, from formal text to casual comments and lyrics.

Strengths

Approximately 30 hours of high-quality audio, providing substantial material for model training.
Audio is recorded at a professional 48kHz, 16-bit PCM specification.
Content is varied, including books, articles, scripts, and social media text, which may improve model generalization.
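
The stated audio specification can be checked on a downloaded file. The following is a minimal sketch that assumes the files are PCM WAV containers, which this listing does not confirm; it uses Python's standard `wave` module, and the helper names `read_wav_spec` and `matches_stated_spec` are illustrative.

```python
import wave


def read_wav_spec(path):
    """Read basic format fields from the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        return {
            "sample_rate_hz": sample_rate,
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "channels": wav.getnchannels(),
            "duration_s": wav.getnframes() / sample_rate,
        }


def matches_stated_spec(spec, sample_rate_hz=48_000, bit_depth=16):
    """Return True if the file matches the stated 48 kHz, 16-bit spec."""
    return (
        spec["sample_rate_hz"] == sample_rate_hz
        and spec["bit_depth"] == bit_depth
    )
```

Running `matches_stated_spec(read_wav_spec(path))` over a sample of files gives a quick check before committing to a full training run. If the files are stored in another container (for example FLAC), a different reader would be needed.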

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and license information are unknown, which may limit suitability assessment.
Data is from a single Irish speaker, which limits accent and voice diversity.
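
Because the columns are undocumented, one way to infer field semantics after download is to summarize the fields and value types seen in a handful of records. The sketch below works on plain dictionaries; the field names `file_name` and `transcription` in the sample rows are hypothetical placeholders, not the dataset's actual columns.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted type names seen across records."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical sample rows; the real column names are not documented.
sample = [
    {"file_name": "clip_0001.wav", "transcription": "Example sentence."},
    {"file_name": "clip_0002.wav", "transcription": None},
]
print(summarize_fields(sample))
```

A field that reports more than one type, such as a mix of `str` and `NoneType`, points to missing values that should be handled before training.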

Provenance

Source: HuggingFace user 'reach-vb'.
Collection Method: Voice recordings by a single speaker (Jenny) reading from various text sources.
Time Range: Not specified.
Freshness: Last updated 2024-01-09 14:11:57; freshness should be verified.
Geography: Speaker is Irish, suggesting a potential geographic association with Ireland.

License is unknown; users must verify permissions before use.

Audio Text To Speech Voice Dataset Irish Accent Speech Training Audio Synthesis

Jenny TTS Dataset: 30 Hours of Irish-Accented Speech for Synthesis
