DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

Raw Emocean: 15-Hour English Speech Dataset for TTS Training | DataSalon

Speech & Audio

Name: Raw Emocean: 15-Hour English Speech Dataset for TTS Training
Creator: somu9
Published: 2026-04-24T16:05:43
Keywords: Text To Speech, Machine Learning, Audio, Large Scale

Available on 1 platform

Description

Raw Emocean is a large-scale English speech dataset designed for training autoregressive text-to-speech models. It contains 8,649 audio segments totaling 15.39 hours, sourced from 22 videos, with an average segment duration of 6.4 seconds. The dataset was created by somu9 and last updated on Hugging Face in April 2026.
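
The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the numbers stated in the description; the variable names are illustrative and not part of the dataset.

```python
# Figures quoted in the description above.
TOTAL_HOURS = 15.39
NUM_SEGMENTS = 8_649
NUM_VIDEOS = 22

total_seconds = TOTAL_HOURS * 3600
avg_segment_seconds = total_seconds / NUM_SEGMENTS
avg_segments_per_video = NUM_SEGMENTS / NUM_VIDEOS
avg_minutes_per_video = total_seconds / 60 / NUM_VIDEOS

print(f"average segment duration: {avg_segment_seconds:.1f} s")     # 6.4 s
print(f"segments per source video: {avg_segments_per_video:.0f}")   # 393
print(f"audio per source video: {avg_minutes_per_video:.0f} min")   # 42 min
```

The computed average of about 6.4 seconds agrees with the stated value. The per-video figures are averages only, since the listing does not say how segments are distributed across the 22 videos.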

Use Cases

Training autoregressive text-to-speech models, the dataset's stated purpose.
Evaluating speech synthesis quality, using the reported signal-to-noise ratio (SNR) figures as a reference for recording quality.
Benchmarking TTS model performance on segments within a defined duration range (3.0–8.0 s).

Strengths

Contains 8,649 audio segments with a total duration of 15.39 hours.
Provides detailed audio specifications including a sample rate of 24,000 Hz, 16-bit depth, and an average SNR of 49.1 dB.
Segments have a controlled duration range of 3.0 to 8.0 seconds, which may be suitable for consistent model input.
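
The audio specifications listed above can be verified on downloaded files. The sketch below assumes the segments are stored as uncompressed WAV files, which the listing does not confirm; `check_segment` is a hypothetical helper name, and only the Python standard library `wave` module is used.

```python
import wave

EXPECTED_SAMPLE_RATE = 24_000        # Hz, from the listing
EXPECTED_SAMPLE_WIDTH = 2            # bytes per sample (16-bit), from the listing
MIN_SECONDS, MAX_SECONDS = 3.0, 8.0  # duration range from the listing


def check_segment(path):
    """Return a list of mismatches between a WAV file and the listed specs."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        duration = wav.getnframes() / rate
    if rate != EXPECTED_SAMPLE_RATE:
        problems.append(f"sample rate {rate} Hz")
    if width != EXPECTED_SAMPLE_WIDTH:
        problems.append(f"sample width {width * 8} bits")
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.2f} s")
    return problems
```

An empty list means the file matches the listed sample rate, bit depth, and duration range. If the files turn out to be in another container format, a different reader would be needed.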

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The source videos and the number and diversity of speakers are not described, so potential bias cannot be assessed from the listing.
Last updated 2026-04-24 16:07:20; freshness should be verified.
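
Because column-level documentation is absent, a quick way to infer field semantics after download is to summarize which fields appear and what value types they hold. This is a minimal sketch, assuming the records have already been loaded as Python dictionaries; `summarize_fields` and the sample field names are hypothetical and do not reflect the dataset's actual columns.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted list of Python type names seen."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical rows for illustration only; the real column names are undocumented.
sample = [
    {"text": "hello there", "duration": 4.2},
    {"text": "good morning", "duration": 6, "snr_db": 48.7},
]
print(summarize_fields(sample))
# {'text': ['str'], 'duration': ['float', 'int'], 'snr_db': ['float']}
```

Fields that show mixed types or appear in only some records are the ones most worth inspecting by hand.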

Provenance

Source: somu9
Collection Method: Likely extracted from 22 source videos.
Freshness: 2026-04-24 16:07:20

Quality Score

D38

Community

1 like

0 views

Dataset Info

Author: somu9
Created: Apr 24, 2026
Updated: Apr 24, 2026
Last synced: May 1, 2026
