STOMA is a multi-speaker Greek speech corpus containing approximately 23 hours of read speech from six native speakers (three male, three female), recorded under controlled studio conditions for high signal quality.
Use Cases
- Train a text-to-speech model on approximately 23 hours of studio-recorded Greek speech from six speakers.
- Develop multi-speaker voice synthesis systems using the three male and three female speaker recordings.
- Advance speech technology research for Greek, an under-resourced language, using high-quality studio recordings.
Strengths
- Approximately 23 hours of speech data provides a substantial audio corpus for model training.
- Six native speakers (three male, three female) offer gender-balanced multi-speaker coverage.
- Studio-recorded under controlled conditions with a dual-booth setup, ensuring high acoustic consistency and signal quality.
Limitations
- The corpus is limited to read speech, which may not capture spontaneous conversational speech patterns.
- With only six speakers, the dataset may offer too little speaker diversity for tasks that must generalize to unseen voices.
- The focus is exclusively on Greek, limiting its applicability to other languages.
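
One way to gauge the speaker-diversity limitation in practice is to check how the hours are spread across speakers and to hold one speaker out for evaluation. The sketch below is illustrative only: the record fields `speaker_id` and `duration_s`, the speaker labels, and both helper functions are hypothetical and are not taken from the corpus, whose actual metadata schema is not described here.

```python
from collections import defaultdict


def per_speaker_hours(records):
    """Sum clip durations (in seconds) per speaker and return hours."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec["speaker_id"]] += rec["duration_s"]
    return {spk: secs / 3600.0 for spk, secs in totals.items()}


def leave_one_speaker_out(records, held_out):
    """Split records so that one speaker appears only in the test set."""
    train = [r for r in records if r["speaker_id"] != held_out]
    test = [r for r in records if r["speaker_id"] == held_out]
    return train, test


# Toy records with hypothetical field names and speaker labels.
records = [
    {"speaker_id": "F1", "duration_s": 1800.0},
    {"speaker_id": "F1", "duration_s": 1800.0},
    {"speaker_id": "M1", "duration_s": 3600.0},
    {"speaker_id": "M2", "duration_s": 900.0},
]

hours = per_speaker_hours(records)
train, test = leave_one_speaker_out(records, "M1")
```

Repeating the split once per speaker gives six train/test folds, which is a common way to estimate how well a model handles a voice it has not seen when the speaker pool is this small.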
Provenance
- Source: aangelakis on Hugging Face
- Collection Method: Studio-recorded read speech captured under controlled conditions using a dual-booth setup.
- Time Range: Not specified
- Freshness: Not specified
- Geography: Greece (Greek language)