The dataset contains 900 hours of Finnish and 5,090 hours of Swedish speech data extracted from recordings of Nordic parliamentary proceedings. It was created by Aalto-Speech-Synthesis and announced as accepted for presentation at ICASSP 2026. The dataset page was last updated on February 18, 2026.
Use Cases
- Train text-to-speech models based on parliamentary speech recordings.
- Fine-tune speech synthesis systems for Finnish using domain-specific speech data.
- Fine-tune speech synthesis systems for Swedish using domain-specific speech data.
- Conduct linguistic or prosodic analysis of formal political speech.
- Develop multilingual TTS systems for Nordic languages.
Strengths
- Contains 900 hours of Finnish speech data.
- Contains 5,090 hours of Swedish speech data.
- Data is sourced from real-world parliamentary proceedings, likely providing formal and clear speech.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Data may reflect the geographic skew and formal register inherent to the parliamentary source.
- The full description and specifics require visiting an external page.
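Because the columns are undocumented, a first pass after download could simply list which fields appear and which value types they hold. The following is a minimal sketch under stated assumptions: `summarize_fields` is a hypothetical helper, and the field names in `sample` (`audio_path`, `text`, `duration`) are invented stand-ins, not the dataset's actual columns.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted list of Python type names seen for it."""
    seen = defaultdict(set)
    for record in records:
        for name, value in record.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Hypothetical stand-in rows; the real column names are not documented.
sample = [
    {"audio_path": "a.wav", "text": "Arvoisa puhemies", "duration": 3.2},
    {"audio_path": "b.wav", "text": "Herr talman", "duration": None},
]

for name, types in summarize_fields(sample).items():
    print(f"{name}: {', '.join(types)}")
```

Running the same helper over a few hundred real rows would show which fields are always present and which sometimes hold missing values, which is a reasonable starting point for inferring their semantics.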
Provenance
- Source
  - Aalto-Speech-Synthesis
- Collection Method
  - Extracted from recordings of Nordic parliamentary proceedings.
- Time Range
  - Not specified
- Freshness
  - Last updated 2026-02-18 10:01:37; freshness should be verified.
- Geography
  - Nordic region (Finland and Sweden)