This dataset contains 5,358 audio tracks with synchronized lyrics and vocal notes, spanning multiple languages and genres. It provides hierarchical alignments at the paragraph, line, word, and character levels, paired with fundamental frequency (F0) information for the singing voice.
Use Cases
- Train automatic lyrics alignment systems using the word- and character-level timestamps
- Develop singing voice transcription models using the vocal note and F0 data
- Create karaoke-style visualization tools using the hierarchical paragraph- and line-level synchronization
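
As a rough illustration of the karaoke use case, the sketch below shows one way the four alignment levels could be nested and queried for the text active at a given playback time. The `Segment` class, its field names, and the helper functions are hypothetical and are not the dataset's actual schema; the sketch also assumes that segments within a level are sorted by start time and do not overlap.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One aligned unit (paragraph, line, word, or character).

    Hypothetical schema: the real annotation format may differ.
    """
    text: str
    start: float  # onset time in seconds
    end: float    # offset time in seconds
    children: List["Segment"] = field(default_factory=list)


def active_segment(segments: List[Segment], t: float) -> Optional[Segment]:
    """Return the segment whose [start, end) interval contains t, if any.

    Assumes `segments` is sorted by start time with no overlaps.
    """
    starts = [s.start for s in segments]
    i = bisect_right(starts, t) - 1  # last segment starting at or before t
    if i >= 0 and segments[i].start <= t < segments[i].end:
        return segments[i]
    return None


def active_path(paragraphs: List[Segment], t: float) -> List[str]:
    """Walk down the hierarchy, collecting the active text at each level."""
    path: List[str] = []
    level = paragraphs
    while level:
        seg = active_segment(level, t)
        if seg is None:
            break  # nothing is sung at this level at time t
        path.append(seg.text)
        level = seg.children
    return path
```

A karaoke display could call `active_path` once per video frame and highlight the last element of the returned list, falling back to the line-level text during gaps between words.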
Strengths
- 5,358 songs with time-aligned lyrics and vocal notes
- Hierarchical synchronization across four levels: paragraph, line, word, and character
- Includes fundamental frequency (F0) trajectories for the singing voice
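
To relate F0 trajectories to vocal notes, a common step is converting frequency in Hz to a MIDI pitch number with the standard formula `69 + 12 * log2(f / 440)`. The sketch below applies that formula and summarizes the frames of one note by their median pitch. The function names are illustrative, and treating non-positive F0 values as unvoiced frames is an assumption, not something the dataset description confirms.

```python
import math
from statistics import median
from typing import Iterable, Optional


def hz_to_midi(f0_hz: float) -> Optional[float]:
    """Convert a frequency in Hz to a (fractional) MIDI pitch number.

    Returns None for non-positive values, which this sketch assumes
    mark unvoiced frames.
    """
    if f0_hz <= 0:
        return None
    return 69.0 + 12.0 * math.log2(f0_hz / 440.0)


def note_pitch(f0_frames: Iterable[float]) -> Optional[int]:
    """Estimate one note's pitch as the rounded median of its voiced frames."""
    voiced = [m for m in map(hz_to_midi, f0_frames) if m is not None]
    if not voiced:
        return None  # no voiced frames in this note
    return round(median(voiced))
```

The median is used here rather than the mean so that brief pitch glides or octave errors at note boundaries have less influence on the estimate.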