A curated Russian speech dataset for advanced speech generation tasks. The corpus was filtered and annotated by the lab260 team at MTUCI using the BALALAIKA pipeline. It covers several genres: podcasts, public speech, YouTube content, audiobooks, phone calls, and TTS.
The license is listed as MPL-2.0; check the full description on the dataset page for the exact terms.