somu9 provides 20,141 pre-extracted audio codec token sequences for text-to-speech training, derived from reach-vb/jenny_tts_dataset. The collection covers 26.4 hours of audio, tokenized with the MOSS-Audio-Tokenizer-Nano codec at 48 kHz stereo and a frame rate of 12.5 Hz.
The license is unknown, which may restrict usage.
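As a rough sanity check on the figures above, the stated duration and frame rate imply a total codec frame count and an average clip length. The sketch below assumes that the 20,141 figure counts individual utterances; that reading is not confirmed by the listing. It counts frames, not tokens, because the number of codebooks per frame is not given. All variable names are illustrative.

```python
# Figures quoted in the dataset description.
NUM_ENTRIES = 20_141    # assumed to be the number of utterances
TOTAL_HOURS = 26.4      # total audio duration
FRAME_RATE_HZ = 12.5    # codec frames per second of audio

total_seconds = TOTAL_HOURS * 3600

# Total codec frames across the whole collection (rounded to a whole frame).
total_frames = round(total_seconds * FRAME_RATE_HZ)

# Averages per entry, under the assumption stated above.
mean_clip_seconds = total_seconds / NUM_ENTRIES
mean_frames_per_clip = total_frames / NUM_ENTRIES

print(f"total frames:        {total_frames:,}")
print(f"mean clip length:    {mean_clip_seconds:.2f} s")
print(f"mean frames per clip: {mean_frames_per_clip:.1f}")
```

If the codec emits several codebook indices per frame, the token count per clip would be the frame count multiplied by the number of codebooks.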