51,021 pre-computed latent representations for Urdu utterances, designed to bypass audio decoding during TTS model training. The latents are derived from the Humair332/Urdu-munch-1 audio source using the Aratako/Semantic-DACVAE-Japanese-32dim codec at a 25 Hz frame rate. Author zuhri025 uploaded this dataset to Hugging Face in April 2026.
License restrictions are unknown and should be verified before use.
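As a rough illustration of what the 25 Hz frame rate implies for sequence lengths, the sketch below converts between utterance duration and latent frame count. It does not load the dataset. The 32-dimensional latent width is an assumption inferred from the "32dim" suffix in the codec name, the rounding-up of a partial final frame is also an assumption, and all function names are hypothetical.

```python
import math

FRAME_RATE_HZ = 25  # frame rate stated in the dataset description
LATENT_DIM = 32     # assumption: inferred from "32dim" in the codec name


def latent_frames_to_seconds(num_frames: int, frame_rate_hz: int = FRAME_RATE_HZ) -> float:
    """Approximate audio duration covered by a latent sequence."""
    return num_frames / frame_rate_hz


def seconds_to_latent_frames(duration_s: float, frame_rate_hz: int = FRAME_RATE_HZ) -> int:
    """Approximate number of latent frames for a given duration.

    Assumption: a partial final frame is padded and counted (ceil).
    The real codec may round differently.
    """
    return math.ceil(duration_s * frame_rate_hz)


def latent_value_count(duration_s: float) -> int:
    """Number of scalar values in a (frames, LATENT_DIM) latent array."""
    return seconds_to_latent_frames(duration_s) * LATENT_DIM


if __name__ == "__main__":
    print(latent_frames_to_seconds(250))   # duration in seconds of 250 frames
    print(seconds_to_latent_frames(3.5))   # frames for a 3.5 s utterance
    print(latent_value_count(4.0))         # scalar values for a 4 s utterance
```

Under these assumptions, a batch sampler for TTS training could bucket utterances by frame count rather than by raw audio length.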