Available on 1 platform
Ming030890's dataset contains Cantonese audio-caption pairs sourced from YouTube videos that have manually provided captions. To build it, the audio was re-transcribed with SenseVoice and the resulting segments were filtered, producing a collection intended to support automatic speech recognition (ASR) development. The dataset includes segments where the ASR output matches the original caption, as well as segments where the two differ in homophones or English words.
License information is unknown. The full description is hosted externally on Hugging Face; see the dataset page there for complete details.
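As a rough illustration of the kind of filtering described above, the sketch below compares a caption with an ASR transcript and sorts the pair into a bucket. It is not the dataset author's pipeline and does not call SenseVoice: the function names, the bucket labels, and the tiny `TOY_JYUTPING` pronunciation table are all hypothetical, and a real homophone check would need a full Cantonese pronunciation lexicon.

```python
import re

# Hypothetical toy pronunciation table (Jyutping). Illustration only:
# a real pipeline would need a complete Cantonese lexicon.
TOY_JYUTPING = {"係": "hai6", "系": "hai6", "佢": "keoi5", "好": "hou2"}


def normalize(text: str) -> str:
    """Lowercase and drop whitespace, punctuation, and underscores."""
    return re.sub(r"[\W_]+", "", text.lower())


def strip_ascii(text: str) -> str:
    """Remove Latin letters and digits, leaving the Chinese characters."""
    return re.sub(r"[a-z0-9]+", "", text)


def classify(caption: str, asr_text: str) -> str:
    """Assign a caption/ASR pair to one of four assumed buckets."""
    cap, asr = normalize(caption), normalize(asr_text)
    if cap == asr:
        return "match"

    # If the Chinese characters agree, the difference lies in English words.
    cap_zh, asr_zh = strip_ascii(cap), strip_ascii(asr)
    if cap_zh == asr_zh:
        return "english_diff"

    # Same length, and every differing character pair shares a pronunciation.
    if len(cap_zh) == len(asr_zh) and all(
        a == b or (a in TOY_JYUTPING and TOY_JYUTPING.get(a) == TOY_JYUTPING.get(b))
        for a, b in zip(cap_zh, asr_zh)
    ):
        return "homophone_diff"

    return "reject"
```

For example, `classify("佢系好人", "佢係好人")` falls into the homophone bucket under this toy table, while a pair whose Chinese characters differ in pronunciation is rejected.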