Available on 1 platform
Teochew-Wild is a dataset of 12,500 audio clips from 20 native Teochew speakers. It was compiled by 'panlr' from online sources such as news, storytelling, and TV programs, and is annotated with standard characters and pinyin. The dataset was last updated in April 2026.
A companion text-processing tool, 'pyPengIm', is mentioned for pinyin conversion and disambiguation, but its integration requirements are not specified. The license is unknown.