This corpus comprises 700 hours of Central Thai speech plus 40 hours for each of three other Thai dialects. Created by CMKL, it includes parallel sentences across the dialects to support speech and translation research. It was last updated in September 2024.
Parts of the corpus are included in the ML-SUPERB benchmark, so users who also train or evaluate on ML-SUPERB should check for overlapping data. The platform tags suggest a CC BY-SA 4.0 license, but the provided description does not confirm it.