tw-instruct-500k-2511 is the November 2025 version of a synthetic dialogue dataset for training Taiwanese Mandarin conversational models. It combines reference-based and reference-free generation methods to produce instruction-response pairs aligned with the Taiwanese context. The dataset was created by lianghsun and was last updated on HuggingFace in May 2026.
The license is unknown; verify the terms of use before using the dataset.