MMSU is a multimodal benchmark for spoken language understanding and reasoning. It comprises 47 sub-tasks spanning linguistic domains such as phonetics and prosody. The dataset was created by ddwang2000, is documented in arXiv:2506.04779, and contains between 1,000 and 10,000 records.
The data is stored in Parquet format and can be loaded efficiently with the datasets or polars library; see arXiv:2506.04779 for the task definitions.
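As a rough illustration of the loading options mentioned above, the following is a minimal sketch and not an official loader. The repository id `ddwang2000/MMSU` is an assumption pieced together from the creator and dataset names; the local Parquet path and the `task_name` field used in the usage note are hypothetical placeholders, so check the actual schema before relying on them.

```python
from collections import Counter

# Assumed repository id, built from the creator and dataset names above.
REPO_ID = "ddwang2000/MMSU"


def load_with_datasets(repo_id=REPO_ID, split=None):
    """Load the dataset via the `datasets` library (needs network access).

    With split=None this returns every available split.
    """
    from datasets import load_dataset  # imported lazily: optional dependency
    return load_dataset(repo_id, split=split)


def load_with_polars(parquet_path):
    """Read one locally downloaded Parquet file into a polars DataFrame.

    `parquet_path` is a hypothetical local path supplied by the caller.
    """
    import polars as pl  # imported lazily: optional dependency
    return pl.read_parquet(parquet_path)


def count_by_field(records, field):
    """Count records per value of `field` in an iterable of dict-like rows."""
    return Counter(row.get(field) for row in records)
```

A typical session might call `load_with_datasets()` to see which splits exist, then pass one split to `count_by_field(split, "task_name")` to see how records are distributed across sub-tasks. The field name here is only a guess and should be replaced with a column that is actually present.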