Description
A 2025 collection of audio datasets compiled by XiaomiMiMo for the MiMo-Audio-Eval toolkit. It covers three task areas: automatic speech recognition (AISHELL1, LibriSpeech), text-to-speech (SeedTTS), and audio understanding (MMAU, MMSU).
Use Cases
Benchmark automatic speech recognition models using the AISHELL1 and LibriSpeech datasets.
Evaluate text-to-speech synthesis systems with the SeedTTS dataset.
Test audio understanding and reasoning models on the MMAU and MMSU datasets.
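As an illustration of the first use case, ASR benchmarks are commonly scored by word error rate (WER) for English corpora such as LibriSpeech and by character error rate (CER) for Mandarin corpora such as AISHELL1. The sketch below is a minimal, dependency-free version of those two metrics. The function names are hypothetical, and it is not taken from the MiMo-Audio-Eval toolkit, whose own scoring and text normalization may differ.

```python
# Minimal sketch of WER/CER scoring. All names here are illustrative
# assumptions, not part of the MiMo-Audio-Eval toolkit.

def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Edit distance over whitespace-separated words, divided by reference length."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hypothesis.split()) / len(ref)


def char_error_rate(reference, hypothesis):
    """Edit distance over characters (spaces removed), divided by reference length."""
    ref = list(reference.replace(" ", ""))
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, list(hypothesis.replace(" ", ""))) / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
    print(char_error_rate("你好世界", "你好世"))
```

In practice, transcripts are usually normalized (case, punctuation, number formatting) before scoring, and the error rate is aggregated over the whole test set rather than averaged per utterance.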
Strengths
Compiled by XiaomiMiMo, a major technology organization.
Includes established benchmark datasets like AISHELL1 and LibriSpeech.
Updated in September 2025, indicating recent maintenance.
Limitations
Specific dataset sizes, row counts, and column structures are unknown.
The composition and balance of the included datasets are not detailed.
File formats and data accessibility details are unspecified.
Provenance
Source
XiaomiMiMo
Collection Method
Aggregation of existing audio datasets for the MiMo-Audio-Eval toolkit.
Freshness
Last updated on 2025-09-18.
Users should review the full description on the Hugging Face dataset page for details on included datasets and potential usage terms.