Mozilla Common Voice Corpus 22.0 is a multilingual speech dataset of audio recordings paired with text transcriptions. This version is an unofficial conversion of the Mozilla project data, provided by fsicoli and updated in August 2025. It covers dozens of languages, including Arabic, Bengali, and Chinese.
Because this is an unofficial version of the Mozilla Common Voice Corpus 22.0, users should verify the license and data integrity against the official Mozilla release before use.
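
One possible way to approach the integrity check is to compare file digests. The sketch below is illustrative only: it assumes you have a reference SHA-256 digest for a file from a source you trust, and that the file is expected to be byte-identical to that reference. Neither assumption is confirmed by this listing, and a converted dataset may legitimately differ from the original archives. The function names `sha256_of_file` and `matches_reference` are hypothetical helpers, not part of any Common Voice tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large audio archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_reference(path, reference_digest):
    """Compare a local file's digest with a reference hex digest (assumed trusted)."""
    return sha256_of_file(path) == reference_digest.strip().lower()
```

A mismatch does not by itself show that the data is corrupted; it only shows that the local file is not byte-identical to whatever the reference digest was computed from. License terms still need to be checked separately against the official release.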