Name: ViSpeR: Multilingual Audio-Visual Speech Recognition Dataset
Creator: tiiuae
Published: 2024-05-07T11:58:57
Keywords: Lip Reading, Audio Visual, Computer Vision, Multilingual, Audio, Large Scale, Speech Recognition, Multimodal

Description

ViSpeR is a large-scale dataset for Visual Speech Recognition (VSR) covering four widely spoken languages: Arabic, Chinese, French, and Spanish. It was created to address the scarcity of publicly available VSR data for non-English languages and is described as larger than other datasets in its domain. The dataset and models are hosted by the author 'tiiuae' and were last updated on April 17, 2025.

Use Cases

Train visual speech recognition models on multilingual audio-visual data.
Benchmark lip-reading performance across the four covered languages: Arabic, Chinese, French, and Spanish.
Develop multimodal AI systems that integrate visual cues with audio for speech understanding.
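
For the cross-language benchmarking use case, a per-language error rate is a common way to compare transcripts. The sketch below is illustrative only and is not taken from the dataset's own tooling: the function names, the language codes ("fr", "es", "zh"), and the choice to score Chinese at character level are all assumptions made for this example.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate_by_language(samples, char_level=("zh",)):
    """Aggregate error rate per language.

    samples: iterable of (language, reference, hypothesis) strings.
    char_level: hypothetical language codes scored per character
                instead of per whitespace-separated word.
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for lang, ref, hyp in samples:
        if lang in char_level:
            ref_tokens = list(ref.replace(" ", ""))
            hyp_tokens = list(hyp.replace(" ", ""))
        else:
            ref_tokens, hyp_tokens = ref.split(), hyp.split()
        errors[lang] += edit_distance(ref_tokens, hyp_tokens)
        totals[lang] += len(ref_tokens)
    # Skip languages with no reference tokens to avoid dividing by zero.
    return {lang: errors[lang] / totals[lang] for lang in totals if totals[lang]}
```

Calling `error_rate_by_language` with a list of (language, reference, hypothesis) triples returns a dictionary keyed by language code, so results for the four languages can be compared side by side.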

Strengths

Designed to address the scarcity of publicly available VSR data for non-English languages.
Covers four of the most widely spoken languages: Arabic, Chinese, French, and Spanish.
Reported to be larger than other datasets that cover similar languages.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count, file formats, and exact size are not documented, which makes suitability hard to assess before download.
License information is unknown, which may restrict usage.
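
Because file formats and sizes are not documented, a first step after download could be a simple inventory of the local copy. The sketch below uses only the Python standard library and assumes nothing about the dataset's actual layout; the `inventory` function and the directory path in the usage note are hypothetical.

```python
from collections import Counter
from pathlib import Path


def inventory(root):
    """Count files and total bytes per file extension under a local directory."""
    counts, sizes = Counter(), Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Files without a suffix are grouped under a placeholder key.
            ext = path.suffix.lower() or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: {"files": counts[ext], "bytes": sizes[ext]} for ext in sorted(counts)}
```

For example, `inventory("path/to/local/copy")` (a placeholder path) returns a mapping from extension to file count and byte total, which gives a rough picture of the formats and size involved before any field semantics are inferred.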

Provenance

Source: tiiuae
Collection Method: Not documented; the stated purpose is to address the scarcity of public VSR data for non-English languages.
Time Range: Not specified
Freshness: Last updated 2025-04-17 08:59:22; freshness should be verified.
Geography: Not specified. Languages covered: Arabic, Chinese, French, Spanish.

License is unknown, which may impose restrictions on commercial or research use.
