Name: CVQA: Culturally Diverse Multilingual Visual Question Answering Benchmark
Creator: afaji
Published: 2024-04-26T11:25:15
Keywords: Cultural Diversity, Benchmark, Multilingual, Visual Question Answering, Multimodal, Task category: question answering, Languages: zh, no, id, si, ga, ko, su, ja, min, es, ta, jv, pt, mn, ro, ms

Description

CVQA is a culturally diverse multilingual visual question answering benchmark consisting of over 10,000 questions from 39 country-language pairs. The dataset was constructed through a collaborative effort led by researchers from MBZUAI and is designed for use as a test set. It was last updated on November 27, 2024.

Use Cases

Benchmarking multilingual VQA model performance across 39 country-language pairs.
Evaluating cultural bias and diversity in AI models using culturally diverse questions.
Testing models on multimodal tasks that combine images with text questions; training use is limited because the dataset is designed as a test set.
Analyzing model performance across the dataset's 10 question categories.
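
The per-pair and per-category analyses listed above come down to grouping scored predictions and computing accuracy within each group. The following is a minimal sketch, not the benchmark's official evaluation code. The field names `pair`, `category`, `answer`, and `prediction` and the placeholder values are hypothetical, since the dataset's actual column names are not documented here.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key):
    """Return {group value: accuracy} over the given records.

    Each record is a dict with hypothetical fields 'answer' and
    'prediction', plus the field named by group_key.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        total[group] += 1
        if rec["prediction"] == rec["answer"]:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Placeholder records; real rows would come from the downloaded dataset
# joined with a model's predictions.
records = [
    {"pair": "pair_A", "category": "cat_1", "answer": 0, "prediction": 0},
    {"pair": "pair_A", "category": "cat_2", "answer": 1, "prediction": 3},
    {"pair": "pair_B", "category": "cat_1", "answer": 2, "prediction": 2},
]

by_pair = accuracy_by_group(records, "pair")
by_category = accuracy_by_group(records, "category")
```

Calling the same function with a different `group_key` gives the country-language view and the question-category view from one set of predictions.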

Strengths

Over 10,000 questions provide a substantial test set.
39 country-language pairs offer significant linguistic and cultural diversity.
Questions are categorized into 10 diverse categories for structured evaluation.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Exact row count is not stated beyond "over 10,000 questions", which may limit suitability assessment.
The dataset is designed as a test set, which may limit its utility for training.
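
Because column-level documentation is absent, a first step after download is usually to list which fields exist and what value types they hold. The helper below is a rough sketch under the assumption that rows are available as Python dicts. `summarize_fields` and the sample field names are hypothetical and do not reflect the dataset's real schema.

```python
def summarize_fields(rows):
    """Map each field name to the sorted list of Python type names seen for it."""
    seen = {}
    for row in rows:
        for name, value in row.items():
            seen.setdefault(name, set()).add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Placeholder rows standing in for a few downloaded records.
sample_rows = [
    {"field_a": "text", "field_b": 3},
    {"field_a": "more text", "field_b": None},
]

summary = summarize_fields(sample_rows)
```

A field that reports more than one type (for example both `int` and `NoneType`) is a hint that it contains missing values and needs handling before evaluation.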

Provenance

Source: MBZUAI researchers
Collection Method: Constructed through a collaborative effort.
Freshness: Last updated 2024-11-27 17:42:19; check the source for later revisions.
Geography: Covers 39 country-language pairs.
