Description
ArabicCulturalQA is a cross-dialectal Arabic cultural question-answering benchmark with parallel multiple-choice and open-ended formats. It covers six variants: Modern Standard Arabic, English, and the Egyptian, Levantine, Gulf, and Maghrebi dialects. The dataset accompanies an LREC 2026 paper and was created by QCRI.
Use Cases
Benchmarking Arabic language models on cultural knowledge using the multiple-choice questions.
Training models for open-ended cultural question answering using the open-ended questions.
Evaluating model performance across Arabic dialects using the parallel Modern Standard Arabic, Egyptian, Levantine, Gulf, and Maghrebi versions.
Studying cross-lingual transfer between English and Arabic for cultural QA tasks.
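The benchmarking and cross-dialect use cases both come down to scoring multiple-choice predictions separately for each variant. The sketch below shows one way to do that. Because the dataset's columns are undocumented, the field names `id`, `variant`, and `answer` are placeholders, and the toy records are invented for illustration rather than taken from the dataset.

```python
from collections import defaultdict


def accuracy_by_variant(records, predictions):
    """Return {variant: accuracy} for multiple-choice predictions.

    records: iterable of dicts with the hypothetical keys
        "id", "variant", and "answer" (the gold choice label).
    predictions: dict mapping a record id to the predicted choice label.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        variant = rec["variant"]
        total[variant] += 1
        # A missing prediction is counted as wrong.
        if predictions.get(rec["id"]) == rec["answer"]:
            correct[variant] += 1
    return {variant: correct[variant] / total[variant] for variant in total}


# Invented toy data; real field names must be checked after download.
toy_records = [
    {"id": 1, "variant": "MSA", "answer": "A"},
    {"id": 2, "variant": "MSA", "answer": "C"},
    {"id": 3, "variant": "Egyptian", "answer": "B"},
]
toy_predictions = {1: "A", 2: "B", 3: "B"}

scores = accuracy_by_variant(toy_records, toy_predictions)
print(scores)
```

Reporting one score per variant, rather than a single pooled accuracy, keeps the comparison across dialects visible.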
Strengths
The MCQ test set has been reviewed and post-edited by native speakers of each dialect.
Provides parallel data across six language/dialect variants: Modern Standard Arabic, English, Egyptian, Levantine, Gulf, and Maghrebi.
Includes both multiple-choice and open-ended question formats for comprehensive evaluation.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes it hard to judge whether the dataset is large enough for a given use.
Native-speaker post-editing of the open-ended question test set is described as ongoing.
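Since column documentation and row count are both missing, a first step after download is to inspect the records directly. The sketch below assumes the rows have already been loaded into a list of dictionaries. The file format is not stated, so loading is left out, and the field names in the toy rows are placeholders.

```python
def summarize_fields(rows):
    """Map each field name to its observed value types and non-empty count."""
    summary = {}
    for row in rows:
        for name, value in row.items():
            entry = summary.setdefault(name, {"types": set(), "non_empty": 0})
            entry["types"].add(type(value).__name__)
            # Treat None, empty strings, and empty lists as empty values.
            if value not in (None, "", []):
                entry["non_empty"] += 1
    return summary


# Invented toy rows; the dataset's real field names are undocumented.
toy_rows = [
    {"question": "Q1", "choices": ["A", "B"], "answer": "A"},
    {"question": "Q2", "choices": [], "answer": None},
]

summary = summarize_fields(toy_rows)
print("rows:", len(toy_rows))
for name, info in summary.items():
    print(name, sorted(info["types"]), info["non_empty"])
```

The row count and per-field types give a starting point for inferring field semantics. They do not replace the missing documentation.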
Provenance
Source
QCRI
Collection Method
Likely created as a research benchmark for an LREC 2026 paper.
Freshness
Last updated 2026-05-10 09:27:05; freshness should be verified.
License
License is unknown; terms of use must be verified before use.