Arabic EXAMS-Redux: Corrected High School Exam Benchmark

Name: Arabic EXAMS-Redux: Corrected High School Exam Benchmark
Creator: inceptlabs
Published: 2026-04-30T16:21:18
Keywords: Text Repair, Benchmark, Education, Text, Multilingual, Exams

by inceptlabs, updated 1mo ago

Available on 1 platform

Description

This dataset is a corrected version of the Arabic subset of EXAMS, a multilingual high-school exam benchmark, and addresses widespread text corruption in the original. It repairs issues introduced by the original PDF extraction, such as split diacritics, fragmented words, and non-Arabic glyphs. It was created by inceptlabs and last updated on 2026-05-04.
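To make the three corruption types named above concrete, here is a minimal sketch of heuristics that flag them in a string. It is not the method inceptlabs used, which this page does not describe. The function names, the Unicode ranges chosen, and the rule that a standalone single Arabic letter indicates a fragmented word are all assumptions made for illustration.

```python
import unicodedata


def is_arabic_letter(ch: str) -> bool:
    # Basic Arabic letters (and tatweel) in U+0621..U+064A.
    # Assumption: extended letters used outside Modern Standard Arabic are ignored.
    return "\u0621" <= ch <= "\u064A"


def is_arabic_diacritic(ch: str) -> bool:
    # Combining marks such as tanwin, fatha, damma, kasra, shadda, sukun
    # (U+064B..U+065F) and the superscript alef (U+0670).
    return "\u064B" <= ch <= "\u065F" or ch == "\u0670"


def strip_diacritics(token: str) -> str:
    return "".join(ch for ch in token if not is_arabic_diacritic(ch))


def corruption_report(text: str) -> dict:
    """Count three heuristic signs of extraction damage in Arabic text."""
    split_diacritics = 0
    foreign_glyphs = 0
    prev = ""
    for ch in text:
        # A diacritic should directly follow a letter or another diacritic.
        if is_arabic_diacritic(ch) and not (
            is_arabic_letter(prev) or is_arabic_diacritic(prev)
        ):
            split_diacritics += 1
        # Non-ASCII letters whose Unicode name is not Arabic are treated as foreign.
        if (
            ch.isalpha()
            and not ch.isascii()
            and not unicodedata.name(ch, "").startswith("ARABIC")
        ):
            foreign_glyphs += 1
        prev = ch

    # Heuristic: a whitespace-delimited token that is a single Arabic letter
    # is likely a fragment of a longer word.
    isolated_letters = 0
    for token in text.split():
        bare = strip_diacritics(token)
        if len(bare) == 1 and is_arabic_letter(bare):
            isolated_letters += 1

    return {
        "split_diacritics": split_diacritics,
        "isolated_letters": isolated_letters,
        "foreign_glyphs": foreign_glyphs,
    }
```

Counts like these are only a rough screen. Legitimate content, such as single-letter answer labels or foreign-language terms in a question, will also trigger them, so flagged rows still need manual review.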

Use Cases

Benchmarking Arabic NLP models based on high-school exam questions.
Training models for educational question-answering using Modern Standard Arabic text.
Evaluating text repair and normalization techniques on corrupted Arabic text.
Studying the performance of multilingual models on Arabic-specific academic content.
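For the text-repair use case, one common way to score a repair method is character error rate (CER): the edit distance between the method's output and a reference string, divided by the reference length. The sketch below assumes that each corrupted string from the original extraction can be paired with its corrected counterpart from this dataset. The page does not confirm that such an alignment exists, and the function names are illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, using two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(
                min(
                    prev[j] + 1,         # delete ca
                    curr[j - 1] + 1,     # insert cb
                    prev[j - 1] + cost,  # substitute or match
                )
            )
        prev = curr
    return prev[-1]


def char_error_rate(hypothesis: str, reference: str) -> float:
    """Edit distance normalized by reference length (0.0 means identical)."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(hypothesis, reference) / len(reference)
```

Because Arabic diacritics are separate code points, CER computed this way counts a dropped or misplaced diacritic as one full character error. Whether that weighting is appropriate depends on the evaluation goal.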

Strengths

Specifically addresses and repairs corrupted Arabic text from PDF extraction.
Focuses on Modern Standard Arabic within a multilingual high-school exam benchmark.
Last updated on 2026-05-04, suggesting recent maintenance.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Data may reflect geographic or source bias inherent to the original EXAMS benchmark.
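Since the columns are undocumented and the row count is not published, a first step after download is to summarize what the records actually contain. The sketch below assumes the data has already been loaded as an iterable of dictionaries, one per row. How it is loaded depends on the hosting platform and is not shown, and the field names in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def summarize_records(records):
    """Return (row_count, per-field summary) for an iterable of dict rows.

    For each field, the summary lists the Python types observed and how many
    rows lack the field entirely.
    """
    type_counts = defaultdict(Counter)
    rows = 0
    for row in records:
        rows += 1
        for key, value in row.items():
            type_counts[key][type(value).__name__] += 1

    summary = {}
    for key, counts in type_counts.items():
        present = sum(counts.values())
        summary[key] = {"types": dict(counts), "missing": rows - present}
    return rows, summary
```

For example, calling `summarize_records` on two rows where only the first has an `answer` field would report `answer` as missing in one row, which is a quick way to spot optional or inconsistently populated columns before relying on them.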

Provenance

Source: inceptlabs
Collection Method: Corrected and text-repaired version of the OALL/Arabic_EXAMS dataset.
Freshness: Last updated 2026-05-04 12:54:23

License is unknown; users should verify terms before use.

Related Topics

Text, Multilingual, Text Repair, Benchmark, Education, Exams

Quality Score

Overall: D37
Description: 42
Source: 36
Reputation: 39
Access: 26

Community

10 downloads
1 like
0 views

Dataset Info

Author: inceptlabs
Created: Apr 30, 2026
Updated: May 4, 2026
Last synced: May 18, 2026
