AlaminI created a bilingual Hausa-English STEM reasoning dataset containing 2,640 high-quality question-answer pairs. The dataset was translated from the STEM-Reasoning-Complex dataset using a culturally-embedded framework called 'Shehin Malamin Kimiyya'. It was last updated on March 7, 2026.
Use Cases
- Train or evaluate multilingual question-answering models on bilingual STEM content.
- Study cultural adaptation in AI systems using the 'Shehin Malamin Kimiyya' translation framework.
- Benchmark STEM reasoning capabilities in underrepresented languages like Hausa.
- Develop educational tools for Hausa-speaking learners that draw on culturally-embedded scientific explanations.
Strengths
- Contains 2,640 high-quality question-answer pairs.
- Uses a systematic translation framework for deep cultural adaptation.
- Is described as the first large-scale bilingual Hausa-English STEM reasoning dataset.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not confirmed in the metadata; the stated figure of 2,640 pairs should be verified after download before assessing suitability.
- Description metadata is limited; actual data quality requires manual inspection after download.
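Because column semantics and row count have to be checked by hand, a small inspection pass after download can help. The following is a minimal sketch, not the dataset author's tooling: the helper `summarize_records` and the column names in `sample` (`question_en`, `question_ha`, `answer`) are hypothetical placeholders, since the real field names are not documented here.

```python
from collections import Counter


def summarize_records(records):
    """Return the row count and, per column, observed value types and non-empty counts."""
    columns = {}
    for row in records:
        for name, value in row.items():
            info = columns.setdefault(name, {"types": Counter(), "non_empty": 0})
            # Track which Python types appear in this column.
            info["types"][type(value).__name__] += 1
            # Count values that are neither missing nor empty strings.
            if value not in (None, ""):
                info["non_empty"] += 1
    return {"rows": len(records), "columns": columns}


# Hypothetical sample rows; replace with the downloaded records.
sample = [
    {"question_en": "What is 2 + 2?", "question_ha": "", "answer": "4"},
    {"question_en": "Define force.", "question_ha": None, "answer": "F = ma"},
]

summary = summarize_records(sample)
print("rows:", summary["rows"])
for name, info in summary["columns"].items():
    print(name, dict(info["types"]), "non-empty:", info["non_empty"])
```

The function accepts any sequence of dict-like rows, so records loaded with the Hugging Face `datasets` library could likely be passed in the same way once the repository identifier (not given here) is known. Comparing the reported row count against the stated 2,640 pairs is a quick first sanity check.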
Provenance
- Source: Hugging Face
- Collection Method: Translated from the STEM-Reasoning-Complex dataset using a systematic cultural adaptation framework.
- Freshness: Last updated 2026-03-07 10:11:22; freshness should be verified.