MURAD is an open Arabic lexical dataset containing 95,000 word-definition pairs. It was created by riotu-lab to support research in computational linguistics and Arabic natural language processing. The dataset covers multiple domains, including scientific, religious, and linguistic vocabulary.
Use Cases
- Train or evaluate Arabic language models on the word-definition pairs.
- Conduct lexicographic research that draws on the multi-domain vocabulary coverage.
- Support Arabic NLP tasks such as definition modeling or dictionary expansion using the structured word-definition data.
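As an illustration of the definition modeling use case, the sketch below turns word-definition rows into prompt/target examples. The field names `word` and `definition`, the Arabic prompt template, and the sample rows are all assumptions made for illustration; they are not taken from MURAD, whose column names are undocumented.

```python
def to_definition_examples(rows, word_key="word", definition_key="definition"):
    """Turn word-definition rows into prompt/target examples.

    `word_key` and `definition_key` are hypothetical field names; replace
    them with the actual column names once the dataset has been inspected.
    """
    examples = []
    for row in rows:
        word = (row.get(word_key) or "").strip()
        definition = (row.get(definition_key) or "").strip()
        if not word or not definition:
            continue  # skip rows where either side of the pair is missing
        examples.append({
            "prompt": f"عرّف الكلمة: {word}",  # "Define the word: ..." (assumed template)
            "target": definition,
        })
    return examples


# Hypothetical rows for illustration only, not real MURAD entries.
sample_rows = [
    {"word": "كتاب", "definition": "مجموعة من الصفحات المكتوبة أو المطبوعة."},
    {"word": "قلم", "definition": ""},  # incomplete pair, will be skipped
]

examples = to_definition_examples(sample_rows)
for example in examples:
    print(example["prompt"], "->", example["target"])
```

Dropping incomplete pairs is one possible choice; for dictionary expansion, words without definitions could instead be kept as candidates to fill in.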
Strengths
- Contains 95,000 word-definition pairs, providing substantial lexical coverage.
- Spans multiple domains, including scientific, religious, and linguistic, suggesting broad vocabulary.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- The row count is not confirmed in the available metadata; the 95,000-pair figure should be checked against the downloaded data before assessing suitability.
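Because field semantics have to be inferred after download, a first pass over the records might look like the following sketch. It counts rows and reports, per field, how often the field is filled and how long its text is on average. It assumes the records have already been loaded as a list of dictionaries; the field names and sample rows are hypothetical and only stand in for the real data.

```python
from collections import defaultdict


def summarize_fields(rows):
    """Return (row_count, per-field summary) for an iterable of dict rows."""
    stats = defaultdict(lambda: {"non_empty": 0, "total_chars": 0})
    row_count = 0
    for row in rows:
        row_count += 1
        for name, value in row.items():
            field = stats[name]  # register the field even if it is empty
            text = "" if value is None else str(value).strip()
            if text:
                field["non_empty"] += 1
                field["total_chars"] += len(text)

    summary = {}
    for name, field in stats.items():
        filled = field["non_empty"]
        summary[name] = {
            "non_empty": filled,
            "avg_chars": field["total_chars"] / filled if filled else 0.0,
        }
    return row_count, summary


# Hypothetical rows for illustration only, not real MURAD entries.
sample_rows = [
    {"word": "علم", "definition": "إدراك الشيء على حقيقته.", "domain": None},
    {"word": "نحو", "definition": "علم يبحث في أحوال أواخر الكلمات.", "domain": "linguistic"},
]

row_count, summary = summarize_fields(sample_rows)
print("rows:", row_count)
for name, info in summary.items():
    print(name, info)
```

A field with short, almost always filled values is a plausible headword column, and a field with longer text is a plausible definition column, but this is only a heuristic and should be confirmed by reading a sample of rows.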
Provenance
- Source: riotu-lab
- Collection Method: Likely compiled from existing lexical resources, but the specific method is not detailed.
- Time Range: Not specified
- Freshness: Last updated 2026-06-07 10:48:07; freshness should be verified.
- Geography: Not specified