Hadrami Arabic Dialect with Modern Standard Arabic Equivalents, 1,047 Entries
Description
A collection of 1,047 entries from the Hadrami dialect of Arabic, each paired with a Modern Standard Arabic (MSA) equivalent. The dataset includes examples and proverbs from the Hadramawt region. It is hosted on Kaggle; the author, organization, and creation details are unknown.
Use Cases
Train dialect-to-MSA machine translation models on the parallel text pairs.
Analyze linguistic features and variation in the Hadrami dialect using the provided examples.
Study cultural expressions and metaphors in the collection of proverbs.
Build dialect identification or classification tools from the dialectal text samples.
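As a minimal sketch of the first use case, the parallel pairs could be read from a CSV export, cleaned of incomplete and duplicate rows, and split into training and development sets. The column names `dialect` and `msa` are assumptions, since the dataset's fields are undocumented, and the sample rows are placeholder strings rather than real entries.

```python
import csv
import io
import random


def load_pairs(handle, src_col="dialect", tgt_col="msa"):
    """Read (dialect, MSA) pairs from a CSV handle.

    src_col and tgt_col are hypothetical column names; replace them
    with the real headers once the file has been inspected.
    """
    pairs = []
    seen = set()
    for row in csv.DictReader(handle):
        src = (row.get(src_col) or "").strip()
        tgt = (row.get(tgt_col) or "").strip()
        if not src or not tgt:
            continue  # skip rows missing either side of the pair
        if (src, tgt) in seen:
            continue  # skip exact duplicate pairs
        seen.add((src, tgt))
        pairs.append((src, tgt))
    return pairs


def split_pairs(pairs, dev_fraction=0.1, seed=0):
    """Shuffle with a fixed seed and return (train, dev) lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_fraction)
    return shuffled[n_dev:], shuffled[:n_dev]


# Placeholder rows standing in for the real file (not dataset content).
sample = "dialect,msa\nd1,m1\nd2,m2\n,m3\nd1,m1\nd4,m4\n"
pairs = load_pairs(io.StringIO(sample))
train, dev = split_pairs(pairs, dev_fraction=0.34)
print(len(pairs), len(train), len(dev))
```

With only about a thousand pairs, a fixed seed keeps the small development set stable across experiments.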
Strengths
States an explicit corpus size of 1,047 dialect entries.
Includes parallel Modern Standard Arabic equivalents, enabling comparative analysis.
Features proverbs, adding a cultural and idiomatic dimension to the linguistic data.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The row count of the main data file is not documented, so whether it matches the 1,047 listed entries cannot be confirmed before download; this may limit suitability assessment.
Last update date is unknown, so the freshness of the data cannot be verified.
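Because the fields and row count are undocumented, a first step after download might be a quick profile of the file. The sketch below assumes the data is a CSV with a header row. It reports the row count, the column names, and how many cells per column are non-empty or contain Arabic-script characters, which can help guess which columns hold the dialect and MSA text. The sample headers and the filler word are illustrative only.

```python
import csv
import io


def has_arabic(text):
    """True if any character falls in the basic Arabic Unicode block."""
    return any("\u0600" <= ch <= "\u06FF" for ch in text)


def profile_csv(handle):
    """Count rows and, per column, non-empty and Arabic-script cells."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    rows = 0
    non_empty = {c: 0 for c in columns}
    arabic = {c: 0 for c in columns}
    for row in reader:
        rows += 1
        for c in columns:
            value = (row.get(c) or "").strip()
            if value:
                non_empty[c] += 1
                if has_arabic(value):
                    arabic[c] += 1
    return {"rows": rows, "columns": columns,
            "non_empty": non_empty, "arabic": arabic}


# Toy input: hypothetical headers and a generic Arabic word as filler,
# not taken from the dataset.
word = "\u0633\u0644\u0627\u0645"
sample = "id,text,equivalent\n1,{w},{w}\n2,{w},\n".format(w=word)
report = profile_csv(io.StringIO(sample))
print(report["rows"], report["columns"])
print(report["non_empty"], report["arabic"])
```

For a real file, the same function can be called on `open(path, encoding="utf-8", newline="")`, where `path` is wherever the download was saved.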
Provenance
Source: Kaggle
Geography: Hadramawt region
License is unknown; users must verify the terms of use before using the dataset.