This dataset contains news articles about the Syrian conflict, each labeled as fake or credible. It was created by Fatima K. Abu Salem of the American University of Beirut. Credibility was determined by matching information that crowdsourced workers extracted from each article against the Syrian Violations Documentation Center (VDC) database.
Use Cases
- Train binary classifiers for fake news detection based on article text and labels.
- Analyze linguistic patterns in credible versus fake news about the Syrian conflict.
- Benchmark NLP models on a domain-specific fake news detection task.
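As a rough illustration of the first use case, the sketch below trains a tiny multinomial Naive Bayes baseline using only the Python standard library. It is a minimal sketch under stated assumptions, not the dataset author's method: the helper names (`tokenize`, `train_naive_bayes`, `predict`) are hypothetical, the toy strings are made-up placeholders rather than dataset rows, and the mapping of 1 to "credible" and 0 to "fake" is assumed, since the label encoding is not documented here.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(texts, labels):
    """Count documents per class and words per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, doc_count in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(doc_count / total_docs)
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Placeholder strings, NOT rows from the dataset.
# Assumed encoding: 1 = credible, 0 = fake (verify after download).
toy_texts = [
    "confirmed report documented verified",
    "documented verified sources",
    "rumor unverified claim shocking",
    "shocking rumor viral claim",
]
toy_labels = [1, 1, 0, 0]

model = train_naive_bayes(toy_texts, toy_labels)
print(predict(model, "verified documented report"))
print(predict(model, "shocking viral rumor"))
```

For real experiments, the toy lists would be replaced by the article text and label columns from the downloaded file, and the data split into training and held-out sets before any accuracy is reported.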
Strengths
- Articles are labeled with a binary credibility score (0 or 1).
- Ground truth for labeling was derived from the Syrian Violations Documentation Center (VDC).
- Information extraction for labeling was performed via a structured crowdsourcing process.
Limitations
- Row count and dataset size are unknown, so it is hard to judge before download whether the data is large enough for a given task.
- Column-level documentation is absent; field semantics must be inferred after download.
- The last update date is unknown, so the freshness of the data is unverified.
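Because the row count and column semantics are undocumented, a first inspection pass after download can help. The sketch below assumes the data is distributed as a CSV file with a header row, which is not confirmed here; `summarize_csv` is a hypothetical helper, and the in-memory sample is placeholder data used only to show the output shape. It reports the column names, the number of rows, and any columns whose values are all `0` or `1`, which are candidates for the credibility label.

```python
import csv
import io
from collections import Counter


def summarize_csv(handle):
    """Report columns, row count, and columns containing only '0'/'1' values."""
    reader = csv.DictReader(handle)
    columns = list(reader.fieldnames or [])
    row_count = 0
    # Every column starts as a candidate and is dropped at its first
    # value that is not '0' or '1'.
    candidates = {name: Counter() for name in columns}
    for row in reader:
        row_count += 1
        for name in list(candidates):
            value = (row.get(name) or "").strip()
            if value in ("0", "1"):
                candidates[name][value] += 1
            else:
                del candidates[name]
    return {
        "columns": columns,
        "row_count": row_count,
        "binary_columns": {name: dict(c) for name, c in candidates.items() if c},
    }


# Placeholder data, NOT taken from the dataset.
sample = io.StringIO("text,flag\nplaceholder one,1\nplaceholder two,0\n")
summary = summarize_csv(sample)
print(summary)

# With a real download (the path is hypothetical):
# with open("path/to/downloaded_file.csv", newline="", encoding="utf-8") as f:
#     print(summarize_csv(f))
```

A column flagged as binary is only a candidate; which value denotes a credible article still has to be confirmed against the dataset's own documentation or a manual check of a few rows.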
Provenance
- Source: American University of Beirut
- Collection Method: Articles were labeled via crowdsourcing on Figure Eight, then matched against the Syrian Violations Documentation Center database.
- Geography: Syria