The Urdu fake and true news dataset is a collection of Urdu-language news articles labeled for authenticity. Sourced from Kaggle, it likely contains text entries categorized as either fake or true news. Its size, authorship, and last update date are unknown.
Use Cases
- Train a binary classifier to detect fake news in Urdu (inferred from domain, verify after download)
- Benchmark NLP models for text classification in a non-English language (inferred from domain, verify after download)
- Analyze linguistic patterns distinguishing fabricated and factual news reports (inferred from domain, verify after download)
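The first use case can be sketched with a small multinomial Naive Bayes baseline written against the Python standard library. This is only an illustration under stated assumptions: the dataset's real columns and label values are unknown, so the function names (`tokenize`, `train_naive_bayes`, `predict`) and the idea of passing articles and labels as two parallel lists are hypothetical, and whitespace tokenization is a crude stand-in for a proper Urdu tokenizer.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Whitespace split is a crude baseline; Urdu word boundaries are not
    # always marked by spaces, so a dedicated tokenizer may work better.
    return text.split()


def train_naive_bayes(texts, labels):
    """Fit a multinomial Naive Bayes model with add-one smoothing.

    `texts` and `labels` are parallel lists; how they map onto the
    dataset's actual columns is an assumption to check after download.
    """
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log posterior for one article."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, doc_count in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(doc_count / total_docs)  # log prior
        for token in tokenize(text):
            if token in model["vocab"]:  # skip words never seen in training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Once the article text and labels have been loaded into two lists, `model = train_naive_bayes(texts, labels)` followed by `predict(model, new_article)` gives a baseline to compare stronger models against. Holding out part of the data for evaluation is left to the reader, since the split depends on the dataset's actual size.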
Strengths
- Published on Kaggle, a platform with an established community for data sharing.
- Focuses on the Urdu language, which is a less common target in misinformation datasets.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Row count, column definitions, and license information are unknown.
- Kaggle datasets are uploaded by individual users, so the data may reflect the uploader's choice of sources and labeling criteria.
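Because row count, columns, and label values all need checking after download, a short inspection pass is a reasonable first step. The sketch below assumes the download contains a UTF-8 CSV file, which is not confirmed by the available metadata; the function name `summarize_csv` and its `label_column` parameter are hypothetical, and the real label column name must be read from the reported header.

```python
import csv
from collections import Counter


def summarize_csv(path, label_column=None):
    """Report row count, column names, and (optionally) label frequencies.

    Assumes a UTF-8 CSV with a header row; adjust if the downloaded
    file turns out to use another format or encoding.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = reader.fieldnames or []
        row_count = 0
        label_counts = Counter()
        for row in reader:
            row_count += 1
            if label_column is not None:
                # row.get returns None if the column name does not exist
                label_counts[row.get(label_column)] += 1
    return {"columns": list(columns), "rows": row_count, "labels": dict(label_counts)}
```

Calling `summarize_csv(path)` first shows the header, after which a second call with `label_column` set to the actual label column reveals the class balance. A `None` key in the label counts indicates that the chosen column name does not match the file.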