A Bengali-language text dataset hosted on Kaggle, likely annotated for hate speech detection. The available metadata does not state the dataset's author, size, or annotation schema.
Use Cases
- Training a classifier to detect hate speech in Bengali text; whether the task is binary or multi-class depends on the label schema (inferred from domain, verify after download)
- Benchmarking multilingual hate speech detection models (inferred from domain, verify after download)
- Analyzing linguistic features of abusive language in Bengali (inferred from domain, verify after download)
Strengths
- Published on Kaggle, a platform with an active community for data sharing and discussion.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so whether the dataset is large enough for training or benchmarking cannot be judged before download.
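
Verification After Download
The gaps listed above (row count, column names, label schema) can be checked with a short script once the files are available locally. The following is a minimal sketch, not a documented loader for this dataset. It assumes the data ships as a UTF-8 CSV file with a header row, which the metadata does not confirm. The function name `summarize_csv`, the `max_distinct` threshold, and the file path in the usage note are all hypothetical.

```python
import csv
from collections import Counter


def summarize_csv(path, max_distinct=20):
    """Report row count, column names, and low-cardinality columns of a CSV.

    Assumption: the file is UTF-8 (with or without a BOM) and has a header row.
    """
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.DictReader(f)
        columns = list(reader.fieldnames or [])
        # One value counter per column; a column is dropped from tracking
        # as soon as it exceeds max_distinct distinct values.
        counts = {name: Counter() for name in columns}
        n_rows = 0
        for row in reader:
            n_rows += 1
            for name in columns:
                counter = counts.get(name)
                if counter is None:
                    continue  # already ruled out as a label candidate
                counter[row.get(name)] += 1
                if len(counter) > max_distinct:
                    del counts[name]

    # Columns with few distinct values are only *candidates* for a label
    # field; their meaning still has to be confirmed by reading samples.
    label_candidates = {name: dict(c) for name, c in counts.items() if c}
    return {
        "rows": n_rows,
        "columns": columns,
        "label_candidates": label_candidates,
    }
```

A call such as `summarize_csv("path/to/downloaded_file.csv")` (placeholder path) returns a dictionary with the row count, the header names, and the value distribution of each low-cardinality column. A column with two distinct values would suggest a binary label, but that remains an inference until checked against sample rows.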