A Bengali-language hate speech detection dataset sourced from social media platforms and hosted on Kaggle. It likely contains text samples with classification labels, but the provided metadata does not specify its size, annotation methodology, or creation date.
Use Cases
- Train a hate speech detection model for Bengali text (inferred from domain, verify after download)
- Benchmark natural language understanding models on a low-resource language task (inferred from domain, verify after download)
- Analyze linguistic patterns and markers of abusive content in Bengali (inferred from domain, verify after download)
Strengths
- Published on Kaggle, a platform with established data sharing practices.
- Platform tags indicate a focus on hate speech and Bengali language, suggesting domain relevance.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Row count, column definitions, and license information are unknown.
- Data may reflect the geographic and temporal biases inherent to its unspecified social media source.
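
Because the row count, column definitions, and label scheme are unknown, a first step after download is to inspect the file directly. The following is a minimal sketch that assumes the dataset ships as a UTF-8 CSV file with a header row; the function name `summarize_csv` and the `label_column` parameter are hypothetical and not part of the dataset or of Kaggle's tooling.

```python
import csv
from collections import Counter


def summarize_csv(path, label_column=None, encoding="utf-8"):
    """Return the row count, column names, and optional label counts of a CSV file.

    Assumes the file has a header row. `label_column` is a hypothetical
    column name; check the real header before relying on it.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        row_count = 0
        label_counts = Counter()
        for row in reader:
            row_count += 1
            # Tally labels only when the requested column actually exists.
            if label_column is not None and label_column in row:
                label_counts[row[label_column]] += 1
    return {
        "rows": row_count,
        "columns": columns,
        "label_counts": dict(label_counts),
    }
```

Calling, for example, `summarize_csv("downloaded_file.csv", label_column="label")` (both names are placeholders) returns a dictionary with the row count, the header columns, and the number of rows per label value, which is enough to confirm the file's structure and check for class imbalance before training.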
Provenance
- Source: Kaggle
- Collection Method: Likely collected from social media platforms; the specific gathering method is unknown.
- Time Range: Not specified
- Freshness: Last updated date is unknown.
- Geography: Not specified