A bot-detection dataset for the Twitter platform, published in 2017 and hosted on Kaggle. The provided metadata does not describe its size, features, or collection methodology, so the dataset's content and scale must be verified after download.
Use Cases
- Training a classifier to distinguish between bot and human accounts (inferred from domain, verify after download)
- Benchmarking bot detection algorithms against a known corpus (inferred from domain, verify after download)
- Analyzing behavioral patterns of automated accounts on social media (inferred from domain, verify after download)
Strengths
- Published on Kaggle, a major platform for data science resources.
- Focuses on the well-documented problem of social media bot detection.
Limitations
- Metadata is minimal; actual content requires verification after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so the dataset's suitability for a given task cannot be assessed before download.
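The post-download checks named in the limitations above (row count, column names, how fully each field is populated) could be sketched as follows. This is a minimal illustration using only the Python standard library. It assumes the download contains a comma-separated file with a header row, which the metadata does not confirm. The function name `summarize_csv` and its parameters are hypothetical, and nothing here reflects the dataset's actual schema.

```python
import csv
from collections import Counter


def summarize_csv(path, encoding="utf-8"):
    """Return row count, column names, and per-column non-empty counts.

    Assumes a header row and comma delimiters; both are unverified
    assumptions about the downloaded file.
    """
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        # Column names come from the file's first row.
        columns = list(reader.fieldnames or [])
        non_empty = Counter()
        row_count = 0
        for row in reader:
            row_count += 1
            for name in columns:
                value = row.get(name)
                # Count a cell only if it holds non-whitespace text.
                if value is not None and value.strip():
                    non_empty[name] += 1
    return {
        "rows": row_count,
        "columns": columns,
        "non_empty": {name: non_empty[name] for name in columns},
    }
```

Calling `summarize_csv` on the path of the downloaded file returns a dictionary that answers the open questions directly: `rows` gives the row count, `columns` lists the fields whose semantics still have to be inferred, and `non_empty` shows which fields are sparsely populated. If the file uses a different delimiter or encoding, the reader arguments would need adjusting.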