Updated Hate-Speech Dataset for Text Classification
Available on 1 platform
Description
Updated Hate-Speech Dataset is a text corpus that, judging by its title, likely contains social media posts or comments annotated for offensive language. The dataset is hosted on Kaggle, but the metadata does not state its size, origin, or update history. Columns and sample data are also unavailable, so content and structure must be verified after download.
Use Cases
Training a binary classifier to detect hateful content (inferred from domain, verify after download)
Fine-tuning a transformer model for toxic language identification (inferred from domain, verify after download)
Analyzing linguistic patterns and triggers in abusive online speech (inferred from domain, verify after download)
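As a rough illustration of the first use case, the sketch below trains a tiny bag-of-words Naive Bayes classifier using only the Python standard library. Everything in it is an assumption made for illustration: the toy sentences, the 0/1 label encoding, and the function names `tokenize`, `train_naive_bayes`, and `predict` are hypothetical and do not come from the dataset, whose columns and labels are unknown until it is downloaded.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(texts, labels):
    """Count documents per label and words per label (multinomial Naive Bayes)."""
    doc_counts = Counter(labels)
    word_counts = {label: Counter() for label in doc_counts}
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return {"doc_counts": doc_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy examples; 1 = abusive, 0 = not abusive (assumed encoding).
toy_texts = [
    "you are awful and stupid",
    "nobody likes you, idiot",
    "have a great day",
    "thanks for the helpful answer",
]
toy_labels = [1, 1, 0, 0]
toy_model = train_naive_bayes(toy_texts, toy_labels)
```

With real data, the text and label columns would replace the toy lists once their names and encoding have been confirmed. A model this simple is only a sanity-check baseline; the transformer fine-tuning mentioned above would need a dedicated library.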
Strengths
Published on Kaggle, a platform with established data sharing and versioning infrastructure.
Limitations
Metadata is minimal; actual content requires verification after download.
Row count, column definitions, and sample data are unknown, which makes it hard to assess suitability before download.
Because the source is unspecified, the data may carry temporal or platform-specific bias that cannot be assessed from the metadata.
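Since the structure has to be verified after download, a first inspection pass might look like the sketch below. It assumes the download is a UTF-8 CSV file with a header row, which the metadata does not confirm; `summarize_csv` and the file name in the usage comment are hypothetical names chosen for illustration.

```python
import csv


def summarize_csv(path, max_distinct=10):
    """Report column names, row count, and low-cardinality columns.

    Columns with only a few distinct values are flagged as possible
    label columns; this is a heuristic, not a guarantee.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        columns = list(reader.fieldnames or [])
        seen = {name: set() for name in columns}
        row_count = 0
        for row in reader:
            row_count += 1
            for name in columns:
                # Stop collecting once a column clearly exceeds the cap.
                if len(seen[name]) <= max_distinct:
                    seen[name].add(row[name])
    candidates = [n for n in columns if 1 < len(seen[n]) <= max_distinct]
    return {"columns": columns, "rows": row_count, "label_candidates": candidates}


# Example call (hypothetical file name; adjust to the downloaded file):
# summary = summarize_csv("hate_speech.csv")
# print(summary)
```

The returned column names and row count fill in the gaps listed above, and the label candidates give a starting point for checking how the annotations are encoded.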
Provenance
Source
Kaggle
Collection Method
Collection method is unknown.
Time Range
Temporal coverage is unknown.
Freshness
Last updated date is unknown; freshness unverified.
Geography
Spatial coverage is unknown.
License
License is unknown; users must verify terms before use.