Description
Hate speech detection data in two languages, English and Spanish. The dataset is hosted on Kaggle, but the available metadata does not describe its collection method, size, or annotation scheme. Researchers must download the dataset to inspect its volume, annotation schema, and source characteristics.
Use Cases
Training a binary classifier to detect hate speech in English text (inferred from domain, verify after download)
Fine-tuning a multilingual model for hate speech identification in Spanish (inferred from domain, verify after download)
Comparative analysis of hate speech patterns and lexicon across English and Spanish (inferred from domain, verify after download)
Strengths
Published on Kaggle, a platform with established data sharing and versioning tools.
Limitations
Metadata is minimal; actual content requires verification after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, so suitability for a given task cannot be assessed before download.
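Because row count and field semantics can only be established after download, a first inspection pass might look like the sketch below. It assumes the downloaded file is a CSV with a header row, which the metadata does not confirm. The function name `summarize_csv` and the `max_distinct` threshold are illustrative choices, not part of the dataset.

```python
import csv
from collections import Counter


def summarize_csv(path, max_distinct=20):
    """Report row count, column names, and value counts for low-cardinality columns.

    Assumption: the file is UTF-8 CSV with a header row (not confirmed by the metadata).
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = list(reader.fieldnames or [])
        counters = {c: Counter() for c in columns}
        high = set()  # columns that exceeded max_distinct (e.g. free text)
        n_rows = 0
        for row in reader:
            n_rows += 1
            for c in columns:
                if c in high:
                    continue
                counters[c][row[c]] += 1
                if len(counters[c]) > max_distinct:
                    # Too many distinct values to be a label or language field.
                    high.add(c)
                    counters[c].clear()
    # Remaining columns are candidates for label or language fields.
    low_card = {c: dict(counters[c]) for c in columns if c not in high}
    return {"rows": n_rows, "columns": columns, "low_cardinality": low_card}


# Hypothetical usage; replace the path with the actual downloaded file:
# summary = summarize_csv("downloaded_file.csv")
# print(summary["rows"], summary["columns"])
# print(summary["low_cardinality"])
```

Columns that end up in `low_cardinality` are worth checking first, since a label or language field would normally take only a handful of distinct values. What those values mean still has to be verified by reading sample rows.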
Provenance
Source
Kaggle
Collection Method
Collection method is unknown from the provided metadata.
Time Range
Temporal coverage is unknown from the provided metadata.
Freshness
Last update date is unknown; freshness unverified.
Geography
Spatial coverage is unknown from the provided metadata.
License
License is unknown; users must verify the terms of use before applying the dataset.