MulTaBench likely contains text data for toxicity classification tasks. The dataset is published on Kaggle, but its size, origin, and creation date are unknown. Its columns suggest text samples paired with classification labels, though this should be verified after download.
Use Cases
- Benchmark toxicity detection models (inferred from domain, verify after download)
- Train text classifiers for content moderation (inferred from domain, verify after download)
- Evaluate model performance across different toxicity categories (inferred from domain, verify after download)
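As a rough illustration of the per-category evaluation use case, the sketch below computes precision, recall, and F1 for each label from two parallel lists of true and predicted labels. It assumes a single-label setup (one category per text sample), which has not been confirmed for MulTaBench; the function name `per_category_scores` and the label strings `"toxic"` and `"clean"` are hypothetical placeholders, not taken from the dataset.

```python
from collections import defaultdict


def per_category_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions.

    Assumption: each sample has exactly one true label and one predicted
    label. A multi-label dataset would need a different formulation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = defaultdict(int)  # true positives per label
    fp = defaultdict(int)  # false positives per label
    fn = defaultdict(int)  # false negatives per label

    for true_label, pred_label in zip(y_true, y_pred):
        if true_label == pred_label:
            tp[true_label] += 1
        else:
            fp[pred_label] += 1
            fn[true_label] += 1

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Hypothetical labels for illustration only.
example_true = ["toxic", "toxic", "clean", "clean"]
example_pred = ["toxic", "clean", "clean", "clean"]
example_scores = per_category_scores(example_true, example_pred)
```

Reporting scores per label rather than a single accuracy figure makes weak categories visible, which matters if the label distribution turns out to be imbalanced.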
Strengths
- Published on Kaggle, a platform for sharing machine learning datasets
Limitations
- Metadata is minimal; actual content requires verification after download
- Row count, column definitions, and sample data are unavailable
- License, author, organization, and last update date are unknown
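Because row count, columns, and sample data are not documented, a quick inspection after download can fill those gaps. The following is a minimal sketch that assumes the dataset ships as a UTF-8 CSV file with a header row, which is unverified; the function name `summarize_csv` and the path `multabench.csv` are hypothetical. It uses only the Python standard library.

```python
import csv


def summarize_csv(path, sample_size=3):
    """Return column names, row count, and the first few rows of a CSV file.

    Assumption: the file is UTF-8 encoded and its first line is a header.
    """
    sample = []
    row_count = 0
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = list(reader.fieldnames or [])
        for row in reader:
            if row_count < sample_size:
                sample.append(dict(row))
            row_count += 1
    return {"columns": columns, "row_count": row_count, "sample": sample}


# Usage (hypothetical file name; replace with the actual downloaded file):
# summary = summarize_csv("multabench.csv")
# print(summary["columns"], summary["row_count"])
```

The license, author, and last update date cannot be recovered from the file itself and would need to be checked on the dataset's Kaggle page.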