Pile Toxicity Balanced2 is a text dataset designed for training and evaluating language models on toxic content. The dataset, created by researcher tomekkorbak, was uploaded to Hugging Face in April 2022. It is part of a series of datasets derived from The Pile, a large-scale text corpus used for AI development.
License terms for derived use are not explicitly stated; they depend on the licenses of the original sources within The Pile. Users must verify compliance for their intended application.