Synthetic text data for classification tasks, generated with the Distilabel framework. The dataset includes a pipeline configuration file (pipeline.yaml) for reproducing the generation. Row count, column count, and file size are not specified.
The dataset's primary content is the pipeline.yaml configuration file used to regenerate the synthetic data; the data files themselves and their structure are not described. The license tag reads 'apache 20' (presumably Apache 2.0), but the license is not explicitly confirmed.