Hugging Face released Chug in April 2024. The library provides sharded dataset loaders and decoders for multi-modal document, image, and text data. It targets efficient distributed training on data stored in WebDataset and PDF formats, for computer vision and document understanding tasks.
Using Chug requires familiarity with the WebDataset format and distributed training workflows. The library is released under the Apache-2.0 license.
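
For readers new to sharded loading, the sketch below shows the general idea of splitting a set of shard files across distributed workers. It does not use Chug's API and does not claim to reflect how Chug works internally. The shard naming scheme (`docs-000000.tar`) and the helper names `expand_shards` and `shards_for_rank` are hypothetical and chosen only for illustration.

```python
def expand_shards(prefix: str, count: int) -> list[str]:
    """Build hypothetical tar shard names such as docs-000000.tar."""
    return [f"{prefix}-{i:06d}.tar" for i in range(count)]


def shards_for_rank(shards: list[str], rank: int, world_size: int) -> list[str]:
    """Assign every world_size-th shard to a rank so no shard is read twice."""
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError("rank must satisfy 0 <= rank < world_size")
    return shards[rank::world_size]


if __name__ == "__main__":
    # Assumed setup: 8 shards split across 4 training processes.
    all_shards = expand_shards("docs", 8)
    for rank in range(4):
        print(rank, shards_for_rank(all_shards, rank, 4))
```

Round-robin assignment keeps the per-rank shard lists disjoint, so each training process can stream its own tar files sequentially without coordinating with the others.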