MINT-1T is an open-source multimodal interleaved dataset containing 1 trillion text tokens and 3.4 billion images, roughly a tenfold increase in scale over previously available open-source datasets of this kind. It was created by a team that includes researchers from the University of Washington and was released in 2024. Its sources include PDFs and arXiv papers, and it is intended to support research on multimodal pretraining.
The dataset is very large (1T tokens, 3.4B images), so ensure you have sufficient storage and bandwidth before downloading it. The license is listed as CC BY 4.0 in the platform tags.