Available on 1 platform
This dataset contains 595,000 image-text pairs, a subset of the CC-3M dataset filtered for balanced concept coverage. It was created by liuhaotian for the pretraining stage of visual instruction tuning, with the aim of building large multimodal models. The dataset was last updated on July 6, 2023.
The license is unknown, which may restrict usage.