Conceptual 12M contains 12 million image-text pairs intended for vision-and-language pre-training. It was created by Google Research using a relaxed version of the data collection pipeline from Conceptual Captions 3M.
License information is unknown; users should verify terms of use before downloading. The dataset is monolingual (English).