A collection of 21,930,344 synthetic English captions for 10,965,172 images from the conceptual_12m dataset, which works out to two captions per image on average. The captions were generated with the llama3-llava-next-8b model, then cleaned up and shortened with Meta-Llama-3-8B. The dataset was created by CaptionEmporium and last updated on Hugging Face in June 2024.
License is unknown; users must verify terms of use before downloading.