Over 1 million curated image-caption pairs were released by the Frontier Research Team at takara.ai in February 2025. The collection was produced by consolidating and standardizing multiple open-source datasets through a 96-hour computational validation process across three nodes.
The dataset is distributed in Parquet format, includes synthetic data, and is optimized for use with the Hugging Face datasets library.