Loading...
Loading...
Available on 1 platform
Sign in to view source links and access this dataset
Comprising 1,235,432 Midjourney v6 images paired with captions generated by three different Vision Language Models (VLMs), released by Photoroom in March 2026. It provides a large-scale collection of AI-generated art with multi-perspective textual descriptions from LLaVA, Gemini Flash 1.5, and Qwen3 VL 8B. The data is formatted in Parquet for efficient processing in machine learning workflows.
The dataset is distributed in Parquet format and licensed under MIT. Users should account for the ~17% missing values in the Gemini and Qwen3 caption columns during preprocessing.