CommonCatalog CC-BY provides approximately 100 million high-resolution images paired with synthetic captions, released by common-canvas in 2024. The collection originates from Yahoo Flickr data from 2014 and includes images at resolutions up to 4K.
The dataset is distributed in Parquet format and is compatible with Polars and Dask for large-scale processing.