This dataset contains 12.8 million image URLs and their corresponding CLIP embeddings, derived from the datacomp_small subset of the DataComp benchmark. It was processed with the Fondant framework into a ready-to-use format, so the embeddings can be used for multimodal machine learning tasks without downloading or storing the raw images.
Use Cases
- Build image retrieval systems by indexing the CLIP embeddings for vector search.
- Analyze dataset distribution and identify outliers using the embedding vectors.
- Train lightweight linear probes for image classification using the pre-extracted CLIP features.
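
As an illustration of the retrieval use case, the following is a minimal sketch of brute-force cosine-similarity search over a matrix of embeddings. It runs on synthetic stand-in data: the 512-dimensional size, the `example.com` URLs, and the helper names `normalize` and `top_k` are assumptions made for the example, not properties of this dataset. In practice the arrays would be loaded from the dataset's own embedding and URL columns.

```python
import numpy as np


def normalize(vectors):
    """Scale each row to unit L2 norm so dot products equal cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def top_k(query, index, k=5):
    """Return indices and cosine scores of the k rows of `index` closest to `query`.

    `index` is expected to hold unit-normalized rows (see `normalize`).
    """
    query = query / max(np.linalg.norm(query), 1e-12)
    scores = index @ query
    k = min(k, len(scores))
    # argpartition finds the k best rows without sorting the whole array.
    nearest = np.argpartition(-scores, k - 1)[:k]
    # Sort only those k candidates by descending similarity.
    nearest = nearest[np.argsort(-scores[nearest])]
    return nearest, scores[nearest]


# Synthetic stand-in data (assumption): 1,000 random 512-dimensional vectors
# and placeholder URLs, used here instead of the real dataset columns.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 512)).astype(np.float32)
urls = [f"https://example.com/image_{i}.jpg" for i in range(len(embeddings))]

index = normalize(embeddings)

# Query with a slightly perturbed copy of row 42; it should be retrieved first.
query = embeddings[42] + 0.01 * rng.normal(size=512)
ids, scores = top_k(query, index, k=5)

for i, s in zip(ids, scores):
    print(f"{s:.3f}  {urls[i]}")
```

Exhaustive search like this is adequate for a sample of the data. At the full 12.8 million rows, an approximate nearest-neighbor index would likely be a better fit. The same normalized matrix can also support the outlier analysis use case, for example by flagging rows whose similarity to their nearest neighbors is unusually low.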
Strengths
- Contains 12.8 million rows of image URLs and high-dimensional CLIP embeddings.
- Based on the datacomp_small subset of the DataComp benchmark for multimodal learning.
- Processed and formatted using the Fondant framework for streamlined data engineering and sharing.