Available on 1 platform
Corvus-OCR-Caption-Mix is a compact multimodal dataset of more than 229,000 image-caption pairs. It was created by zhangshengchoa and is derived from the larger BLIP3o/BLIP3o-Pretrain-Long-Caption dataset. The dataset was last updated on the platform in April 2026.
The license for this dataset is not stated; users must verify the licensing terms before use.