UniWorld V1 provides between 1,000 and 10,000 image-text pairs drawn from the BLIP3o-60k collection; LanguageBind released it in June 2025. It uses GenEval-style annotations to support training high-resolution semantic encoders for unified visual understanding and generation.
Users must download source images and annotation JSONs separately from the LanguageBind/UniWorld-V1 repository and construct a data.txt file following the specific format required by the authors.
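The data.txt step can be sketched roughly as follows. This is only an illustration under stated assumptions: the one-line-per-annotation-file layout, the `<image_root>,<annotation_json>` field order, the comma separator, and the helper names `render_data_txt` and `write_data_txt` are all hypothetical, and the example paths are placeholders. The format the authors actually require is documented in the LanguageBind/UniWorld-V1 repository and should be checked before use.

```python
from pathlib import Path
from typing import Iterable, Tuple

# ASSUMPTION: each data.txt line is "<image_root>,<annotation_json>".
# The real field order, separator, and any extra fields are defined by
# the UniWorld-V1 authors; adjust this constant and the f-string to match.
SEPARATOR = ","


def render_data_txt(pairs: Iterable[Tuple[str, str]]) -> str:
    """Return the data.txt contents for (image_root, annotation_json) pairs."""
    lines = []
    for image_root, annotation_json in pairs:
        # A separator inside a path would make the line ambiguous to parse.
        if SEPARATOR in image_root or SEPARATOR in annotation_json:
            raise ValueError(f"path contains {SEPARATOR!r}: {image_root}, {annotation_json}")
        lines.append(f"{image_root}{SEPARATOR}{annotation_json}")
    return "".join(line + "\n" for line in lines)


def write_data_txt(pairs: Iterable[Tuple[str, str]], out_path: str = "data.txt") -> Path:
    """Write the rendered lines to out_path and return the path."""
    out = Path(out_path)
    out.write_text(render_data_txt(pairs), encoding="utf-8")
    return out


if __name__ == "__main__":
    # Placeholder paths: point these at wherever the images and JSONs were downloaded.
    example_pairs = [
        ("downloads/images/", "downloads/annotations/part_0.json"),
    ]
    print(render_data_txt(example_pairs), end="")
```

Keeping the rendering separate from the file write makes it easy to inspect the lines before committing them to disk, which is useful while the expected format is still being confirmed against the repository.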