This dataset contains 100 million Chinese image-text pairs, a subset of the Noah-Wukong multimodal dataset. It was uploaded to Hugging Face by the author 'wanng' and last updated on December 11, 2022. The text metadata for these pairs occupies approximately 16 GB.
Roughly 80% of image downloads are reported to succeed, and the full set of images is described as 'very, very large'.
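To put these figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above. The function name `estimate` is hypothetical, and the calculation assumes decimal gigabytes (1 GB = 10^9 bytes) and that the 80% success rate applies uniformly across all pairs, neither of which the listing confirms.

```python
# Figures taken from the description above; the helper itself is illustrative.
TOTAL_PAIRS = 100_000_000      # image-text pairs in the subset
METADATA_BYTES = 16 * 10**9    # ~16 GB of text metadata (assumes decimal GB)
SUCCESS_RATE = 0.80            # approximate image download success rate


def estimate(total_pairs, metadata_bytes, success_rate):
    """Return (expected downloadable images, average metadata bytes per pair)."""
    expected_images = round(total_pairs * success_rate)
    bytes_per_pair = metadata_bytes / total_pairs
    return expected_images, bytes_per_pair


images, per_pair = estimate(TOTAL_PAIRS, METADATA_BYTES, SUCCESS_RATE)
print(f"expected images: {images:,}")
print(f"metadata per pair: {per_pair:.0f} bytes")
```

Under these assumptions, about 80 million images would be retrievable, and each pair carries on the order of 160 bytes of text metadata. The actual image storage requirement is not stated in the listing and cannot be derived from these numbers.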