DanQing100M is a large-scale Chinese vision-language dataset of 100 million image-text pairs, totaling 12 terabytes. It was created by Hengyu Shen, Tiancheng Gu, and other researchers from DeepGlint-AI, using web data from 2024 to 2025. The dataset is intended for vision-language pre-training.
License restrictions are unknown and must be verified before use.