M3IT is a bilingual instruction dataset for vision-language models, released by MMInstruction in 2023 and containing between 1 million and 10 million instruction records. It covers image classification and image-to-text tasks in English and Chinese.
Refer to arXiv paper 2306.04387 for detailed task descriptions and methodology. The license is listed as 'other' and may carry specific usage restrictions.