This dataset contains 75,285 samples, each pairing an image with multiple-choice question-answer items, and serves as training data for the CapRL-3B image captioning model. It was created by internlm and was last updated on April 16, 2026. It is designed for a two-stage training objective in which the quality of a caption is judged by how well the visual questions can be answered.
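The idea of scoring a caption by question answerability can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `MCQ` record layout is hypothetical and not the dataset's actual schema, the answerer is assumed to see only the caption text, and `keyword_answerer` is a toy stand-in for whatever model would answer the questions in practice.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class MCQ:
    """One multiple-choice item (hypothetical layout, not the dataset's schema)."""
    question: str
    options: Sequence[str]
    answer: int  # index of the correct option


# Hypothetical answerer: sees only the caption text and one question,
# and returns the index of the chosen option, or -1 to abstain.
Answerer = Callable[[str, MCQ], int]


def caption_reward(caption: str, items: Sequence[MCQ], answerer: Answerer) -> float:
    """Fraction of questions answered correctly using the caption."""
    if not items:
        return 0.0
    correct = sum(1 for item in items if answerer(caption, item) == item.answer)
    return correct / len(items)


def keyword_answerer(caption: str, item: MCQ) -> int:
    """Toy stand-in for a language model: pick the first option named in the caption."""
    text = caption.lower()
    for i, option in enumerate(item.options):
        if option.lower() in text:
            return i
    return -1  # abstain when no option appears in the caption


items = [
    MCQ("What animal is shown?", ["cat", "dog"], 1),
    MCQ("What color is the ball?", ["red", "blue"], 0),
]
print(caption_reward("A dog chases a red ball.", items, keyword_answerer))
print(caption_reward("A park on a sunny day.", items, keyword_answerer))
```

In this sketch a caption that mentions the relevant details scores higher than one that omits them, which is the sense in which answerability can act as a proxy for caption quality.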
The license is unknown, which may restrict commercial use or redistribution.