Available on 1 platform
This dataset contains 558,000 image-text pairs for vision-language instruction tuning, curated by the lmms-lab research group. It is hosted on Hugging Face and was last updated in May 2024. The data is designed for training and evaluating multimodal AI models that process both visual and textual information.
According to the platform tags, the data is stored in Parquet format, so it can be loaded efficiently with Parquet-aware libraries such as Polars, Dask, or Hugging Face Datasets.
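As a rough illustration of loading Parquet-backed data from Hugging Face, the sketch below streams a few rows with the Hugging Face Datasets library and reports the column names it finds. Several things here are assumptions and not taken from the dataset page: the repository ID is a hypothetical placeholder (the page does not state it), the split name "train" is a guess, and the helper names `stream_rows`, `peek`, and `describe` are invented for this example. The column layout is inspected at runtime because the page does not list the schema.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List

# Hypothetical placeholder: the dataset page does not give the repository ID.
# Replace it with the real "<organization>/<dataset-name>" before running.
REPO_ID = "lmms-lab/<dataset-name>"


def stream_rows(repo_id: str, split: str = "train") -> Iterable[Dict[str, Any]]:
    """Stream rows from the Hub instead of downloading every Parquet shard.

    The split name "train" is an assumption; check the dataset card.
    Requires the third-party package: pip install datasets
    """
    from datasets import load_dataset  # imported lazily so the helpers below work without it

    return load_dataset(repo_id, split=split, streaming=True)


def peek(rows: Iterable[Dict[str, Any]], n: int = 3) -> List[Dict[str, Any]]:
    """Materialize only the first n rows of a (possibly very large) row iterator."""
    return list(islice(rows, n))


def describe(sample: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each column name to the Python type name of its value in the first row."""
    if not sample:
        return {}
    return {name: type(value).__name__ for name, value in sample[0].items()}


# Example usage (needs network access and a real REPO_ID):
#   sample = peek(stream_rows(REPO_ID), n=3)
#   print(describe(sample))
```

Streaming avoids pulling all 558,000 pairs to disk just to inspect the schema. For bulk analytics over the full set, Polars or Dask reading the Parquet files directly may be a better fit.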