Loading...
Loading...
Available on 1 platform
Sign in to view source links and access this dataset
Presenting a reformatted version of theblackcat102/llava-instruct-mix, prepared for Vision Supervised Fine-Tuning (VSFT) with the TRL SFT Trainer. It is designed for instruction tuning of multimodal vision-language models. The dataset's author is HuggingFaceH4, and it was last updated in April 2024.
Users should be familiar with the TRL library and the specific VSFT script (vsft_llava.py) referenced in the description to utilize this dataset effectively. The license and exact file formats are unknown.