A multimodal dataset derived from LLaVA-Instruct-150K, containing synthetic annotations for tasks that involve text, images, and speech. It is licensed under CC-BY-4.0 and was uploaded by dreyn74. Its listed size category is 10,000 to 100,000 samples.
The full description and data structure are available only on the dataset's Hugging Face page; consult that page for complete details.