Sign in to view source links and access this dataset
Description
Open-LLaVA-NeXT 1M is a 1 million sample dataset for supervised fine-tuning, created to reproduce the LLaVA-NeXT model series. The author augmented the sharegpt4v_mix665k dataset and attempted to align with LLaVA-NeXT's training data, substituting inaccessible user interaction data with 200K samples from ALLaVA-Instruct-VFLAN-4V. This dataset was uploaded to Hugging Face by Lin-Chen on October 25, 2024.
Use Cases
Supervised fine-tuning of vision-language models based on the described 1M instruction-following examples.
Training or benchmarking multimodal assistants using the augmented data mix mentioned in the description.
Studying the impact of different data sources (like ShareGPT4V and ALLaVA-Instruct) on model performance.
Strengths
Contains approximately 1 million samples for instruction tuning.
Explicitly designed to align with the training data of the LLaVA-NeXT model series.
Augments the sharegpt4v_mix665k dataset with additional curated data sources.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
The creator notes they were unable to access tens of thousands of real user interaction data used by the original LLaVA-NeXT.
Provenance
Source
Hugging Face, uploaded by author Lin-Chen.
Collection Method
Augmentation of the sharegpt4v_mix665k dataset, with substitution of some data sources.
Time Range
null
Freshness
Last updated 2024-10-25 10:50:53; freshness should be verified.
Geography
null
License is unknown; users must verify terms of use before downloading.