DataSalon

Discover quality datasets for AI training — aggregated from 40+ platforms, curated by AI.

ProductSearch Datasets Browse Topics Rankings Community API / MCP

ResourcesDocumentation Blog Changelog Status

LegalPrivacy Policy Terms of Service Cookie Policy

Llava En Zh 300K: 300,000 Multimodal Instruction Examples | DataSalon

Home Multimodal & LLMLlava En Zh 300K: 300,000 Multimodal Instruction Examples

Multimodal & LLM

Llava En Zh 300K: 300,000 Multimodal Instruction Examples

Name: Llava En Zh 300K: 300,000 Multimodal Instruction Examples
Creator: BUAADreamer
Published: 2024-04-26T11:37:11
Keywords: Vision Language, Llava, Multimodal Instruction, Multimodal

by BUAADreamer·Updated 1y ago

Available on 1 platform

Description

300,000 examples of visual instruction data for training multimodal large language models. The dataset combines 150,000 English examples from the LLaVA project and 150,000 from the openbmb project. Author BUAADreamer uploaded this collection to Hugging Face on September 2, 2024.

Use Cases

Fine-tuning vision-language models based on the described multimodal instruction examples.
Training models for visual question answering based on the instruction data.
Benchmarking model performance on multimodal instruction-following tasks.
Conducting research on instruction tuning for multimodal AI systems.

Strengths

Contains 300,000 total examples, providing a substantial volume of training data.
Combines data from two established sources: LLaVA and openbmb.
Specifically formatted for use with the LLaMA Factory training toolkit.

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which may limit suitability assessment.
Data may reflect source bias inherent to the contributing projects LLaVA and openbmb.

Provenance

Source: Combined from the LLaVA and openbmb projects.
Collection Method: Likely curated and aggregated from existing visual instruction datasets.
Time Range: null
Freshness: Last updated 2024-09-02 14:20:59; freshness should be verified.
Geography: null

License is unknown; restrictions must be verified before use.

Multimodal Vision Language Llava Multimodal Instruction

Related Datasets

Quality Score

D33

Description

Source

Reputation

Quality Score

D33

Description

Source

Reputation

Access

Community

1.5K downloads

35 likes

0 views

Dataset Info

Author: BUAADreamer
Created: Apr 26, 2024
Updated: Sep 2, 2024
Last synced: Apr 27, 2026

Access

Community

1.5K downloads

35 likes

0 views

Dataset Info

Author: BUAADreamer
Created: Apr 26, 2024
Updated: Sep 2, 2024
Last synced: Apr 27, 2026

Llava En Zh 300K: 300,000 Multimodal Instruction Examples

Description

Use Cases

Strengths

Limitations

Provenance

Related Topics

Related Datasets

Quality Score

Community

Dataset Info

Community

Dataset Info