The Cambrian Vision-Centric Benchmark (CV-Bench) is a dataset introduced in the Cambrian-1 research paper for evaluating multimodal large language models (LLMs) on vision-centric tasks. It contains images and annotations packaged for loading with Hugging Face Datasets. It was created by nyu-visionx and last updated on July 20, 2025.
Use Cases
- Benchmarking multimodal LLM performance on vision-centric tasks.
- Evaluating model accuracy on combined image and text understanding.
- Testing the capabilities of vision-centric AI systems.
- Comparing different multimodal LLM architectures on a standardized test set.
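As a rough illustration of the accuracy use case above, the sketch below scores model answers against reference answers by exact match. The function names, the normalization rule, and the sample answers are hypothetical; this card does not document the dataset's answer format, so the comparison may need adapting after inspecting the data.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Trim whitespace and ignore case so 'A' and ' a ' compare equal."""
    return answer.strip().casefold()


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that match the reference answer exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot score an empty set of examples")
    correct = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return correct / len(references)


# Hypothetical model outputs and ground-truth answers for three questions.
model_outputs = ["A", "b ", "C"]
ground_truth = ["a", "B", "D"]
score = accuracy(model_outputs, ground_truth)
print(f"accuracy: {score:.3f}")
```

Running the same function over the outputs of two different models on the same examples gives directly comparable scores, which is the point of a standardized test set.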
Strengths
- Dataset is designed specifically for benchmarking vision-centric multimodal LLMs.
- Files are packaged for Hugging Face Datasets, which should make the dataset easy to integrate into existing evaluation pipelines.
- Last updated on July 20, 2025, indicating recent maintenance.
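A minimal sketch of loading the dataset with the Hugging Face `datasets` library follows. The repository id `nyu-visionx/CV-Bench` is an assumption inferred from the creator name and is not stated in this card; the helper name `load_cv_bench` is hypothetical, and the optional configuration name and split are left unset because neither is documented here.

```python
from typing import Optional

# Assumption: the repository id is not stated in this card and is inferred
# from the creator name (nyu-visionx) and the benchmark name (CV-Bench).
DEFAULT_REPO_ID = "nyu-visionx/CV-Bench"


def load_cv_bench(
    repo_id: str = DEFAULT_REPO_ID,
    name: Optional[str] = None,
    split: Optional[str] = None,
):
    """Download (or reuse the local cache of) the dataset from the Hub."""
    # Imported lazily so this module can be imported without the dependency.
    from datasets import load_dataset

    return load_dataset(repo_id, name=name, split=split)
```

Calling `load_cv_bench()` requires the `datasets` package and network access on first use. With `split` left as `None`, `load_dataset` returns a `DatasetDict` keyed by split name; passing a split name returns a single `Dataset`.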
Limitations
- Descriptive metadata is limited; data quality can only be judged by inspecting the files after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not reported, which makes it harder to judge in advance whether the dataset suits a given evaluation.
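Because row counts and column documentation are missing, a first step after download is usually to list what each split contains. The sketch below assumes the data has been loaded as a mapping from split name to a Hugging Face `Dataset` (for example a `DatasetDict`); the helper name `summarize_splits` is hypothetical.

```python
from typing import Any, Dict, Mapping


def summarize_splits(splits: Mapping[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Collect the row count and column names of every split.

    Each value in `splits` is expected to expose `num_rows` and
    `column_names`, as `datasets.Dataset` objects do.
    """
    summary = {}
    for split_name, split in splits.items():
        summary[split_name] = {
            "num_rows": split.num_rows,
            "columns": list(split.column_names),
        }
    return summary
```

Printing the result gives the row count and field names per split. The `features` attribute of a `Dataset` additionally reports the type of each column, which helps when inferring field semantics.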
Provenance
- Source: nyu-visionx
- Collection Method: Introduced in the Cambrian-1 research paper.
- Freshness: Last updated 2025-07-20 20:16:35.