LongVQUBench is a benchmark for evaluating long-term video quality understanding in large vision-language models. It features 1,200 videos and 1,500 QA pairs across three hierarchical evaluation levels. The dataset was created by Aarna004 and was last updated on 2026-06-23.
Use Cases
- Benchmarking model performance on local event quality understanding (the LQU level named in the description).
- Evaluating cross-event reasoning capabilities (the CQR level named in the description).
- Training or fine-tuning models for hierarchical video question-answering tasks that follow the three-level structure.
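Because the benchmark is organized by evaluation level, results are most useful when reported per level rather than as one pooled score. The sketch below shows one way to do that. It assumes each QA record is a dict with the keys `level`, `answer`, and `prediction`; these names are hypothetical, since the dataset's actual columns are undocumented, and the toy records are illustrative, not benchmark data.

```python
from collections import defaultdict


def accuracy_by_level(records):
    """Return {level: accuracy} for an iterable of dict records.

    Assumes three hypothetical keys per record: 'level', 'answer',
    and 'prediction'. Adjust them to the real column names.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        level = rec["level"]
        total[level] += 1
        # Exact match, ignoring case and surrounding whitespace.
        predicted = str(rec["prediction"]).strip().lower()
        expected = str(rec["answer"]).strip().lower()
        if predicted == expected:
            correct[level] += 1
    return {level: correct[level] / total[level] for level in total}


# Toy records for illustration only (not real benchmark data).
toy = [
    {"level": "LQU", "answer": "A", "prediction": "A"},
    {"level": "LQU", "answer": "B", "prediction": "C"},
    {"level": "CQR", "answer": "D", "prediction": "d "},
]
print(accuracy_by_level(toy))
```

Exact-match scoring is an assumption that suits multiple-choice answers; open-ended answers would need a different comparison.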
Strengths
- Contains 1,200 videos, providing a substantial testbed for evaluation.
- Includes 1,500 QA pairs, offering a structured set of questions and answers.
- Organized across three hierarchical evaluation levels, enabling multi-faceted assessment.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is unknown, so the 1,200 videos and 1,500 QA pairs cited in the description cannot be confirmed without downloading the data.
- Description metadata is limited; actual data quality requires manual inspection after download.
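Since field semantics have to be inferred after download, a quick first pass is to list every field with the value types and a few example values seen in the data. The helper below is a minimal sketch that works on any iterable of dict records, however they were loaded (for instance via the `datasets` library's `load_dataset`, whose repository id is not given here). The field names in `sample` are hypothetical placeholders, not the dataset's real columns.

```python
def summarize_fields(records, max_examples=2):
    """Map each field name to its observed value types and a few example values."""
    summary = {}
    for rec in records:
        for key, value in rec.items():
            entry = summary.setdefault(key, {"types": set(), "examples": []})
            entry["types"].add(type(value).__name__)
            # Keep a small number of distinct example values per field.
            if len(entry["examples"]) < max_examples and value not in entry["examples"]:
                entry["examples"].append(value)
    return summary


# Hypothetical records standing in for downloaded rows.
sample = [
    {"video_id": "v001", "question": "Which segment is blurriest?", "answer": "B"},
    {"video_id": "v002", "question": "Does quality drop after the cut?", "answer": None},
]
for name, info in summarize_fields(sample).items():
    print(name, sorted(info["types"]), info["examples"])
```

A field that shows more than one type (here `answer` mixes `str` and `NoneType`) is a sign of missing values worth checking by hand.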
Provenance
- Source: huggingface
- Freshness: last updated 2026-06-23 06:18:38; freshness should be verified.