Name: V-RAGBench: 2,100 Query-Evidence-Answer Triplets for Video Retrieval-Augmented Generation
Creator: DISLab
Published: 2026-06-15T04:35:30
Keywords: Egocentric Video, NLP Evaluation, Multimodal QA, Benchmark, RAG Benchmark, Video Retrieval, Multimodal

Description

V-RAGBench is a benchmark of 2,100 open-ended triplets, each pairing a query with an evidence chunk and an answer, for evaluating retrieval and generation in long-video retrieval-augmented generation (VideoRAG). It was created by DISLab and last updated on June 15, 2026. The triplets are built from hour-scale egocentric videos, and each query is designed to be answerable only from one specific, localized evidence chunk.

Use Cases

Benchmarking retrieval accuracy in long-video contexts, using the localized evidence chunks as retrieval targets.
Evaluating the faithfulness of generation models in VideoRAG systems against the query-answer pairs.
Training or fine-tuning models for open-ended question answering on egocentric video content.
Studying causal dependencies between retrieval and generation in multimodal RAG pipelines.
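
The retrieval use case above can be illustrated with a recall@k computation. This is a minimal sketch, not the benchmark's official metric: the chunk identifiers, the query IDs, and the function name `recall_at_k` are all hypothetical, since the dataset's actual fields are undocumented.

```python
from typing import Dict, List


def recall_at_k(ranked: Dict[str, List[str]], gold: Dict[str, str], k: int) -> float:
    """Fraction of queries whose gold evidence chunk appears in the top-k results.

    ranked: query id -> retrieved chunk ids, best first (hypothetical format).
    gold:   query id -> the single gold evidence chunk id for that query.
    """
    if not gold:
        raise ValueError("gold must not be empty")
    hits = sum(
        1 for qid, chunk in gold.items() if chunk in ranked.get(qid, [])[:k]
    )
    return hits / len(gold)


# Toy data with made-up identifiers; not taken from V-RAGBench.
gold = {"q1": "video_a#12", "q2": "video_a#40", "q3": "video_b#03"}
ranked = {
    "q1": ["video_a#12", "video_a#13"],  # gold chunk at rank 1
    "q2": ["video_a#05", "video_a#40"],  # gold chunk at rank 2
    "q3": ["video_b#07", "video_b#08"],  # gold chunk not retrieved
}

r1 = recall_at_k(ranked, gold, k=1)
r2 = recall_at_k(ranked, gold, k=2)
print(f"recall@1={r1:.3f} recall@2={r2:.3f}")
```

Because each query has exactly one gold chunk, recall@k here reduces to a hit rate; a benchmark with several valid chunks per query would need a different definition.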

Strengths

Contains 2,100 structured triplets for systematic evaluation.
Queries are designed so that a correct answer depends on retrieving the right evidence chunk, which lets the retrieval and generation components of a RAG system be evaluated separately.
Built from hour-scale egocentric videos, providing a long-context multimodal challenge.
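
One way to use the decoupled design described above is to attribute each failure to either retrieval or generation. The sketch below assumes per-query outcomes have already been computed by some external retriever check and answer scorer; the `Outcome` type and the category names are hypothetical and are not part of the dataset.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Outcome(NamedTuple):
    retrieved_gold: bool  # gold evidence chunk was among the retrieved chunks
    answer_correct: bool  # answer judged correct by an external scorer


def attribute_errors(outcomes: Iterable[Outcome]) -> Counter:
    """Count queries per category, separating retrieval from generation errors."""
    counts: Counter = Counter()
    for o in outcomes:
        if o.retrieved_gold and o.answer_correct:
            counts["success"] += 1
        elif o.retrieved_gold:
            # Evidence was available, so the generator is at fault.
            counts["generation_failure"] += 1
        elif o.answer_correct:
            # Correct without evidence: possible guess or leaked prior knowledge.
            counts["correct_without_evidence"] += 1
        else:
            counts["retrieval_failure"] += 1
    return counts


# Toy outcomes; not real benchmark results.
outcomes = [
    Outcome(True, True),
    Outcome(True, False),
    Outcome(False, False),
    Outcome(False, False),
    Outcome(False, True),
]
breakdown = attribute_errors(outcomes)
print(dict(breakdown))
```

A high `correct_without_evidence` count would suggest that some queries can be answered without the intended chunk, which is worth checking given that the queries are meant to be answerable only from their evidence.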

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The row count is known (2,100), but other scale details, such as file size and file formats, are not documented.
Description metadata is limited; actual data quality requires manual inspection after download.
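
Since field semantics must be inferred after download, a first inspection step could be to list each field with the value types it contains. The sketch below assumes the records have already been loaded as a list of dictionaries; the field names in the sample (`query`, `evidence_start`, `answer`) are invented for illustration and may not match the real columns.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Set


def summarize_fields(records: Iterable[Mapping[str, Any]]) -> Dict[str, Set[str]]:
    """Map each field name to the set of Python type names observed for it."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for rec in records:
        for key, value in rec.items():
            seen[key].add(type(value).__name__)
    return dict(seen)


# Hypothetical records; the real schema is undocumented.
sample = [
    {"query": "example query", "evidence_start": 120.0, "answer": "example answer"},
    {"query": "another query", "evidence_start": None, "answer": "another answer"},
]

summary = summarize_fields(sample)
for field, types in sorted(summary.items()):
    print(field, sorted(types))
```

A field that reports more than one type, such as a mix of `float` and `NoneType`, is a quick signal of missing values that need handling before evaluation.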

Provenance

Source: DISLab, via Hugging Face.
Collection Method: Built from hour-scale egocentric videos.
Freshness: Last updated 2026-06-15 04:56:16; check the source for later revisions.

The license is unknown; verify the terms of use before using the dataset.
