A collection of training and evaluation files derived from the MultiVENT 2.0 benchmark for text-to-video retrieval. The training_data.json file provides structured query-video pairs intended to support explicit reasoning over video content when assessing relevance.
Use Cases
- Train a video reranking model using the query-video pairs in training_data.json to improve retrieval precision
- Evaluate text-to-video retrieval systems on the MultiVENT 2.0 benchmark subsets using reasoning-based relevance assessment
- Develop explainable AI models that justify video relevance based on specific content features identified in the reasoning data
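
Before using training_data.json for any of the use cases above, it can help to inspect its structure. The following is a minimal sketch, not part of the dataset itself: the helper names `load_records` and `field_counts` are hypothetical, and the code assumes only that the file is valid JSON whose top level is either a list of records or a dict wrapping them. It makes no assumption about the field names inside each record.

```python
import json
from collections import Counter
from pathlib import Path


def load_records(path):
    """Load a JSON file and return its top-level records as a list."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the top level is a list of records or a dict wrapping them.
    if isinstance(data, list):
        return data
    if isinstance(data, dict):
        return list(data.values())
    raise ValueError(f"Unexpected top-level JSON type: {type(data).__name__}")


def field_counts(records):
    """Count how often each key appears across dict-shaped records."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

Calling `field_counts(load_records("training_data.json"))` returns a `Counter` mapping each key to the number of records that contain it, which shows which fields are available before a training or evaluation pipeline is written around them.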
Strengths
- Derived from MultiVENT 2.0, a benchmark for event-centric video retrieval
- Includes a training_data.json file containing structured examples for training reranking models
- Supports reasoning-based reranking, in which models must justify the relevance of a video to a text query