TVR is a dataset of video-subtitle pairs and natural language queries for temporal moment retrieval, introduced by Jie Lei et al. at ECCV 2020. It covers the TV show domain, and models must use both visual features and subtitle dialogue to locate the moment a query describes.
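To make the moment retrieval task concrete, the sketch below scores a predicted time span against an annotated one using temporal intersection-over-union, a common way to judge moment localization. The record layout and field names (`query`, `video`, `span`) are hypothetical and chosen for illustration only; they are not the dataset's actual schema.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two (start, end) spans given in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the spans are disjoint.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical annotation record; field names are assumptions for this sketch.
annotation = {
    "query": "A character picks up the phone and starts talking.",
    "video": "example_clip",
    "span": (15.0, 25.0),
}

# A hypothetical model prediction for the same query.
predicted_span = (10.0, 20.0)

score = temporal_iou(predicted_span, annotation["span"])
# A prediction is often counted as correct when the IoU clears a chosen threshold.
is_hit = score >= 0.5
```

Here the two spans overlap for 5 seconds out of a 15-second union, so the prediction would not count as a hit at a 0.5 threshold.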
The repository provides PyTorch code for the Cross-modal Moment Localization (XML) model. Running it involves processing high-dimensional video features.