TVQA+ provides spatio-temporal grounding labels for video question answering. Introduced in a paper at ACL 2020, it augments the TVQA dataset with bounding-box annotations and supports multi-modal reasoning by linking natural language questions to the video moments (temporal grounding) and frame regions (spatial grounding) that support the answer.
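To make the idea of spatio-temporal grounding concrete, here is a minimal Python sketch of what one grounded question might look like in memory, plus the overlap measures commonly used to compare a predicted span or region with an annotated one. The class and field names (`Box`, `GroundedQA`, `span`, `boxes`) and the sample values are hypothetical and chosen for illustration only; they are not the dataset's actual file schema or the API of the associated implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Box:
    # Hypothetical layout: pixel coordinates of one region in one frame.
    left: float
    top: float
    width: float
    height: float
    label: str = ""  # word from the question or answer that the region grounds


@dataclass
class GroundedQA:
    # Hypothetical record: one question with its temporal and spatial grounding.
    question: str
    span: Tuple[float, float]  # (start_sec, end_sec) of the relevant moment
    boxes: Dict[int, List[Box]] = field(default_factory=dict)  # frame index -> regions


def temporal_iou(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Intersection over union of two (start, end) intervals in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    x1 = max(a.left, b.left)
    y1 = max(a.top, b.top)
    x2 = min(a.left + a.width, b.left + b.width)
    y2 = min(a.top + a.height, b.top + b.height)
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = a.width * a.height + b.width * b.height - inter
    return inter / union if union > 0 else 0.0


# Illustrative values only, not taken from the dataset.
example = GroundedQA(
    question="What is the person holding when they sit down?",
    span=(12.0, 18.5),
    boxes={36: [Box(left=100, top=80, width=60, height=40, label="cup")]},
)
```

Keeping the temporal span and the per-frame regions in one record mirrors the description above: the span answers "when" and the boxes answer "where" for the same question.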
The associated implementation requires PyTorch. License: MIT.