The dataset contains 73,893 short videos from the TRECVID VTT (Video to Text) task, each 3 to 10 seconds long. Each video has between 2 and 5 human-written captions, created by dedicated annotators hired by NIST.
The license is listed as 'other-license-specified'; users must check the specific terms before use. The data consists of MP4 video files and plain-text caption files, so processing requires a video decoder for the clips and a text parser for the captions.
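As a rough illustration of working with the caption files, the sketch below groups captions by video and flags any video whose caption count falls outside the stated 2 to 5 range. The on-disk layout is not documented here, so the one-caption-per-line `video_id<TAB>caption` format, the sample IDs, and the helper names `group_captions` and `out_of_range` are all assumptions made for the example, not the dataset's actual schema.

```python
from collections import defaultdict


def group_captions(lines, sep="\t"):
    """Group caption lines by video ID.

    ASSUMPTION: each line looks like 'video_id<sep>caption'. The real
    caption files may use a different layout; adjust the parsing to match.
    """
    captions = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        video_id, _, text = line.partition(sep)
        if text:  # ignore lines that have no separator or no caption text
            captions[video_id].append(text.strip())
    return dict(captions)


def out_of_range(captions, lo=2, hi=5):
    """Return sorted video IDs whose caption count is outside [lo, hi]."""
    return sorted(v for v, c in captions.items() if not lo <= len(c) <= hi)


# Hypothetical sample lines, invented purely for illustration.
sample = [
    "vid_0001\ta man rides a bike",
    "vid_0001\tsomeone cycling on a road",
    "vid_0002\ta dog runs",
]
grouped = group_captions(sample)
flagged = out_of_range(grouped)
```

With real data, the `sample` list would be replaced by the lines of a caption file (for example, an opened text file object), and `flagged` would list any videos that do not match the documented caption count.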