A benchmark for evaluating 3D point motion in video, covering egocentric and third-person scenes across three source datasets. It contains 2,565 video clips drawn from source datasets including DAVIS and HOT3D, each paired with per-object tracked surface points and a human-verified natural-language caption. The dataset was created by AllenAI and last updated on 2026-06-17.
The license is unknown; verify the terms of use before using the dataset.