ASID-1M is a large-scale audiovisual instruction dataset designed to support universal video understanding through fine-grained, controllable supervision. It addresses the limitations of traditional monolithic captions by providing attribute-structured and quality-verified data. The dataset aims to improve coverage of both visual and auditory elements within video content for more precise model training.