Available on 1 platform
UMI-VQA-8M is a large-scale visual question answering (VQA) dataset built for UMI-style wrist-mounted fisheye observations. It contains 8 million question-answer samples that provide vision-language supervision for UMI observation scenarios. The dataset was created by TeleEmbodied and was last updated on the Hugging Face platform in June 2026.
The license is unknown; verify the terms of use before using the dataset.