Description
StreamGaze V2 is a streaming video benchmark for evaluating Multimodal Large Language Models (MLLMs) on gaze-based question-answering tasks. The dataset, created by Peanuttoad, includes fixation metadata from three source datasets: EGTEA, EgoExoLearn, and HoloAssist. It was last updated on Hugging Face on April 22, 2026.
Use Cases
Evaluating MLLM performance on past, present, and future context questions based on gaze sequences.
Benchmarking models on gaze-based sequence matching tasks as indicated by the QA file names.
Training or fine-tuning models to understand human visual attention in video streams.
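As a rough illustration of the first use case, the sketch below scores predictions separately for past, present, and future context questions. It assumes that each QA task name begins with its temporal context, as 'past_gaze_sequence_matching' does; the record keys `task`, `answer`, and `prediction`, the task name `future_hypothetical_task`, and the toy records are all hypothetical and not taken from the dataset.

```python
from collections import defaultdict

# Assumed temporal prefixes; only "past_gaze_sequence_matching" is named in the listing.
CONTEXTS = ("past", "present", "future")


def context_of(task_name):
    """Return the temporal context encoded as the task-name prefix, or 'other'."""
    prefix = task_name.split("_", 1)[0].lower()
    return prefix if prefix in CONTEXTS else "other"


def accuracy_by_context(records):
    """Compute case-insensitive exact-match accuracy per temporal context.

    Each record is a dict with the hypothetical keys 'task', 'answer', 'prediction'.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        ctx = context_of(rec["task"])
        total[ctx] += 1
        if str(rec["prediction"]).strip().lower() == str(rec["answer"]).strip().lower():
            correct[ctx] += 1
    return {ctx: correct[ctx] / total[ctx] for ctx in total}


# Toy records invented for illustration; they are not drawn from the dataset.
example = [
    {"task": "past_gaze_sequence_matching", "answer": "B", "prediction": "B"},
    {"task": "past_gaze_sequence_matching", "answer": "A", "prediction": "C"},
    {"task": "future_hypothetical_task", "answer": "D", "prediction": "d"},
]
scores = accuracy_by_context(example)
print(scores)
```

Reporting one score per context, rather than a single pooled accuracy, keeps a weakness on one kind of question (for example, future context) from being hidden by strength on another.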
Strengths
Designed as a benchmark for a specific and complex task: gaze-based QA across temporal contexts.
Integrates fixation metadata from three established source datasets (EGTEA, EgoExoLearn, HoloAssist).
Includes structured QA tasks like 'past_gaze_sequence_matching' as indicated by the file structure.
Limitations
Column-level documentation and sample data are unavailable, making field semantics unclear before download.
The dataset's total size and row count are unknown, which makes it hard to judge whether it is suitable for large-scale training.
Data may reflect biases inherent to the three source video datasets used for metadata.
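Because field semantics are undocumented, one practical first step after download is to list which fields appear and what types they hold. The sketch below does this for a list of records; it assumes the QA files parse as JSON arrays of objects, which the listing does not confirm, and the sample fields (`question`, `options`, `t`) are invented for illustration.

```python
import json
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted list of Python type names observed for it."""
    seen = defaultdict(set)
    for rec in records:
        for key, value in rec.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Invented sample standing in for the contents of a downloaded QA file.
sample_text = (
    '[{"question": "q1", "options": ["a", "b"], "t": 1.5},'
    ' {"question": "q2", "options": null, "t": 3}]'
)
summary = summarize_fields(json.loads(sample_text))
for field, types in summary.items():
    print(field, types)
```

Fields that show more than one type, or that are sometimes null, are the ones most worth checking by hand before building an evaluation pipeline on them.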
Provenance
Source
Hugging Face dataset by author Peanuttoad.
Collection Method
Likely compiled from fixation metadata of three existing video datasets: EGTEA, EgoExoLearn, and HoloAssist.
Freshness
Last updated 2026-04-22 17:12:24.
License information is unknown; users must verify the terms of use before using the dataset.