StreamGaze is a benchmark dataset for evaluating Multimodal Large Language Models (MLLMs) on gaze-based question-answering (QA) tasks. Per its description, the dataset appears to contain fixation metadata from three sources (EGTEA, EgoExoLearn, and HoloAssist) along with QA tasks spanning past, present, and future contexts. It was created by daeunni and last updated on Hugging Face in March 2026.
Use Cases
- Benchmarking MLLM performance on gaze-based video QA tasks.
- Evaluating model understanding across the temporal contexts named in the description (past, present, future).
- Training models to interpret human fixation patterns in video streams, using the fixation metadata the description mentions.
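
The temporal-context evaluation above could be scored per context rather than as a single aggregate. The following is a minimal sketch, not the benchmark's official scoring: the record keys `context`, `prediction`, and `answer` are hypothetical names chosen for illustration, and exact string matching is an assumed metric.

```python
from collections import defaultdict


def accuracy_by_context(records):
    """Return {context: accuracy} for an iterable of QA result dicts.

    Assumption: each record carries the hypothetical keys
    'context', 'prediction', and 'answer'. These are NOT confirmed
    StreamGaze field names.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        ctx = rec["context"]
        total[ctx] += 1
        # Case-insensitive exact match; an assumed metric, not the official one.
        if rec["prediction"].strip().lower() == rec["answer"].strip().lower():
            correct[ctx] += 1
    return {ctx: correct[ctx] / total[ctx] for ctx in total}


# Made-up rows for illustration only; not taken from StreamGaze.
sample_results = [
    {"context": "past", "prediction": "A", "answer": "A"},
    {"context": "past", "prediction": "B", "answer": "C"},
    {"context": "present", "prediction": "D", "answer": "D"},
    {"context": "future", "prediction": "a", "answer": "A"},
]
scores = accuracy_by_context(sample_results)
print(scores)
```

Reporting one score per context keeps a weakness in, say, future-oriented questions from being hidden by strong results on the other two.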
Strengths
- Benchmark is structured for evaluating MLLMs on a specific task: gaze-based QA.
- Dataset integrates metadata from three distinct sources: EGTEA, EgoExoLearn, and HoloAssist.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count, file formats, and dataset size are unknown, which makes it hard to assess suitability before downloading.
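
Because field semantics have to be inferred after download, a first pass might simply tabulate which fields occur and what types they hold. This is a rough sketch that assumes the files have already been parsed into a list of dicts (for example from JSON); the field names in the sample rows are invented and do not describe the real schema.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the value types seen and how many rows contain it."""
    types_seen = defaultdict(set)
    present_in = defaultdict(int)
    for rec in records:
        for key, value in rec.items():
            types_seen[key].add(type(value).__name__)
            present_in[key] += 1
    return {
        key: {"types": sorted(types_seen[key]), "present_in": present_in[key]}
        for key in types_seen
    }


# Invented rows for illustration; the actual StreamGaze fields are unknown.
sample_rows = [
    {"video_id": "v1", "fixation_x": 0.42, "question": "What was looked at?"},
    {"video_id": "v2", "fixation_x": None, "question": "What comes next?"},
    {"video_id": "v3", "question": "Where is the gaze now?"},
]
summary = summarize_fields(sample_rows)
for name, info in summary.items():
    print(name, info)
```

Fields that appear in only some rows, or that mix types such as `float` and `NoneType`, are the ones most worth checking by hand.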
Provenance
- Source: Hugging Face dataset uploaded by daeunni.
- Freshness: Last updated 2026-03-30 04:29:37; freshness should be verified.