A synthetic dataset of 150,000 video frames annotated by GPT-4o, intended for training frame sampling models. Coverage is dense: approximately 20% of all frames carry relevance scores, and confidence assessments are given on a 1 to 5 scale. The dataset was created by yaolily and last updated on September 4, 2025.
Use Cases
- Train video frame sampling models using the provided relevance scores.
- Benchmark frame selection algorithms against the densely annotated frames.
- Develop confidence-aware video processing pipelines using the 1 to 5 confidence scores.
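As an illustration of the first and third use cases, the following is a minimal sketch of confidence-aware frame selection. The field names `frame_index`, `relevance`, and `confidence` are hypothetical, since the dataset's column names are not documented, and the relevance score is assumed to be a float in [0, 1]. The weighting scheme is one possible choice, not the dataset author's method.

```python
from typing import Dict, List


def select_frames(annotations: List[Dict], k: int, min_confidence: int = 3) -> List[int]:
    """Return the indices of the k highest-scoring frames, in temporal order.

    Field names are hypothetical; adapt them to the real schema after download.
    """
    # Drop annotations whose confidence (assumed 1 to 5) is below the threshold.
    candidates = [a for a in annotations if a["confidence"] >= min_confidence]
    # Rank by relevance weighted by normalized confidence (confidence / 5).
    ranked = sorted(
        candidates,
        key=lambda a: a["relevance"] * a["confidence"] / 5.0,
        reverse=True,
    )
    # Keep the top k and restore temporal order.
    return sorted(a["frame_index"] for a in ranked[:k])


# Made-up example annotations, not taken from the dataset.
annotations = [
    {"frame_index": 0, "relevance": 0.9, "confidence": 5},
    {"frame_index": 5, "relevance": 0.8, "confidence": 2},
    {"frame_index": 10, "relevance": 0.4, "confidence": 4},
    {"frame_index": 15, "relevance": 0.7, "confidence": 3},
]
selected = select_frames(annotations, k=2)
print(selected)
```

Filtering before ranking keeps low-confidence labels from dominating the selection; a softer alternative would be to keep all frames and rely on the weighting alone by setting `min_confidence=1`.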
Strengths
- Contains 150,000 video frames.
- Annotates approximately 20% of all frames with relevance scores.
- Provides fine-grained confidence scores on a 1 to 5 scale.
Limitations
- The row count and file formats are not documented.
- Column-level documentation is absent; field semantics must be inferred after download.
- Data may reflect bias inherent to the synthetic generation and annotation process.
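Because column-level documentation is absent, a first step after download is usually to profile the fields. The sketch below assumes the records have already been loaded as a list of dictionaries (the loading step depends on the undocumented file format) and uses made-up sample rows with hypothetical field names. It reports the observed types, non-null counts, and numeric ranges for each field.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_fields(rows: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Collect observed types, non-null counts, and numeric min/max per field."""
    summary = defaultdict(lambda: {"types": set(), "count": 0, "min": None, "max": None})
    for row in rows:
        for name, value in row.items():
            if value is None:
                continue  # Null values are not counted.
            info = summary[name]
            info["types"].add(type(value).__name__)
            info["count"] += 1
            # Track the range of numeric values (booleans excluded).
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                info["min"] = value if info["min"] is None else min(info["min"], value)
                info["max"] = value if info["max"] is None else max(info["max"], value)
    return dict(summary)


# Made-up rows with hypothetical field names, for illustration only.
rows = [
    {"frame_index": 0, "relevance": 0.9, "confidence": 5},
    {"frame_index": 1, "relevance": None, "confidence": None},
    {"frame_index": 2, "relevance": 0.1, "confidence": 1},
]
report = summarize_fields(rows)
for name, info in report.items():
    print(name, sorted(info["types"]), info["count"], info["min"], info["max"])
```

An integer field whose observed range is 1 to 5 would be a plausible candidate for the confidence score, and a field populated in roughly 20% of rows would be a plausible candidate for the relevance score, though both inferences should be checked against the data.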
Provenance
- Source: Hugging Face
- Collection Method: Synthetic data generation and annotation by GPT-4o.
- Freshness: Last updated 2025-09-04 03:32:07.