Description
Approximately 164,000 YouTube thumbnails paired with their corresponding video titles. The dataset was constructed by collecting public YouTube channel feeds, extracting video metadata, filtering and deduplicating entries, and downloading thumbnail images at scale. It was authored by l3afai and last updated on March 26, 2026.
Use Cases
Train image generation models on paired thumbnail images and title text.
Analyze visual trends in YouTube content based on thumbnail imagery.
Research multimodal AI tasks using the paired image-text structure.
Strengths
Approximately 164,000 data points provide a substantial sample size.
Data collection involved filtering and deduplication steps, which likely improves quality.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The exact row count is not stated; only the approximate figure of 164,000 is given, which may limit suitability assessment.
The last recorded update is March 26, 2026; check whether the data is still current before use.
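Because the columns are undocumented, a reasonable first step after download is to tabulate which fields appear and what value types they hold. The following is a minimal sketch of that inspection over a few sample records. The field names `title` and `thumbnail` are hypothetical stand-ins, not the dataset's confirmed schema.

```python
from collections import defaultdict


def summarize_fields(records):
    """Map each field name to the sorted list of value type names seen."""
    seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical sample rows; the real column names must be checked after download.
sample = [
    {"title": "How to bake bread", "thumbnail": b"\xff\xd8"},
    {"title": None, "thumbnail": b"\xff\xd8"},
]
summary = summarize_fields(sample)
```

A field that reports more than one type (for example both `str` and `NoneType`) signals missing values that may need handling before training.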
Provenance
Source
Hugging Face
Collection Method
Constructed by collecting public YouTube channel feeds, extracting video metadata, filtering and deduplicating entries, and downloading thumbnail images at scale.
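The filtering and deduplication step described above could look roughly like the sketch below. The source does not state the actual criteria, so everything here is an assumption for illustration: the field names `video_id`, `title`, and `thumbnail_url` are hypothetical, entries with a blank title or missing thumbnail URL are dropped, and the first entry seen for each video ID is kept.

```python
def filter_and_dedupe(entries):
    """Drop entries missing a title or thumbnail URL, then keep only the
    first entry seen for each video ID (assumed rules, not the author's)."""
    seen_ids = set()
    kept = []
    for entry in entries:
        title = (entry.get("title") or "").strip()
        if not title or not entry.get("thumbnail_url"):
            continue  # filter: incomplete metadata
        video_id = entry.get("video_id")
        if video_id is None or video_id in seen_ids:
            continue  # dedupe: this video was already kept
        seen_ids.add(video_id)
        kept.append({**entry, "title": title})
    return kept


# Hypothetical feed entries; IDs and URLs are placeholders.
entries = [
    {"video_id": "a1", "title": " First video ", "thumbnail_url": "https://example.com/a1.jpg"},
    {"video_id": "a1", "title": "First video", "thumbnail_url": "https://example.com/a1.jpg"},
    {"video_id": "b2", "title": "", "thumbnail_url": "https://example.com/b2.jpg"},
    {"video_id": "c3", "title": "Third video", "thumbnail_url": None},
    {"video_id": "d4", "title": "Fourth video", "thumbnail_url": "https://example.com/d4.jpg"},
]
cleaned = filter_and_dedupe(entries)
```

Keying on the video ID rather than the title is a design choice in this sketch: different videos can legitimately share a title, while the same video can appear in more than one feed.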