ZhihaoNan's AtomBlock-WebUI dataset contains approximately 9,700 full-page web screenshots with bounding box annotations for 14 UI element categories. The dataset was generated via LLM-augmented HTML rendering and headless browser screenshot capture. It was last updated on April 18, 2026.
Use Cases
- Train UI element detection models using the bounding box annotations for the 14 categories.
- Benchmark automated web scraping tools against the annotated semantic block-level landmarks.
- Develop synthetic data pipelines for UI testing using the programmatically aligned element annotations.
- Research UI layout parsing using the annotated primitive components, such as buttons and inputs.
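For the detection and benchmarking use cases, predicted boxes are commonly scored against annotated boxes with intersection over union (IoU). The following is a minimal sketch, not part of the dataset: it assumes boxes are given as `(x_min, y_min, x_max, y_max)` in pixels, which the dataset does not document, so the actual annotation layout should be checked and converted first. The names `Box` and `iou` are illustrative.

```python
from typing import Tuple

# Assumed box layout: (x_min, y_min, x_max, y_max) in pixels.
# The dataset's real annotation format is undocumented; convert as needed.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Degenerate (zero-area) boxes yield IoU 0 rather than a division error.
    return inter / union if union > 0 else 0.0
```

A prediction is often counted as correct when its IoU with an annotated box of the same category exceeds a chosen threshold such as 0.5; the threshold is a convention, not something specified by this dataset.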
Strengths
- Contains ~9,700 annotated full-page web screenshots.
- Annotations cover 14 UI element categories, including both primitive components and semantic landmarks.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Exact row count is not documented (only the approximate figure of ~9,700 screenshots is available), which may limit suitability assessment.
- Description metadata is limited; actual data quality requires manual inspection after download.
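Because field semantics have to be inferred after download, a quick first step is to list which fields appear in the records and what value types they hold. The sketch below assumes the records have already been loaded as Python dictionaries by whatever loader fits the downloaded files; the helper name `summarize_fields` and the sample record are hypothetical and do not reflect the dataset's real schema.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def summarize_fields(records: Iterable[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each field name to the sorted list of value type names seen for it."""
    seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            seen[key].add(type(value).__name__)
    return {key: sorted(types) for key, types in seen.items()}


# Hypothetical records for illustration only; the real field names are unknown.
sample = [
    {"image": "page_0001.png", "boxes": [[0, 0, 10, 10]], "labels": ["button"]},
    {"image": "page_0002.png", "boxes": [], "labels": []},
]
print(summarize_fields(sample))
```

Fields that show more than one type, or that are missing from some records, are good candidates for the manual inspection mentioned above.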
Provenance
- Source: ZhihaoNan
- Collection Method: Generated via LLM-augmented HTML rendering and headless browser screenshot capture.
- Freshness: Last updated 2026-04-18 17:11:05; freshness should be verified.