GUIrilla-Gold: Manually Annotated GUI Interaction Benchmark | DataSalon

Name: GUIrilla-Gold: Manually Annotated GUI Interaction Benchmark
Creator: macpaw-research
Published: 2026-01-22T10:54:06
Keywords: Screenshots, Benchmark, Computer Vision, Gui Interaction, Multimodal Benchmark, Instruction Following, Multimodal

Available on 1 platform

Description

GUIrilla-Gold is a manually annotated test set derived from the GUIrilla-Task collection. It contains screenshots paired with natural language instructions and corresponding actions. The dataset was created by macpaw-research and was last updated on 2026-05-04.

Use Cases

Train models for GUI automation based on the 'task' and 'action' fields.
Benchmark visual instruction-following models using the 'image' and 'task' pairs.
Evaluate model robustness by comparing performance on 'raw_task' versus cleaned 'task' instructions.
Develop computer vision models for GUI element detection using the 'image_cropped' field.
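
The robustness comparison in the third use case can be sketched roughly as follows. This is an illustration only, not the dataset authors' evaluation protocol. The field names 'image', 'task', 'raw_task', and 'action' come from this listing, but their types are undocumented, so the sketch assumes each record is a dict and that 'action' can be scored by exact equality. `predict` is a hypothetical stand-in for a model call.

```python
from typing import Any, Callable, Iterable, Mapping


def accuracy(
    records: Iterable[Mapping[str, Any]],
    predict: Callable[[Any, str], Any],
    instruction_field: str,
) -> float:
    """Fraction of records where predict(image, instruction) equals record['action'].

    Exact-match scoring is an assumption; the real 'action' format may call
    for a looser comparison.
    """
    total = 0
    correct = 0
    for record in records:
        predicted = predict(record["image"], record[instruction_field])
        correct += int(predicted == record["action"])
        total += 1
    return correct / total if total else 0.0


def robustness_gap(
    records: Iterable[Mapping[str, Any]],
    predict: Callable[[Any, str], Any],
) -> float:
    """Accuracy on the cleaned 'task' minus accuracy on 'raw_task'."""
    materialized = list(records)  # allow two passes, even over a generator
    cleaned = accuracy(materialized, predict, "task")
    raw = accuracy(materialized, predict, "raw_task")
    return cleaned - raw
```

A positive gap would suggest the model depends on the cleaned wording; a gap near zero would suggest it handles the raw instructions about as well.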

Strengths

Annotations are manual, which likely yields more reliable labels than automatic annotation.
Contains both raw and cleaned versions of tasks ('raw_task' and 'task'), allowing for robustness analysis.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Row count is not listed, so the size of the benchmark cannot be judged before download.
Column-level documentation is absent; field semantics must be inferred after download.
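
Since field semantics have to be inferred after download, a first pass might simply tabulate which fields appear, what Python types they hold, and how often they are missing. The sketch below assumes the records have already been loaded as dicts by whatever means the hosting platform provides; `summarize_fields` is a hypothetical helper, not part of any dataset tooling.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping, Set


def summarize_fields(records: Iterable[Mapping[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Per field: the value types observed and the number of missing values.

    A value counts as missing when the key is absent or the value is None.
    """
    types_seen: Dict[str, Set[str]] = defaultdict(set)
    non_null: Dict[str, int] = defaultdict(int)
    total = 0
    for record in records:
        total += 1
        for name, value in record.items():
            types_seen[name]  # register the field even if the value is None
            if value is not None:
                types_seen[name].add(type(value).__name__)
                non_null[name] += 1
    return {
        name: {"types": sorted(kinds), "missing": total - non_null[name]}
        for name, kinds in types_seen.items()
    }
```

Running this over a sample of rows gives a quick view of the schema to check against the field names mentioned in the use cases.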

Provenance

Source: macpaw-research
Collection Method: Manually annotated from the GUIrilla-Task collection.
Freshness: Last updated 2026-05-04 09:00:03; confirm against the source before relying on this date.

Quality Score

D40

Community

19 downloads

2 likes

0 views

Dataset Info

Author: macpaw-research
Created: Jan 22, 2026
Updated: May 4, 2026
Last synced: May 19, 2026
