Description
synCUB is a synthetic, paired-image benchmark for evaluating concept-based interpretability. Each item is an (original, synthetic) image pair that differs in exactly one CUB attribute, such as a breast pattern changed from solid to spotted. Images are generated with FLUX.2 [dev] conditioned on CUB reference images. The dataset accompanies a paper on evaluating interpretability methods.
Use Cases
Evaluate concept-based interpretability methods using image pairs that differ by a single controlled attribute change.
Benchmark model sensitivity to specific visual attributes defined in the CUB bird attribute ontology.
Train or test models for fine-grained visual attribute manipulation using the synthetic image pairs.
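One way to use such pairs, sketched here under stated assumptions: score every concept on the original and on the synthetic image, then check whether the attribute that was edited shows the largest change. The function names, the attribute names, and the score dictionaries below are hypothetical illustrations; they are not part of the dataset or of the accompanying paper's protocol.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # concept name -> score from some interpretability method


def attribute_deltas(orig: Scores, synth: Scores) -> Scores:
    """Per-concept score change from the original to the synthetic image."""
    return {name: synth[name] - orig[name] for name in orig}


def edited_attribute_rank(orig: Scores, synth: Scores, edited: str) -> int:
    """1-based rank of the edited attribute when concepts are sorted by |delta|."""
    deltas = attribute_deltas(orig, synth)
    ranked = sorted(deltas, key=lambda name: abs(deltas[name]), reverse=True)
    return ranked.index(edited) + 1


def top1_hit_rate(pairs: List[Tuple[Scores, Scores, str]]) -> float:
    """Fraction of pairs in which the edited attribute has the largest |delta|."""
    hits = sum(1 for o, s, e in pairs if edited_attribute_rank(o, s, e) == 1)
    return hits / len(pairs)


# Made-up scores for two pairs; attribute names are illustrative only.
example_pairs = [
    ({"breast_spotted": 0.1, "wing_striped": 0.5},
     {"breast_spotted": 0.9, "wing_striped": 0.5}, "breast_spotted"),
    ({"breast_spotted": 0.2, "wing_striped": 0.30},
     {"breast_spotted": 0.7, "wing_striped": 0.35}, "wing_striped"),
]
print(top1_hit_rate(example_pairs))
```

Ranking by absolute change is only one possible criterion; a signed change or a threshold may suit a given method better.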
Strengths
Provides a controlled benchmark where image pairs differ in exactly one attribute, enabling precise evaluation.
The generation setup is stated: images are produced by a single model, FLUX.2 [dev], conditioned on reference images from the CUB dataset.
Limitations
Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count is unknown, which makes it hard to judge suitability before download.
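Since the columns and row count are undocumented, a first pass after download could simply count the rows and record which value types appear in each field. The sketch below assumes the records have already been loaded as an iterable of dictionaries; the field names in the sample rows are hypothetical placeholders, not the dataset's real schema.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, Iterable, Tuple


def summarize_records(
    records: Iterable[Dict[str, Any]]
) -> Tuple[int, Dict[str, Dict[str, int]]]:
    """Return the row count and, per field, how often each Python type occurs."""
    field_types: Dict[str, Counter] = defaultdict(Counter)
    row_count = 0
    for row in records:
        row_count += 1
        for key, value in row.items():
            field_types[key][type(value).__name__] += 1
    return row_count, {key: dict(counts) for key, counts in field_types.items()}


# Hypothetical rows; the real field names must be read from the downloaded data.
sample_rows = [
    {"original_image": b"...", "synthetic_image": b"...", "attribute": "breast_pattern"},
    {"original_image": b"...", "synthetic_image": None, "attribute": "wing_color"},
]
row_count, summary = summarize_records(sample_rows)
print(row_count, summary)
```

A field that shows more than one type, or a NoneType count, is a cue to inspect those rows by hand.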
Provenance
Source
Author jokl on Hugging Face.
Collection Method
Images generated with FLUX.2 [dev] conditioned on CUB reference images.
Freshness
Last updated 2026-06-18 20:49:50; freshness should be verified.
License is unknown; check the dataset page for usage restrictions.