A synthetic, paired-image benchmark for evaluating concept-based interpretability. Each item is an (original, synthetic) image pair in which exactly one object class is removed. Generation uses FLUX.2 [dev] conditioned on COCO reference images. The benchmark accompanies the paper 'Evaluating the Interpretability of Sparse Autoencoders with Concept Annotations'.
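As a rough illustration of the pair structure described above, the sketch below models one item and shows one way such pairs might be used: comparing a concept score on the original image against the score on its synthetic counterpart. Everything here is an assumption made for illustration. The field names (`original_path`, `synthetic_path`, `removed_class`), the `score` callable, and the averaging step are hypothetical and are not taken from the dataset's schema or from the paper's evaluation protocol.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ImagePair:
    """One benchmark item. Field names are hypothetical, not the real schema."""
    original_path: str   # image that still contains the object class
    synthetic_path: str  # counterpart with that object class removed
    removed_class: str   # label of the removed class, e.g. "dog"


def activation_drop(pair: ImagePair, score: Callable[[str], float]) -> float:
    """Score on the original minus score on the synthetic image.

    `score` stands in for any function mapping an image path to a scalar,
    such as the activation of a single learned feature (an assumption).
    """
    return score(pair.original_path) - score(pair.synthetic_path)


def mean_activation_drop(
    pairs: Sequence[ImagePair], score: Callable[[str], float]
) -> float:
    """Average the per-pair drop over a non-empty collection of pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    return sum(activation_drop(p, score) for p in pairs) / len(pairs)
```

Under these assumptions, a feature that tracks the removed class should show a large positive mean drop on the pairs where that class was removed, while an unrelated feature should stay near zero.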
The license is unknown; verify the terms of use before using the dataset.