Description
13,786 public-domain artworks form the painterly core of the OpenArt collection. The dataset includes 9,107 paintings and illustrations, 4,596 photographed objects, and 83 unclassified works, each paired with a structured VLM caption. It was created by jaddai and last updated on Hugging Face in May 2026.
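The category counts above can be cross-checked against the stated total and turned into shares, which is useful when planning a stratified split. The following is a minimal sketch using only the numbers quoted in the description; the dictionary keys are illustrative labels, not column or category names from the dataset.

```python
# Counts quoted in the description; the key names are hypothetical labels.
category_counts = {
    "paintings_and_illustrations": 9_107,
    "photographed_objects": 4_596,
    "unclassified": 83,
}
stated_total = 13_786

# Check that the per-category counts add up to the stated total.
computed_total = sum(category_counts.values())
counts_are_consistent = computed_total == stated_total

# Share of each category, as a fraction of the stated total.
shares = {name: count / stated_total for name, count in category_counts.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print("consistent:", counts_are_consistent)
```

The shares show that the unclassified group is under one percent of the collection, so a random split may leave it nearly empty in a small validation set.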
Use Cases
Training image-captioning models on the structured VLM captions paired with paintings and illustrations.
Fine-tuning generative art models on a collection focused on painterly techniques like drawing, etching, and oil painting.
Analyzing art medium and attribution metadata for computational art history research.
Building search or recommendation systems for public-domain art collections.
Strengths
Contains 13,786 individual works, providing a substantial collection for model training.
Includes structured metadata such as medium, attribution, and inscriptions for each work.
Focuses on 9,107 paintings and illustrations, offering a curated set of 2-D fine-art techniques.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not reported in the dataset metadata; the figure of 13,786 works comes from the description and should be confirmed after download.
Description metadata is limited; actual data quality requires manual inspection after download.
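Because column-level documentation is absent, a first pass after download is usually to profile each field: which value types appear, how often the field is filled, and what a typical value looks like. The sketch below does this with the standard library over a list of row dictionaries. The field names (medium, attribution, inscriptions, caption) and the sample values are assumptions taken from the metadata types mentioned above, not the dataset's confirmed schema, and summarize_fields is a hypothetical helper.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Profile each field: observed value types, filled count, first example."""
    summary = defaultdict(
        lambda: {"types": Counter(), "non_null": 0, "example": None}
    )
    for row in records:
        for field, value in row.items():
            info = summary[field]
            info["types"][type(value).__name__] += 1
            # Treat None and empty strings as missing values.
            if value is not None and value != "":
                info["non_null"] += 1
                if info["example"] is None:
                    info["example"] = value
    return dict(summary)


# Hypothetical sample rows; replace with rows read from the downloaded files.
sample_rows = [
    {"medium": "etching", "attribution": "Unknown",
     "inscriptions": None, "caption": "A harbor at dusk."},
    {"medium": "oil on canvas", "attribution": "",
     "inscriptions": "signed lower left", "caption": "Portrait of a woman."},
]

report = summarize_fields(sample_rows)
for field, info in report.items():
    print(field, dict(info["types"]), info["non_null"], repr(info["example"]))
```

Fields that are mostly empty or that mix several value types are the ones whose semantics most need manual inspection.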
Provenance
Source
OpenArt family of open, public-domain art datasets.
Collection Method
Likely aggregated from public-domain art sources; each work is paired with a structured VLM caption.
Freshness
Last updated 2026-05-28 09:29:55 (time zone not stated); confirm the current revision on Hugging Face before relying on this date.
License is unknown and must be verified before use.