WikiArt Captions Subset for Multimodal Art Retrieval

Name: WikiArt Captions Subset for Multimodal Art Retrieval
Creator: Lizagrin
Published: 2025-10-05T16:12:24
Keywords: Wikiart, Multimodal Retrieval, Image Captioning, Art, Synthetic, Multimodal

by Lizagrin, updated 9mo ago

Available on 1 platform

Description

A curated subset of 6,000 paintings from the WikiArt collection, created by Lizagrin and last updated in October 2025. It was developed for multimodal art retrieval, combining visual, textual, and semantic information. Each artwork record includes an image row index and a caption generated automatically with the BLIP model.

Use Cases

Training multimodal retrieval models based on the pairing of artwork images and generated captions.
Benchmarking image captioning models on the domain of fine art paintings.
Conducting art-historical analysis using machine-generated textual descriptions of visual content.
Developing educational tools that link visual art with descriptive text.
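
For the retrieval use case, a common way to score a model trained on image and caption pairs is Recall@K: the share of captions whose paired image appears among the K most similar images. The sketch below is a minimal illustration in plain Python, not part of the dataset. The function names and the tiny three-dimensional vectors are made up for the example; in practice the embeddings would come from whichever image and text encoders are being evaluated.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recall_at_k(caption_vecs, image_vecs, k):
    """Fraction of captions whose paired image (same list index) ranks in the top k."""
    if not caption_vecs:
        raise ValueError("caption_vecs must not be empty")
    hits = 0
    for i, cap in enumerate(caption_vecs):
        # Rank all image indices by similarity to this caption, best first.
        ranked = sorted(
            range(len(image_vecs)),
            key=lambda j: cosine(cap, image_vecs[j]),
            reverse=True,
        )
        if i in ranked[:k]:
            hits += 1
    return hits / len(caption_vecs)


# Toy embeddings (assumed values for illustration only).
image_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
caption_vecs = [[0.9, 0.1, 0.0], [0.2, 0.1, 0.9], [0.1, 0.2, 0.8]]

r1 = recall_at_k(caption_vecs, image_vecs, k=1)
r3 = recall_at_k(caption_vecs, image_vecs, k=3)
print(f"Recall@1={r1:.3f}  Recall@3={r3:.3f}")
```

Because the captions are machine-generated, a score computed this way reflects agreement with BLIP's descriptions rather than with human annotations.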

Strengths

Contains 6,000 curated paintings from the established WikiArt collection.
Provides multimodal data combining visual artwork with machine-generated textual captions.
Built for a specific research task: multimodal art retrieval.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Column-level documentation is absent; field semantics must be inferred after download.
Row count beyond the 6,000 subset is unknown, which may limit suitability assessment.
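
Since column-level documentation is absent, a first pass after download could simply tabulate what each field contains. The sketch below is one hedged way to do that with the standard library. It assumes the records have already been loaded as a list of dictionaries, and the field names `image_row_index` and `caption` as well as the sample captions are hypothetical placeholders, not the dataset's confirmed schema or contents.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Return {field: {"types": Counter, "missing": int, "example": value}}."""
    summary = defaultdict(
        lambda: {"types": Counter(), "missing": 0, "example": None}
    )
    # Collect every field name seen in any record, so absent keys count as missing.
    all_fields = set()
    for rec in records:
        all_fields.update(rec)
    for rec in records:
        for field in all_fields:
            value = rec.get(field)
            info = summary[field]
            if value is None or value == "":
                info["missing"] += 1
            else:
                info["types"][type(value).__name__] += 1
                if info["example"] is None:
                    info["example"] = value
    return dict(summary)


# Hypothetical records; real field names must be checked after download.
sample = [
    {"image_row_index": 0, "caption": "a painting of a woman in a blue dress"},
    {"image_row_index": 1, "caption": ""},
    {"image_row_index": 2, "caption": "a painting of a river at sunset"},
]

report = summarize_fields(sample)
for field, info in sorted(report.items()):
    print(field, dict(info["types"]), "missing:", info["missing"])
```

Running the same summary over the full download would also show whether the row count matches the stated 6,000.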

Provenance

Source: WikiArt collection.
Collection Method: Curated subset with captions automatically generated using the BLIP model.
Freshness: Last updated 2025-10-08 19:41:39; confirm that this is still current before use.

License is unknown and should be verified before use.

Related Topics

Multimodal, Wikiart, Multimodal Retrieval, Image Captioning, Art, Synthetic

Quality Score

Overall: D38
Description: 42
Source: 39
Reputation: 37
Access: 26

Community

Downloads: 5
Likes: 1
Views: 0

Dataset Info

Author: Lizagrin
Created: Oct 5, 2025
Updated: Oct 8, 2025
Last synced: May 29, 2026
