Joy Captioning 20250408A: 100K-1M Image-Text Pairs for VLM Training

Name: Joy Captioning 20250408A: 100K-1M Image-Text Pairs for VLM Training
Creator: fancyfeast
Published: 2025-04-09T00:42:28
Keywords: library:polars, library:dask, language:en, task_categories:visual-question-answering, modality:text, size_categories:100K<n<1M, library:mlcroissant, library:datasets, parquet, region:us, vlm, captioning, vqa, license:mit, joycaption

Description

Joy Captioning 20250408A contains between 100,000 and 1,000,000 image-text pairs used for the initial training of the JoyCaption Beta One vision-language model. Created by fancyfeast and last updated in February 2026, the collection focuses on detailed image descriptions and visual question answering (VQA). The text is a mix of human-written and machine-generated content, and each record is labeled for provenance through the is_human column.

Use Cases

Training vision-language models for image captioning using the description pairs
Filtering for high-quality ground truth data using the is_human column
Developing visual question-answering systems using the VQA subset
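
The provenance-based use cases above can be sketched in a few lines. This is a minimal illustration, not the dataset author's tooling: the listing only names the is_human column, so the "text" field, the sample records, and the helper split_by_provenance are hypothetical, and is_human is assumed to be a boolean.

```python
from typing import Any, Iterable, Mapping


def split_by_provenance(
    rows: Iterable[Mapping[str, Any]],
) -> tuple[list[Mapping[str, Any]], list[Mapping[str, Any]]]:
    """Partition records into (human-written, machine-generated).

    Assumes each record carries a boolean-like "is_human" value.
    """
    human: list[Mapping[str, Any]] = []
    machine: list[Mapping[str, Any]] = []
    for row in rows:
        (human if row["is_human"] else machine).append(row)
    return human, machine


# Hypothetical records; only the is_human column is named by the listing.
sample = [
    {"text": "A red bicycle leaning on a brick wall.", "is_human": True},
    {"text": "An image of a bicycle.", "is_human": False},
    {"text": "Two dogs running on a beach at dusk.", "is_human": True},
]

human_rows, machine_rows = split_by_provenance(sample)

# Share of human-written text, useful for judging how synthetic a subset is.
human_share = len(human_rows) / len(sample)
```

Keeping both partitions, rather than discarding the machine-generated rows, leaves room to train on the full mix while evaluating only against human-written text.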

Strengths

Scale of 100,000 to 1,000,000 records
Explicit is_human flag for distinguishing synthetic from organic text
MIT licensed for permissive reuse

Limitations

Contains machine-written and automated text which may introduce synthetic data biases
Lack of detailed documentation regarding the primary image sources

Provenance

Source: fancyfeast via Hugging Face
Collection Method: Mixed collection including human-written, automated, and machine-generated text
Freshness: Last updated February 2026.
Geography: United States

This dataset is the training source for JoyCaption Beta One. It is distributed in Parquet format; to isolate human-written (non-synthetic) responses, filter on the is_human column.

Quality Score

Overall: D37
Description: 39
Source: 36
Reputation: 47
Access: 22

Community

110 downloads
7 likes
0 views

Dataset Info

Author: fancyfeast
Created: Apr 9, 2025
Updated: Feb 24, 2026
Last synced: Jun 13, 2026
