Pisc Tr: Multimodal Question Answering with Chain-of-Thought Data

Name: Pisc Tr: Multimodal Question Answering with Chain-of-Thought Data
Creator: berhaan
Published: 2024-12-09T07:36:10
Keywords: Multimodal QA, Image Text, Computer Vision, CoT, Multimodal, Visual Reasoning

by berhaan, updated 6mo ago

Available on 1 platform

Description

A multimodal dataset from the LLaVA-CoT project that appears to contain image-question-answer pairs structured for visual reasoning tasks. It includes a train.jsonl file whose conversation records link images to questions and answers, a format suited to training or evaluating vision-language models. It was authored by 'berhaan' and last updated on 2026-01-17.

Use Cases

Training vision-language models for visual question answering, using the image-question-answer structure described above.
Evaluating model performance on chain-of-thought reasoning tasks, which is the dataset's implied purpose.
Benchmarking multimodal AI systems on tasks that require counting or describing objects in images, as suggested by the example question.

Strengths

The dataset is associated with a published research paper (LLaVA-CoT on arXiv) and a GitHub repository, which gives it a research foundation.
The structure includes both image paths and conversational text, supporting multimodal analysis.

Limitations

Description metadata is limited; actual data quality requires manual inspection after download.
Row count is unknown, which may limit suitability assessment.
Column-level documentation is absent; field semantics must be inferred after download.
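
Because the row count and field names are undocumented, a first inspection pass after download can recover both. The following is a minimal sketch, assuming only that train.jsonl is a standard JSON Lines file (one JSON value per line); the function name summarize_jsonl is hypothetical and not part of the dataset or the LLaVA-CoT repository.

```python
import json
from collections import Counter


def summarize_jsonl(path):
    """Return (row_count, Counter of top-level keys) for a JSON Lines file."""
    rows = 0
    keys = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            record = json.loads(line)
            rows += 1
            # Only objects have named fields; other JSON values are just counted.
            if isinstance(record, dict):
                keys.update(record.keys())
    return rows, keys
```

Calling summarize_jsonl("train.jsonl") returns the number of records and how often each top-level field occurs, which shows whether every record carries the same fields before any training code depends on them.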

Provenance

Source: LLaVA-CoT GitHub Repository
Freshness: Last updated 2026-01-17 02:28:20; freshness should be verified.

The image data is distributed as a zip archive split into parts, which must be concatenated before use ('cat image.zip.part-* > image.zip'); this may complicate initial setup.
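
On systems without a `cat` command, the same concatenation can be done in Python. This is a minimal sketch, assuming the parts sit in one directory and that lexicographic order of the file names is the correct order (the same order a shell glob would produce); the function name join_parts and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def join_parts(directory, pattern="image.zip.part-*", output_name="image.zip"):
    """Concatenate split archive parts in sorted name order, like `cat`."""
    directory = Path(directory)
    parts = sorted(directory.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files matching {pattern} in {directory}")
    output = directory / output_name
    with open(output, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts are never fully held in memory.
                shutil.copyfileobj(src, out)
    return output
```

Calling join_parts(".") in the download directory writes image.zip next to the parts; the result can then be extracted with any zip tool.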

Related Topics

Multimodal, Multimodal QA, Image Text, Computer Vision, CoT, Visual Reasoning

Quality Score

Overall: D (38)
Description: 42
Source: 36
Reputation: 43
Access: 26

Community

25 downloads
4 likes
0 views

Dataset Info

Author: berhaan
Created: Dec 9, 2024
Updated: Jan 17, 2026
Last synced: May 21, 2026
