Four categories of block diagram images (BD-EnKo, CBD, FC_A, and FC_B) are referenced, though only the BD-EnKo subset is provided for summarization research. The dataset supports the study of local-global fusion for visual-textual integration, as presented at ACL 2024.
Use Cases
- Train image-to-text models for block diagram summarization using the BD-EnKo image and summary pairs
- Develop local-global fusion models to analyze the relationship between visual diagram nodes and textual summaries
- Benchmark the diagrammatic reasoning capabilities of multimodal large language models on the BD-EnKo subset
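As a rough illustration of the first use case, the sketch below pairs diagram images with their summaries and holds out part of the data for evaluation. It assumes the pairs have been exported to a JSON Lines manifest; the field names `image_path` and `summary`, the `DiagramExample` class, and both helper functions are hypothetical and do not reflect the actual BD-EnKo file layout.

```python
import json
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class DiagramExample:
    image_path: str  # hypothetical field name, not taken from BD-EnKo
    summary: str     # hypothetical field name, not taken from BD-EnKo


def parse_manifest(lines: Iterable[str]) -> List[DiagramExample]:
    """Parse JSON Lines records into image/summary pairs, skipping blank lines."""
    examples = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        examples.append(DiagramExample(record["image_path"], record["summary"]))
    return examples


def split_examples(
    examples: List[DiagramExample], eval_fraction: float = 0.1
) -> Tuple[List[DiagramExample], List[DiagramExample]]:
    """Deterministic split: the trailing eval_fraction of examples is held out."""
    n_eval = max(1, int(len(examples) * eval_fraction)) if examples else 0
    cut = len(examples) - n_eval
    return examples[:cut], examples[cut:]
```

The training portion could then feed an image-to-text model, with the held-out portion reserved for benchmarking generated summaries against the reference text.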
Strengths
- Includes the BD-EnKo dataset introduced in the ACL 2024 paper 'Unveiling the Power of Integration'
- Focuses on block diagram summarization through visual and textual data integration
- References four distinct source datasets: BD-EnKo, CBD, FC_A, and FC_B