Wafer VQA Dataset is a multimodal benchmark built on the MixedWM38 wafer-map collection. It provides annotations for wafer map understanding, defect reasoning, and visual question answering. The dataset is organized into two annotation styles: tuple_generation for sequence-level optimization and stepwise_reasoning for supervised fine-tuning.
Use Cases
- Benchmarking multimodal models on wafer map understanding using the annotated images and questions.
- Training models for defect reasoning in semiconductor manufacturing using the stepwise_reasoning dialogue annotations.
- Optimizing sequence-level responses to wafer map queries using the tuple_generation annotations.
Strengths
- Builds on the existing MixedWM38 wafer-map collection rather than on newly collected imagery.
- Provides two distinct annotation styles (tuple_generation and stepwise_reasoning) tailored for different training approaches.
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not reported, which makes it hard to assess suitability before download.
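Because the column semantics and row count have to be recovered by inspection, a first pass over the downloaded records can be sketched as below. This is a minimal illustration, not the dataset's own tooling: the helper name `summarize_records` and the fields `image_id`, `question`, and `answer` are hypothetical placeholders, since the real column names are undocumented.

```python
from collections import Counter, defaultdict
from typing import Any, Iterable, Mapping


def summarize_records(records: Iterable[Mapping[str, Any]]) -> dict:
    """Count rows and tally the Python types observed for each field."""
    row_count = 0
    field_types = defaultdict(Counter)
    for record in records:
        row_count += 1
        for name, value in record.items():
            # Track how often each value type appears under this field name.
            field_types[name][type(value).__name__] += 1
    return {
        "row_count": row_count,
        "fields": {name: dict(counts) for name, counts in field_types.items()},
    }


# Made-up rows for illustration only; replace with records loaded from the
# actual download once its format is known.
sample = [
    {"image_id": "w0001", "question": "Which defect patterns are present?", "answer": "Center, Scratch"},
    {"image_id": "w0002", "question": "Is the wafer defect-free?", "answer": None},
]

summary = summarize_records(sample)
print(summary["row_count"])
print(summary["fields"])
```

A field whose tally mixes types (for example `str` and `NoneType`) is a cue to check for missing annotations before training or benchmarking.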
Provenance
- Source: Hugging Face
- Freshness: last updated 2026-05-07 07:21:53; verify before use.