JMMMU-Pro is an image-based Japanese multi-discipline multimodal understanding benchmark. It extends the JMMMU benchmark by composing each question's image and text into a single image, so that answering requires integrated visual-textual understanding. The dataset was created by JMMMU and was last updated on Hugging Face on 2025-12-17.
Use Cases
- Benchmarking multimodal models on integrated visual-textual understanding, where the question text must be read from the image itself.
- Training models for Japanese-language text and image comprehension across multiple disciplines.
- Evaluating AI reasoning across diverse academic subjects, as covered by the benchmark's multi-discipline scope.
Strengths
- Explicitly designed as a benchmark for integrated visual-textual understanding.
- Extends a known benchmark lineage (MMMU to MMMU-Pro to JMMMU-Pro).
- The last-updated date is explicitly recorded (2025-12-17).
Limitations
- Description metadata is limited; actual data quality requires manual inspection after download.
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count, file formats, and license are unknown, which may limit suitability assessment.
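Because column-level documentation is absent, a quick pass over a few downloaded rows can help infer field semantics. The following is a minimal sketch, not an official loader: `summarize_columns` is a hypothetical helper, and the field names in the sample rows (`id`, `image`, `answer`) are placeholders rather than the dataset's actual schema.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable


def summarize_columns(rows: Iterable[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Report, per column, the value types observed and how many rows lack a value."""
    types = defaultdict(set)     # column -> set of Python type names seen
    missing = defaultdict(int)   # column -> count of explicit None values
    seen = defaultdict(int)      # column -> count of rows containing the key
    total = 0
    for row in rows:
        total += 1
        for key, value in row.items():
            seen[key] += 1
            if value is None:
                missing[key] += 1
            else:
                types[key].add(type(value).__name__)
    # A row that omits a key entirely also counts as missing for that column.
    return {
        key: {
            "types": sorted(types[key]),
            "missing": missing[key] + (total - seen[key]),
        }
        for key in seen
    }


# Placeholder rows standing in for downloaded records (field names are assumptions).
sample = [
    {"id": "q1", "image": b"\x89PNG", "answer": "A"},
    {"id": "q2", "image": b"\x89PNG", "answer": None},
    {"id": "q3", "image": b"\x89PNG"},
]
report = summarize_columns(sample)
for column, info in report.items():
    print(column, info)

# With the Hugging Face `datasets` library, rows could come from a loaded split.
# The repository id and split name below are assumptions and must be verified:
#   from datasets import load_dataset
#   ds = load_dataset("<repo-id>", split="<split>")
#   print(summarize_columns(ds.select(range(20))))
```

The same summary can be compared against whatever schema the dataset viewer shows, which helps confirm row structure before committing to an evaluation pipeline.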
Provenance
- Source: JMMMU
- Collection Method: Constructed as a benchmark via Vibe Benchmark Construction, extending the JMMMU dataset.
- Time Range: Not specified
- Freshness: Last updated 2025-12-17 02:34:14; freshness should be verified.
- Geography: Not specified