FunBench is a novel visual question answering benchmark designed to evaluate multimodal large language models' fundus reading skills. The dataset was created by AIMClab-RUC and last updated on May 14, 2025. Code and a description are available on a linked GitHub repository.
Use Cases
- Benchmarking the diagnostic reasoning capabilities of MLLMs based on fundus images and associated questions.
- Evaluating the visual grounding and medical knowledge of vision-language models based on the described VQA tasks.
- Training or fine-tuning models for automated analysis of retinal conditions based on the benchmark's fundus reading focus.
Strengths
- Specifically designed for a novel and clinically relevant task: evaluating MLLMs on fundus reading.
- Associated code and description are publicly available on GitHub, facilitating reproducibility.
Limitations
- Dataset scale and sample data are unknown, limiting suitability assessment.
- Column-level documentation is absent; field semantics must be inferred after download.
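
Because field semantics have to be inferred after download, a first pass over the records can help. The following is a minimal sketch, assuming the downloaded data can be loaded as a list of dictionaries; the helper name `summarize_fields` and the field names in the sample records are hypothetical and are not taken from FunBench itself.

```python
def summarize_fields(records):
    """Summarize each field seen across a collection of dict-like records.

    Returns a dict mapping field name to the observed value types,
    the number of records where the field is missing or None,
    and one example value.
    """
    records = list(records)

    # Collect field names in first-seen order across all records.
    field_names = []
    for record in records:
        for name in record:
            if name not in field_names:
                field_names.append(name)

    summary = {}
    for name in field_names:
        values = [record.get(name) for record in records]
        present = [value for value in values if value is not None]
        summary[name] = {
            "types": sorted({type(value).__name__ for value in present}),
            "missing": len(values) - len(present),
            "example": present[0] if present else None,
        }
    return summary


# Made-up records for illustration only; the real field names are unknown.
sample_records = [
    {"image_path": "img_001.jpg", "question": "Is the optic disc visible?", "answer": "A"},
    {"image_path": "img_002.jpg", "question": "Which lesion is present?", "answer": None},
]

for field, info in summarize_fields(sample_records).items():
    print(field, info)
```

A summary like this only shows what each field contains, not what it means; the GitHub description remains the place to confirm field semantics.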
Provenance
- Source: AIMClab-RUC
- Collection Method: Likely created as a benchmark for the MICCAI 2025 conference.
- Time Range: not specified
- Freshness: Last updated 2025-05-14 01:53:09; freshness should be verified.
- Geography: not specified