MMTutorBench: 770 Multimodal Math Tutoring Problems for AI Evaluation

Name: MMTutorBench: 770 Multimodal Math Tutoring Problems for AI Evaluation
Creator: Tangchiu
Published: 2026-05-22T13:47:44
Keywords: Math Tutoring, AI Evaluation, Benchmark, Multimodal Benchmark, Educational AI, Multimodal

Description

MMTutorBench is the first multimodal benchmark for AI math tutoring, containing 770 carefully curated problems paired with 1,414 images. The dataset provides structured reference answers and per-instance rubrics for evaluating large language models along three pedagogical axes: Insight, Operation Formulation, and Operation Execution. It was created by Tangchiu and last updated on May 22, 2026.

Use Cases

Benchmarking AI tutoring models on multimodal math problems and their accompanying images
Evaluating pedagogical reasoning with the Insight, Operation Formulation, and Operation Execution rubrics
Training or fine-tuning multimodal AI assistants using structured reference answers
Conducting research on LLM-as-judge evaluation methods for educational content
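
As a rough illustration of the rubric-based evaluation use case, the sketch below averages judge scores along the three axes named in the description. The dictionary layout, the 0 to 1 scoring scale, and the function name are assumptions made for illustration; the dataset's actual schema is not documented here.

```python
from collections import defaultdict
from statistics import mean

# The three pedagogical axes named in the dataset description.
AXES = ("Insight", "Operation Formulation", "Operation Execution")


def aggregate_axis_scores(judgements):
    """Average judge scores per axis across problems.

    `judgements` is a list of dicts mapping axis name -> score. The dict
    layout and the numeric scale are hypothetical, not the dataset's schema.
    """
    per_axis = defaultdict(list)
    for judgement in judgements:
        for axis in AXES:
            # Skip axes a judge did not score rather than counting them as 0.
            if axis in judgement:
                per_axis[axis].append(judgement[axis])
    return {axis: mean(scores) for axis, scores in per_axis.items()}


# Made-up judge outputs for two problems (not real benchmark results).
example = [
    {"Insight": 1.0, "Operation Formulation": 0.5, "Operation Execution": 0.0},
    {"Insight": 0.0, "Operation Formulation": 0.5},
]
summary = aggregate_axis_scores(example)
```

Reporting each axis separately, rather than collapsing to one number, keeps a weakness in one axis from being masked by strength in another.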

Strengths

Contains 770 carefully curated multimodal math tutoring problems
Includes 1,414 images paired with the problems
Provides structured reference answers and per-instance evaluation rubrics

Limitations

Column-level documentation is absent; field semantics must be inferred after download
Row count is not reported in the metadata; the description cites 770 problems, but this should be confirmed after download
Description metadata is limited; actual data quality requires manual inspection after download
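
Because field semantics have to be inferred after download, a first pass might simply tally which Python types appear under each field. The sketch below assumes the rows have already been loaded as a list of dictionaries; the sample field names (`question`, `images`, `rubric`) are hypothetical stand-ins, not the dataset's real columns.

```python
from collections import Counter, defaultdict


def summarize_fields(records):
    """Report, per field, how many records carry each Python type."""
    summary = defaultdict(Counter)
    for record in records:
        for field, value in record.items():
            summary[field][type(value).__name__] += 1
    return {field: dict(counts) for field, counts in summary.items()}


# Hypothetical rows standing in for downloaded data.
sample_rows = [
    {"question": "Solve for x.", "images": ["img_0.png"], "rubric": None},
    {"question": "Find the area.", "images": [], "rubric": "Identifies the formula."},
]
field_summary = summarize_fields(sample_rows)
```

Fields that show mixed types or frequent `NoneType` values are good candidates for the manual inspection noted above.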

Provenance

Source: Tangchiu via Hugging Face
Collection Method: Curated benchmark collection, likely for research purposes as described in the associated paper.
Freshness: Last updated 2026-05-22 15:07:21

License is unknown; users should verify terms before use.

Quality Score

Overall: C42
Description: 48
Source: 39
Reputation: 45
Access: 26

Community

Downloads: 859
Likes: 1
Views: 0

Dataset Info

Author: Tangchiu
Created: May 22, 2026
Updated: May 22, 2026
Last synced: Jun 8, 2026
