MathCoder-VL is a series of open-source large multimodal models tailored for general math problem-solving. This dataset accompanies that series. Judging by its name, it contains roughly 8.6 million multimodal examples pairing images with code, and it supports the development of models such as FigCodifier-8B. It was created by MathLLMs and last updated on October 11, 2025.
Use Cases
- Training image-to-code models on paired image and code examples
- Benchmarking multimodal mathematical reasoning capabilities
- Developing visual question answering systems for math problems
- Fine-tuning large multimodal models for specialized math tasks
Strengths
- Dataset name suggests a scale of 8.6 million examples
- Supports the development of specific models like FigCodifier-8B and MathCoder-VL-2B
- Last updated on October 11, 2025
Limitations
- Column-level documentation is absent; field semantics must be inferred after download
- Exact row count is not reported (the 8.6 million figure is inferred from the dataset name), which makes it harder to assess suitability before download
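Because neither the column semantics nor the row count is documented, a first pass over a downloaded sample can help. The sketch below is a minimal, assumption-laden example: `summarize_fields` is a hypothetical helper, and the `image` and `code` field names in the sample records are placeholders rather than the dataset's actual columns.

```python
from collections import Counter, defaultdict
from itertools import islice


def summarize_fields(records, limit=1000):
    """Count the Python types seen per field in the first `limit` records.

    `records` is any iterable of dicts. Returns the number of records
    inspected and a mapping of field name -> {type name: occurrences}.
    """
    seen = defaultdict(Counter)
    inspected = 0
    for record in islice(records, limit):
        inspected += 1
        for key, value in record.items():
            seen[key][type(value).__name__] += 1
    return inspected, {key: dict(counts) for key, counts in seen.items()}


# Hypothetical records: the real column names are NOT documented,
# so "image" and "code" are assumptions for illustration only.
sample = [
    {"image": b"\x89PNG", "code": "\\draw (0,0) -- (1,1);"},
    {"image": b"\x89PNG", "code": None},
]

count, fields = summarize_fields(sample)
print(count, fields)
```

Passing an iterator over the real records in place of `sample` shows which fields are present and whether any contain missing values. Raising `limit` trades runtime for coverage.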
Provenance
- Source: MathLLMs
- Freshness: Last updated 2025-10-11 06:03:09