A formatted version of the TextVQA benchmark dataset for evaluating large multi-modality models. It was created by lmms-lab and last updated on March 8, 2024. The dataset is used by the lmms-eval pipeline, which provides one-click model evaluation.
Use Cases
- Benchmarking model performance on visual question answering tasks within the lmms-eval suite.
- Evaluating a model's ability to read text within images.
- Speeding up the development of multi-modality models through the one-click lmms-eval pipeline.
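For the benchmarking use case, the following is a minimal sketch of how predictions might be scored against several human reference answers. It uses a simplified form of the accuracy rule commonly applied to VQA-style benchmarks, where an answer earns full credit once at least three references match it. Whether lmms-eval applies exactly this rule and this normalization to this dataset is an assumption, and the function names `normalize`, `vqa_accuracy`, and `mean_accuracy` are hypothetical.

```python
from typing import Iterable, List, Sequence


def normalize(answer: str) -> str:
    """Lowercase the answer and collapse whitespace (an assumed, minimal normalization)."""
    return " ".join(answer.lower().strip().split())


def vqa_accuracy(prediction: str, reference_answers: Iterable[str]) -> float:
    """Simplified VQA-style accuracy: min(1, matching references / 3)."""
    pred = normalize(prediction)
    matches = sum(1 for ref in reference_answers if normalize(ref) == pred)
    return min(1.0, matches / 3.0)


def mean_accuracy(predictions: Sequence[str], references: Sequence[List[str]]) -> float:
    """Average the per-question scores; returns 0.0 for an empty input."""
    scores = [vqa_accuracy(p, refs) for p, refs in zip(predictions, references)]
    return sum(scores) / len(scores) if scores else 0.0
```

A prediction that matches only one reference receives partial credit (one third), so the score rewards agreement with the majority of annotators without requiring a unanimous answer.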
Strengths
- Dataset is part of a standardized evaluation suite (lmms-eval) for large multi-modality models.
- Last updated on March 8, 2024, which suggests the formatted version was maintained as of that date.
Limitations
- Column-level documentation is absent; field semantics must be inferred after download.
- Row count is not listed, which makes it harder to judge suitability before download.
- Description metadata is limited; actual data quality requires manual inspection after download.
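Because column documentation and row count are not provided, a short inspection pass after download can fill the gap. The sketch below tallies rows and the Python types seen in each column for any iterable of dict records. The helper `summarize_remote` loads a split with the Hugging Face `datasets` library; its `repo_id` and `split` arguments are placeholders to be taken from the dataset page, not values confirmed here, and both function names are hypothetical.

```python
from collections import Counter
from typing import Any, Dict, Iterable, Tuple


def summarize_rows(rows: Iterable[Dict[str, Any]]) -> Tuple[int, Dict[str, Counter]]:
    """Count rows and tally the Python type names observed in each column."""
    row_count = 0
    column_types: Dict[str, Counter] = {}
    for row in rows:
        row_count += 1
        for name, value in row.items():
            column_types.setdefault(name, Counter())[type(value).__name__] += 1
    return row_count, column_types


def summarize_remote(repo_id: str, split: str, sample_size: int = 5):
    """Load one split and summarize a small sample (needs `datasets` and network access)."""
    from datasets import load_dataset  # imported lazily so summarize_rows works without it

    ds = load_dataset(repo_id, split=split)
    print(ds.features)  # declared column names and feature types
    print(ds.num_rows)  # total rows in this split
    sample = ds.select(range(min(sample_size, ds.num_rows)))
    return summarize_rows(sample)
```

Calling `summarize_remote` with the repository id and split name shown on the dataset page prints the declared schema and row count, then returns the observed types for a handful of rows, which is usually enough to infer what each field holds.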
Provenance
- Source: lmms-lab
- Collection Method: Formatted version of the original TextVQA dataset.
- Freshness: Last updated 2024-03-08 05:07:57