Description
TheoremQA is a dataset of 800 question-answer pairs created by human experts at TIGER-Lab. It covers over 350 theorems across mathematics, electrical engineering & computer science, physics, and finance. The dataset was uploaded to Hugging Face on May 15, 2024, and is intended as a benchmark for testing large language models on university-level problem-solving.
Use Cases
Benchmarking LLM performance on applying STEM theorems, using the 800 QA pairs as a test set.
Training or fine-tuning models for advanced mathematical reasoning on university-level questions.
Analyzing model failure modes in physics and finance problem-solving, grouped by the theorem each question exercises.
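For the benchmarking use case, scoring usually comes down to comparing each model answer against the reference answer. The following is a minimal sketch, not the dataset authors' official evaluation: the helper names `is_correct` and `accuracy` and the 1% relative tolerance are assumptions chosen for illustration.

```python
import math


def is_correct(predicted, expected, rel_tol=1e-2):
    """Return True if a model answer matches the reference answer.

    Assumption: answers that parse as numbers are compared within a
    relative tolerance; everything else is compared as a normalized string.
    """
    try:
        return math.isclose(float(predicted), float(expected), rel_tol=rel_tol)
    except (TypeError, ValueError):
        # Non-numeric answers (e.g. "True", option labels) fall back to text.
        return str(predicted).strip().lower() == str(expected).strip().lower()


def accuracy(predictions, references):
    """Fraction of predictions judged correct against the references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(is_correct(p, r) for p, r in zip(predictions, references))
    return correct / len(references)
```

The tolerance matters for numeric answers, where a model may round differently from the reference; tighten or loosen `rel_tol` to suit the evaluation.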
Strengths
800 QA pairs annotated by human experts, suggesting high quality.
Covers over 350 theorems across four major STEM domains.
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
Row count is not recorded in the metadata; the description states 800 QA pairs, but this should be confirmed after download.
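Both limitations can be checked quickly once the data is downloaded. The sketch below assumes the records have already been loaded as a list of dictionaries (for example from a JSON file); the function name `summarize_columns` and the field names in the sample rows are hypothetical and are not the dataset's documented schema.

```python
def summarize_columns(rows):
    """Summarize each column's observed value types and missing-value count."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            info = columns.setdefault(key, {"types": set(), "non_null": 0})
            if value is not None:
                info["types"].add(type(value).__name__)
                info["non_null"] += 1
    total = len(rows)
    return {
        key: {"types": sorted(info["types"]), "missing": total - info["non_null"]}
        for key, info in columns.items()
    }


# Hypothetical sample rows; the real field names must be read from the data.
sample_rows = [
    {"Question": "What is 1 + 1?", "Answer": "2", "theorem": "arithmetic"},
    {"Question": "Evaluate 7 / 2.", "Answer": 3.5, "theorem": None},
]

summary = summarize_columns(sample_rows)
print("rows:", len(sample_rows))
for name, info in summary.items():
    print(name, info)
```

A column that reports more than one type, such as a mix of strings and floats, is a hint that answers need normalizing before they are compared.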
Provenance
Source
TIGER-Lab
Collection Method
Collected and annotated by human experts.
Time Range
Not specified
Freshness
Last updated 2024-05-15 13:41:05; freshness should be verified.
Geography
Not specified
License is unknown; restrictions should be verified before use.