Four large language models were evaluated on their ability to generate personalized exercise prescriptions using the FITT-VP framework. Claude 3.7 achieved the highest total score of 50.23 out of 60, while DeepSeek R1 scored the lowest at 40.30. The dataset, authored by Huan Feng and last updated in May 2026, contains the study results and analysis in a 374.6 KB document.
The dataset is licensed under CC-BY-4.0. The data is provided as a study document (DOCX), not as a structured data table.