Obesity Risk Prediction for Chinese Children and Adolescents, 2017-2020
by Zekai Chen · Updated 1mo ago
1.2 MB · 1 file
Available on 1 platform
Description
Chinese children and adolescents from 31 administrative regions were surveyed on physical activity, sedentary behavior, and sociodemographic factors. The dataset contains 38 candidate predictors used to develop and validate a machine learning model for obesity risk: 35,016 participants surveyed in 2017-2018 form the development set, and 3,495 surveyed in 2020 form the validation set. Author Zekai Chen published the supporting documentation under a CC-BY-4.0 license.
Use Cases
Developing obesity risk calculators based on parental BMI and lifestyle factors.
Analyzing the impact of moderate-to-vigorous physical activity (MVPA) on childhood obesity.
Evaluating the predictive importance of weekday screen time (mobile devices, TV).
Comparing the performance of machine learning algorithms like Random Forest for health risk prediction.
Creating large-scale screening tools for healthcare agencies and schools.
Strengths
Data is based on a nationally representative sample of 35,016 participants.
Model was validated temporally with a separate dataset of 3,495 participants.
The study identifies and ranks the top 5 predictive features, including parental BMI and MVPA.
A web-based risk calculator derived from the model has been deployed.
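The temporal validation described above (develop on the 2017-2018 wave, validate on the 2020 wave) can be sketched as a split by survey year. This is a minimal illustration, not the study's actual pipeline: the `survey_year` field name and the `temporal_split` helper are hypothetical, since the dataset's column names are not documented.

```python
def temporal_split(records, dev_years=(2017, 2018), val_years=(2020,),
                   year_key="survey_year"):
    """Split records into development and validation sets by survey year.

    `year_key` is a hypothetical field name; the real column name must be
    confirmed after inspecting the downloaded file.
    """
    dev_years, val_years = set(dev_years), set(val_years)
    # A temporal split is only meaningful if the two periods are disjoint.
    if dev_years & val_years:
        raise ValueError("development and validation years must not overlap")

    dev = [r for r in records if r[year_key] in dev_years]
    val = [r for r in records if r[year_key] in val_years]
    return dev, val
```

Records from years outside both periods are dropped rather than assigned to either set, so the validation set contains only data collected after the development period.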
Limitations
Column-level documentation is absent; field semantics must be inferred after download.
The row count of the underlying data tables is not documented, which makes it harder to assess suitability before download.
The dataset file is a 1.2 MB DOCX document; the actual structured data may require extraction.
Provenance
Source
Physical Activity and Fitness in China—The Youth Study (PAFCTYS)
Collection Method
Cross-sectional survey data collection.
Time Range
2017-2018 for development, 2020 for validation.
Freshness
Last updated 2026-05-05 04:12:51; freshness should be verified.
Geography
31 administrative regions across China.
The primary file format is DOCX; users may need to extract tabular data from the document.
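Because the file is a DOCX document, any tables it contains have to be pulled out before analysis. A DOCX file is a ZIP archive whose main body is stored as WordprocessingML in `word/document.xml`, so the tables can be read with the Python standard library alone. The sketch below assumes the data sits in ordinary Word tables; the helper name `extract_docx_tables` is illustrative, and whether this particular file holds tabular data at all is not confirmed.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a DOCX archive.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_tables(source):
    """Return every table in a DOCX file as a list of rows of cell strings.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for tbl in root.iter(W + "tbl"):
        rows = []
        for tr in tbl.findall(W + "tr"):
            cells = []
            for tc in tr.findall(W + "tc"):
                # A cell can hold several paragraphs and runs; join their text.
                text = "".join(t.text or "" for t in tc.iter(W + "t"))
                cells.append(text.strip())
            rows.append(cells)
        tables.append(rows)
    return tables
```

Each returned table is a list of rows, and the first row is often a header that can serve as provisional column names. Merged cells and nested tables are not handled specially in this sketch, so the output should be checked against the document by eye.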