Name: Obesity Risk Prediction for Chinese Children and Adolescents, 2017-2020
Creator: Zekai Chen
Published: 2026-05-05T04:12:51
License: CC-BY-4.0
Keywords: Machine Learning, Health Risk Prediction, Healthcare, Tabular, Childhood Obesity, Large Scale, Time Series, China, Physical Activity

Description

The dataset covers Chinese children and adolescents from 31 administrative regions, surveyed on physical activity, sedentary behavior, and sociodemographic factors. It contains 38 candidate predictors used to develop and validate a machine learning model for obesity risk: 35,016 participants from 2017-2018 form the development set, and 3,495 participants from 2020 form the validation set. Zekai Chen published the supporting documentation under a CC-BY-4.0 license.

Use Cases

Developing obesity risk calculators based on parental BMI and lifestyle factors.
Analyzing the impact of moderate-to-vigorous physical activity (MVPA) on childhood obesity.
Evaluating the predictive importance of screen time (mobile devices, TV) on weekdays.
Comparing the performance of machine learning algorithms like Random Forest for health risk prediction.
Creating large-scale screening tools for healthcare agencies and schools.

Strengths

The development data come from a nationally representative sample of 35,016 participants surveyed in 2017-2018.
The model was validated temporally on a separate 2020 dataset of 3,495 participants.
The study identifies and ranks the top 5 predictive features, including parental BMI and MVPA.
A web-based risk calculator derived from the model has been deployed.
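
Temporal validation, as described above, means fitting on the earlier survey wave and scoring on the later one. The sketch below illustrates that split and a rank-based AUC in plain Python. It is not the authors' pipeline: the field names `year`, `obese`, and `risk_score` are hypothetical, since the dataset ships without column-level documentation.

```python
def temporal_split(records, validation_year=2020):
    """Split records into an earlier development set and a later validation set.

    `records` is a list of dicts; the "year" key is a hypothetical field name.
    """
    development = [r for r in records if r["year"] < validation_year]
    validation = [r for r in records if r["year"] >= validation_year]
    return development, validation


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. The pairwise loop is quadratic, which is
    acceptable for a validation set of a few thousand rows.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Toy rows with hypothetical fields; real values must come from the extracted data.
rows = [
    {"year": 2017, "obese": 0, "risk_score": 0.2},
    {"year": 2018, "obese": 1, "risk_score": 0.7},
    {"year": 2020, "obese": 1, "risk_score": 0.9},
    {"year": 2020, "obese": 0, "risk_score": 0.1},
]
development, validation = temporal_split(rows)
validation_auc = roc_auc(
    [r["obese"] for r in validation],
    [r["risk_score"] for r in validation],
)
```

Splitting by collection year rather than at random keeps every 2020 row out of model fitting, so the validation score reflects performance on a later cohort.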

Limitations

Column-level documentation is absent; field semantics must be inferred after download.
The row count of the underlying data tables is not documented, which makes suitability harder to assess before download.
The dataset file is a 1.2 MB DOCX document; the actual structured data may require extraction.

Provenance

Source: Physical Activity and Fitness in China—The Youth Study (PAFCTYS)
Collection Method: Cross-sectional survey data collection.
Time Range: 2017-2018 for development, 2020 for validation.
Freshness: Last updated 2026-05-05 04:12:51; confirm that the data is still current before use.
Geography: 31 administrative regions across China.

The primary file format is DOCX; users may need to extract tabular data from the document.
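
One way to pull tables out of a DOCX file is to read its XML directly, since a DOCX is a ZIP archive whose main content sits in `word/document.xml`. The following is a minimal sketch using only the Python standard library. It assumes the data is stored as ordinary Word tables, which the listing does not confirm, and the function name `extract_docx_tables` is illustrative.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_tables(path_or_file):
    """Return each table as a list of rows, each row a list of cell strings.

    Accepts a filesystem path or a binary file-like object.
    Nested tables are also returned as separate entries, and their text
    appears inside the enclosing cell as well.
    """
    with zipfile.ZipFile(path_or_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    tables = []
    for table in root.iter(W + "tbl"):
        rows = []
        # findall() looks at direct children only, so rows of nested
        # tables are not mixed into the outer table.
        for row in table.findall(W + "tr"):
            cells = []
            for cell in row.findall(W + "tc"):
                # A cell's text is spread across one or more w:t runs.
                text = "".join(t.text or "" for t in cell.iter(W + "t"))
                cells.append(text.strip())
            rows.append(cells)
        tables.append(rows)
    return tables
```

If the document holds no Word tables, the function returns an empty list, which is a quick signal that the structured data is stored some other way (for example as embedded objects or plain paragraphs).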
