Statistical Models and Machine Learning in Predicting Childhood Obesity and Related Metabolic Disorders
1. Introduction
Childhood obesity has emerged as one of the most pressing global public health challenges of the 21st century. Over the past four decades, the prevalence of overweight and obesity among children and adolescents has risen substantially worldwide. A landmark pooled analysis of 2416 population-based studies involving 128.9 million participants reported that the global age-standardised prevalence of obesity increased from 0.7% to 5.6% in girls and from 0.9% to 7.8% in boys between 1975 and 2016 [1]. More recent evidence confirms that this upward trend has continued, with an overall obesity prevalence of 8.5% (95% CI: 8.2 - 8.8) in children and adolescents up to 2023, representing a 1.5-fold increase compared with 2000-2011 [2]. A 2025 commentary in The Lancet further underscores that childhood obesity is now a major public health crisis both nationally and internationally, driven by an imbalance between caloric intake and energy expenditure, and compounded by genetic, behavioural and environmental factors [3].
The public health significance of early prediction cannot be overstated. Childhood obesity is strongly associated with a range of adverse health consequences that track into adulthood, including type 2 diabetes, dyslipidaemia, hypertension, and non-alcoholic fatty liver disease. According to the Global Burden of Disease Study 2021, high body-mass index (BMI) is one of the leading metabolic risk factors globally, with age-standardised disability-adjusted life-year (DALY) rates attributable to high BMI increasing by 15.7% between 2000 and 2021 [4]. These metabolic disorders not only reduce quality of life in childhood but also impose a substantial long-term burden on healthcare systems. Identifying at-risk children before obesity becomes established enables timely, targeted interventions that are more effective and less costly than treating established disease.
Statistical models and machine learning (ML) have demonstrated considerable value in improving early risk prediction. Traditional statistical approaches, such as logistic regression, longitudinal growth curve models, and structural equation modelling, provide interpretable frameworks for quantifying associations and testing causal pathways. Meanwhile, ML algorithms, including random forests, XGBoost, and neural networks, excel at capturing non-linear interactions and handling high-dimensional data from electronic health records, behavioural questionnaires, and multi-omics sources. These techniques facilitate the integration of diverse risk factors and enable accurate individual-level risk stratification. This review systematically summarises the application of statistical models and ML in predicting childhood obesity and related metabolic disorders, discusses current limitations and future directions, and aims to provide a theoretical reference for early warning, precision prevention, and health management.
2. Risk Factors for Childhood Obesity
Childhood obesity arises from a complex interplay of demographic, behavioral, and psychosocial factors. A comprehensive understanding of these risk domains is essential for developing accurate prediction models and targeted interventions.
2.1. Demographic and Socioeconomic Factors
Large-scale meta-analyses have consistently identified sociodemographic characteristics as fundamental determinants. A meta-analysis of over 45 million children from 154 countries reported higher obesity prevalence in high-income countries and those with Human Development Index scores ≥ 0.8 [2]. Parental education, household income, and neighborhood deprivation are strongly associated with childhood weight status [5]. Specifically, lower socioeconomic status operates through multiple pathways, including reduced access to healthy foods and safe recreational spaces [5].
2.2. Dietary Behaviors and Physical Activity
Unhealthy dietary patterns and insufficient physical activity remain the most direct behavioral drivers. A systematic review of 126 prediction models identified multiple lifestyle-related predictors, noting that sleep duration, sleep quality, and eating speed are significant modifiable factors [6]. Higher birth weight, rapid infant weight gain, and absence of breastfeeding were among the seven strongest early-life risk factors [5]. Furthermore, children with obesity have significantly higher risks of comorbidities such as hypertension and depression [2], indirectly reinforcing the behavioral-metabolic link.
2.3. Screen Time and Unhealthy Lifestyle
Sedentary behaviors, particularly excessive screen time, have emerged as independent risk factors. Controllable lifestyle factors―including sedentary behavior―are consistently incorporated into high-performing prediction models [6]. Prolonged screen exposure not only displaces physical activity but also increases exposure to food marketing and disrupts sleep patterns [5].
2.4. Psychological and Family Environmental Factors
Family environment and parental behaviors play crucial moderating roles. Russell et al. systematically reviewed studies on disadvantaged populations and found that parenting styles, feeding practices, and family routines are closely intertwined with child eating behaviors and weight trajectories [7]. However, they noted substantial clustering of risk factors by socioeconomic status and ethnicity, making it difficult to isolate independent effects [7]. Parental obesity, maternal prepregnancy BMI, and gestational weight gain are among the strongest predictors [5]. Table 1 summarizes key risk factors and their evidence levels based on the reviewed literature.
The majority of identified risk factors are modifiable and lifestyle-related [6], which underscores their value for both prediction modeling and preventive interventions. However, the clustering of socioeconomic and behavioral risks necessitates multi-domain assessment strategies [7].
Table 1. The key risk factors and evidence levels based on the reviewed literature.
3. Commonly Used Statistical Models
Statistical models play a fundamental role in understanding the multifactorial etiology of childhood obesity and quantifying the contributions of diverse risk factors. Traditional regression approaches remain widely used for their interpretability and robust inferential properties, while longitudinal methods capture dynamic growth trajectories, and structural equation modeling (SEM) enables examination of complex causal pathways involving latent constructs.
3.1. Traditional Regression Models
Logistic regression is the predominant method for binary outcomes, such as obesity (yes/no) or exceeding the 95th BMI percentile. Its primary advantage lies in producing odds ratios that are readily interpretable for clinical and policy audiences. A validation study across four demographically diverse U.S. cohorts applied multivariable logistic regression to predict childhood obesity at ages 4 - 6 years using five clinical variables (maternal age, maternal prepregnancy BMI, birth weight z-score, weight-for-age z-score change, and breastfeeding duration) [8]. The models achieved excellent discrimination, with area under the receiver operating characteristic curve (AUC) values ranging from 0.79 to 0.86 across cohorts, and negative predictive values ≥ 80% [8]. This demonstrates that logistic regression, even with a parsimonious set of predictors, can reliably identify high-risk children in routine clinical settings.
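To make the interpretability argument concrete, the following minimal sketch shows how the coefficients of a fitted logistic model translate into odds ratios and an individual predicted risk, and how a negative predictive value is obtained from predicted and observed classes. The intercept, coefficient values, example child, and function names are hypothetical and chosen only for illustration; they are not taken from the validation study [8].

```python
import math

# Hypothetical log-odds coefficients; NOT the values estimated in reference [8].
INTERCEPT = -4.0
COEFS = {
    "maternal_bmi": 0.08,    # per kg/m^2 of prepregnancy BMI
    "birth_weight_z": 0.30,  # per unit of birth weight z-score
    "waz_change": 0.90,      # per unit change in weight-for-age z-score
}


def odds_ratios(coefs):
    """Exponentiate each log-odds coefficient to obtain its odds ratio."""
    return {name: math.exp(beta) for name, beta in coefs.items()}


def predicted_risk(intercept, coefs, child):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_k b_k * x_k)))."""
    linear = intercept + sum(coefs[k] * child[k] for k in coefs)
    return 1.0 / (1.0 + math.exp(-linear))


def negative_predictive_value(y_true, y_pred):
    """NPV = true negatives / all children predicted negative."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tn / (tn + fn) if (tn + fn) else float("nan")


# Invented example child and invented observed/predicted classes.
child = {"maternal_bmi": 30.0, "birth_weight_z": 1.0, "waz_change": 1.5}
ors = odds_ratios(COEFS)
risk = predicted_risk(INTERCEPT, COEFS, child)
npv = negative_predictive_value([0, 0, 1, 0, 1, 0], [0, 0, 0, 0, 1, 1])
```

Each odds ratio is simply the exponential of its coefficient, which is why logistic models are straightforward to present to clinical and policy audiences.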
Multiple linear regression is commonly used when the outcome is a continuous measure, such as BMI z-score or percent body fat. Schreuder et al. employed linear regression to model excessive gain in BMI z-score between ages 2 and 5 - 7 years, comparing different growth measures (weight, weight-for-length, and BMI) measured at multiple time points during infancy [9]. Their analysis revealed that models incorporating the BMI peak and prepeak velocity achieved notably higher accuracy (derivation AUC: 0.765 - 0.855) than those relying solely on change over time (AUC: 0.706 - 0.795). However, the authors caution that performance degraded substantially upon external validation (AUC dropping by an average of 0.126), underscoring the importance of external validation and the limitations of linear regression when predictor-outcome relationships are non-linear or confounded by unmeasured factors [9].
3.2. Longitudinal Data Analysis
Childhood obesity develops over time, and repeated measurements offer richer insight than cross-sectional data. Linear mixed models (LMMs) account for within-subject correlation, handle irregularly spaced measurements, and accommodate missing data under missing-at-random assumptions. Although none of the longitudinal studies reviewed here used LMMs explicitly, they underpin more specialized growth modeling approaches.
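For reference, a linear mixed model with a random intercept and a random slope can be written as below, where y_ij is the outcome (for example, BMI z-score) of child i at occasion j and t_ij is the child's age at that occasion. The notation is generic and assumed here for illustration; it is not drawn from any of the cited studies.

```latex
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij},
\qquad
(b_{0i}, b_{1i})^{\top} \sim \mathcal{N}(\mathbf{0}, \mathbf{G}),
\qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^{2})
```

The fixed effects (beta_0, beta_1) describe the average growth curve, while the child-specific random effects (b_0i, b_1i) induce the within-subject correlation and allow each child to have a different number and spacing of measurements.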
Growth curve models (GCMs), including latent class growth analysis (LCGA) and group-based trajectory modeling (GBTM), have become indispensable for identifying distinct BMI trajectory subgroups. Michael et al. used latent class growth mixture modeling to identify five BMI z-score trajectories from birth to age 6 years in a Singaporean mother-offspring cohort [10]. Two obesogenic trajectories were identified: an “early-acceleration” pattern characterized by elevated fetal abdominal growth and crossing of the obesity threshold by age 2 years, and a “late-acceleration” pattern approaching the obesity threshold by age 6 years. Both trajectories were associated with elevated cardiometabolic risk markers at age 6, including abdominal fat, liver fat, and insulin resistance [10]. Similarly, Zhou et al. applied GBTM to data from the Ma’anshan birth cohort (n = 2705) to examine physical growth trajectories before age 72 months, finding that children with persistently high BMI, waist circumference, or body fat trajectories had significantly higher risk of early adiposity rebound (ARR; relative risks ranging from 2.83 to 4.17) [11]. Notably, even infants with low BMI trajectories in the first two years who subsequently experienced rapid weight gain were also at elevated risk [11].
A key methodological insight from longitudinal studies is that trajectory-based prediction outperforms single time-point measurements. Huang et al. analyzed data from the Boston Birth Cohort (n = 3029) and identified four distinct BMI percentile trajectories from birth to age 18 years [12]. Using multinomial logistic regression, they demonstrated that BMI percentile trajectories during early childhood (birth to age 1 or 2 years) were superior to a single BMI measurement at age 1 or 2 years for predicting school-age overweight/obesity. Their imputation approach reduced missing data from 36.0% to 10.1%, highlighting a practical solution to a common challenge in longitudinal cohort studies [12]. Table 2 summarizes key methodological features and findings from the longitudinal studies reviewed.
3.3. Structural Equation Model (SEM)
SEM extends traditional regression by modeling relationships among observed and latent variables, estimating direct and indirect effects, and accounting for measurement error. This is particularly valuable for childhood obesity research, where constructs such as “family obesogenic environment” or “healthy lifestyle” cannot be directly observed but are indicated by multiple measured variables.
Table 2. Longitudinal trajectory studies in childhood obesity prediction.
Path analysis (SEM without latent variables) quantifies mediated pathways. Using cross-sectional data from 861 Argentine schoolchildren, Mendez et al. applied SEM to explore how socioeconomic status influences childhood obesity through health-related habits [13]. Their model showed acceptable fit (CFI = 0.979, RMSEA = 0.048) and revealed that healthy habits, particularly physical activity and maternal nutritional status, fully mediated the relationship between socioeconomic status and child obesity. Socioeconomic status positively influenced healthy habits, which in turn negatively influenced obesity factors (BMI, body fat, waist-to-height ratio) [13]. This mediated pathway would have been obscured in a standard regression analysis, although the cross-sectional design means it should be read as a hypothesized rather than an established causal chain.
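The logic of a mediated pathway can be illustrated with the product-of-coefficients decomposition for observed variables. The sketch below simulates data in which socioeconomic status affects an obesity index only through healthy habits, then recovers the indirect effect (a × b) and the direct effect (c') by ordinary least squares. The simulated effect sizes and variable names are assumptions made for illustration; this is a simplified observed-variable path analysis, not the latent-variable model fitted by Mendez et al. [13].

```python
import random


def cov(u, v):
    """Population covariance of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)


# Simulated (invented) data: SES -> habits -> obesity index, with no direct path.
rng = random.Random(0)
n = 500
ses = [rng.gauss(0, 1) for _ in range(n)]
habits = [0.5 * x + rng.gauss(0, 1) for x in ses]       # true a = 0.5
obesity = [-0.4 * m + rng.gauss(0, 1) for m in habits]  # true b = -0.4, c' = 0

s_xx, s_mm = cov(ses, ses), cov(habits, habits)
s_xm, s_xy, s_my = cov(ses, habits), cov(ses, obesity), cov(habits, obesity)

# Path a: regress the mediator on the exposure.
a = s_xm / s_xx
# Total effect c: regress the outcome on the exposure alone.
total = s_xy / s_xx
# Paths b and c': regress the outcome on mediator and exposure jointly
# (closed-form OLS solution for two predictors).
den = s_xx * s_mm - s_xm ** 2
b = (s_my * s_xx - s_xm * s_xy) / den
c_prime = (s_xy * s_mm - s_xm * s_my) / den

indirect = a * b  # effect of SES transmitted through habits
```

In linear models the total effect equals the direct effect plus the indirect effect; full mediation corresponds to a direct effect close to zero.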
Latent variable SEM enables testing of complex theoretical frameworks. Rahmaty et al. combined latent profile analysis (to derive feeding practice patterns) with SEM to examine associations with preschooler BMI z-score (BMIz) in 437 children [14]. Three feeding practice patterns were identified: Controlling, Balancing, and Regulating. The Regulating pattern (characterized by autonomy-promoting practices) was associated with significantly lower child BMIz (b = −0.09) compared to the Controlling pattern. A more difficult temperament, higher caregiver BMIz, and caregiver desire for a thinner child were also associated with higher BMIz (all p < 0.05) [14]. Villodres et al. used SEM to examine relationships among screen time, sleep time, physical fitness, Mediterranean diet adherence, eating behaviors, and BMI in 653 Spanish preschoolers [15]. Negative associations emerged between screen time and physical fitness (p < 0.005), screen time and Mediterranean diet adherence (p < 0.005), and Mediterranean diet adherence and BMI (p = 0.033), while pro-intake behaviors were positively associated with BMI (p < 0.005). Multi-group analysis revealed that these relationships differed by child sex and BMI category [15]. Matias et al. employed SEM to test moderation, finding that a longer duration of full breastfeeding attenuated the obesity risk associated with high gestational weight gain, a nuanced interaction effect that SEM is well suited to model [16].
Traditional regression models (logistic and linear) remain essential for their interpretability and ease of implementation in clinical risk scoring [8] [9]. Longitudinal methods―particularly growth curve and trajectory analyses―provide unparalleled insight into developmental patterns and enable early identification of children on obesogenic pathways [10]-[12]. SEM offers a powerful framework for testing mediational and moderational hypotheses involving latent constructs, thereby advancing causal understanding of childhood obesity [13]-[16]. The choice among these approaches should be guided by the research question, data structure, and the balance between predictive performance and interpretability.
4. Machine Learning Prediction Algorithms
Machine learning (ML) has emerged as a transformative approach for childhood obesity prediction, offering the ability to model complex, non-linear relationships and high-dimensional data without strong prior assumptions [17] [18]. Unlike traditional regression models that require prespecification of interaction and polynomial terms, ML algorithms automatically capture intricate patterns among demographic, behavioral, and physiological predictors. This section reviews key ML algorithms applied to pediatric obesity prediction, followed by a discussion of model evaluation metrics.
4.1. Decision Trees, Random Forest, and XGBoost
Decision trees partition the feature space into hierarchical if-then rules, producing interpretable flowcharts that map predictor values to weight status categories. However, single trees are prone to overfitting and instability. To address these limitations, ensemble methods combine multiple trees to improve predictive performance and robustness.
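A single split of a decision tree can be sketched as a search for the threshold that minimizes weighted Gini impurity. The toy data below (BMI z-score at age 2 and obesity status at age 6) are invented for illustration, and the function names are hypothetical; a full tree would apply the same search recursively to each resulting subgroup.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best rule 'value <= threshold'."""
    unique = sorted(set(values))
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    for lo, hi in zip(unique, unique[1:]):
        thr = (lo + hi) / 2.0  # candidate split halfway between neighbours
        left = [y for x, y in zip(values, labels) if x <= thr]
        right = [y for x, y in zip(values, labels) if x > thr]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (thr, score)
    return best


# Invented toy data: BMI z-score at age 2, obesity (1 = yes) at age 6.
bmi_z = [-0.5, 0.0, 0.3, 0.8, 1.2, 1.6, 2.0, 2.4]
obese = [0, 0, 0, 0, 1, 0, 1, 1]
threshold, impurity = best_threshold(bmi_z, obese)
```

On these data the selected rule is "BMI z-score ≤ 1.0", which isolates a pure low-risk group; the remaining impurity comes from the one non-obese child above the threshold.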
Random Forest (RF) builds a large collection of decorrelated decision trees through bootstrap aggregating (bagging) and random feature selection at each split. By averaging predictions across hundreds of trees, RF reduces variance and achieves superior generalization [19]. Liu et al. used RF as both a feature selection method and a predictive model in a population-based study of 3.86 million student-visits, demonstrating excellent stability in identifying key predictors of weight status up to five years in advance [19].
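The two sources of randomness described above, bootstrap resampling and random feature selection, can be sketched with one-split trees ("stumps") and a majority vote. This is a deliberately simplified illustration rather than a full RF implementation: each stump uses a single randomly chosen feature, the rule direction (higher value means higher risk) is assumed, and the data and function names are invented.

```python
import random


def fit_stump(rows, labels, feature):
    """Pick the threshold for the rule 'row[feature] > thr -> class 1'
    that makes the fewest errors on the given (bootstrap) sample."""
    values = sorted({r[feature] for r in rows})
    candidates = [(a + b) / 2.0 for a, b in zip(values, values[1:])] or [values[0]]

    def errors(thr):
        return sum((r[feature] > thr) != bool(y) for r, y in zip(rows, labels))

    return feature, min(candidates, key=errors)


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Bagging: each stump sees a bootstrap sample and one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)         # random feature selection
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feature))
    return forest


def predict(forest, row):
    """Majority vote across all stumps."""
    votes = sum(row[f] > thr for f, thr in forest)
    return int(votes * 2 > len(forest))


# Invented toy data: (BMI z-score, daily screen hours) and obesity status.
rows = [(-0.5, 1.0), (0.0, 1.5), (0.3, 1.0), (0.8, 2.0),
        (1.2, 3.0), (1.6, 2.5), (2.0, 4.0), (2.4, 3.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
forest = fit_forest(rows, labels)
high_risk = predict(forest, (3.0, 5.0))
low_risk = predict(forest, (-1.0, 0.5))
```

Because each stump is trained on a different resample and feature, individual thresholds vary, and averaging their votes is what reduces the variance of the ensemble.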
Extreme Gradient Boosting (XGBoost) represents a more advanced ensemble technique that builds trees sequentially, with each new tree correcting the errors of its predecessors. Through gradient-based optimization, regularization parameters, and handling of missing values, XGBoost has consistently outperformed other ML algorithms in pediatric obesity prediction [19] [20]. In a study involving 442,898 primary school students followed through secondary school, XGBoost achieved the highest long-term prediction accuracy (0.72 - 0.74) and macro-AUC (0.83 - 0.86) compared to decision trees, RF, k-nearest neighbors, and support vector machines [20]. The authors concluded that XGBoost enables accurate long-term weight status prediction using easily assessable variables such as weight, height, sex, age, and physical activity frequency.
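The sequential error-correcting idea can be illustrated with plain gradient boosting for squared-error loss, in which each new stump is fitted to the residuals of the current ensemble. This sketch is an assumption-laden simplification: it omits the second-order gradient information, regularization terms, and missing-value handling that distinguish XGBoost, and the data, learning rate, and function names are invented for illustration.

```python
def fit_stump(xs, residuals):
    """Least-squares regression stump: returns (threshold, left_mean, right_mean)."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]


def boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a shrunken copy."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps, mse_history = [], []
    for _ in range(n_rounds):
        # For squared error, the residual is the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        preds = [p + learning_rate * (lm if x <= thr else rm)
                 for x, p in zip(xs, preds)]
        mse_history.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, stumps, mse_history


# Invented toy data: age in years and BMI z-score.
ages = list(range(1, 11))
bmi_z = [0.1, 0.2, 0.2, 0.5, 0.9, 1.4, 1.6, 1.7, 1.7, 1.8]
base, stumps, mse_history = boost(ages, bmi_z)
```

The training error recorded in `mse_history` never increases from one round to the next, which reflects how each tree corrects part of what its predecessors missed; in practice the number of rounds and the learning rate are tuned on held-out data to avoid overfitting.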
4.2. Support Vector Machine (SVM)
Support Vector Machine (SVM) constructs an optimal hyperplane that maximizes the margin between different classes. Using kernel functions (e.g., radial basis function, polynomial), SVM can capture non-linear decision boundaries in high-dimensional feature spaces without explicitly transforming the data [17]. This property makes SVM particularly suitable for obesity risk classification when the relationship between predictors and outcomes is complex, but the sample size is moderate. However, SVM provides less interpretable results compared to tree-based methods and requires careful tuning of kernel parameters and regularization costs [18]. In comparative studies, SVM generally performs competitively but is often outperformed by gradient boosting methods in large-scale pediatric datasets [20].
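The role of the kernel can be sketched as follows: the radial basis function measures similarity between two children's feature vectors, and the decision value is a weighted sum of similarities to the support vectors. The support vectors, dual coefficients, bias, and gamma below are hypothetical placeholders rather than the output of a trained model.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, dual_coefs, bias, gamma=0.5):
    """f(x) = sum_i (alpha_i * y_i) * K(sv_i, x) + b; the class is the sign of f(x)."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias


# Hypothetical model: one support vector per class, in standardized feature space.
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
dual_coefs = [-1.0, 1.0]  # alpha_i * y_i for the negative and positive class
bias = 0.0

near_positive = decision_value((2.0, 2.0), support_vectors, dual_coefs, bias)
near_negative = decision_value((0.0, 0.0), support_vectors, dual_coefs, bias)
```

The kernel is evaluated directly on the original features, so the non-linear boundary is obtained without constructing the high-dimensional transformation explicitly.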
4.3. Neural Networks and Deep Learning
Neural networks (NNs) consist of interconnected layers of artificial neurons that learn hierarchical representations of input data. A standard multilayer perceptron (MLP) with one or more hidden layers can, given enough hidden units, approximate any continuous function on a bounded domain, making it highly flexible for obesity prediction [17] [18]. Forte et al. developed a neural network model to classify obesity risk in 654 Portuguese adolescents aged 10 - 19 years using physical fitness variables (aerobic fitness, upper limb strength, sprint time) along with age and sex [21]. Their NN achieved 75% accuracy and an AUC of 0.64 after K-fold cross-validation, demonstrating moderate predictive capability.
Deep learning (DL) extends neural networks with many hidden layers, enabling automatic feature extraction from raw or high-dimensional data such as electronic health records (EHRs), wearable sensor data, or medical imaging. Gupta et al. built a customized sequential deep learning model using EHR data from 36,191 children aged 0 - 10 years to predict obesity onset within the next three years [22]. Their model achieved AUROC > 0.8 for all age subgroups (most around 0.9) and demonstrated robustness through temporal, geographical, and subgroup validation. Notably, the model relied exclusively on routinely collected EHR variables (e.g., weight, height, BMI records, clinical encounters) without requiring specialized prenatal or lifestyle data, greatly facilitating clinical integration. The authors emphasized that deep learning can serve as an objective screening tool to enable early lifestyle counseling.
4.4. Model Evaluation Metrics
Rigorous evaluation is essential to ensure that ML models generalize beyond their training data. The most common metrics are summarized in Table 3.
AUC (area under the receiver operating characteristic curve) and accuracy are the most frequently reported metrics [18] [19]. In a large-scale study, Liu et al. reported micro-AUCs of 0.96, 0.93, and 0.92 for 1-, 3-, and 5-year predictions, respectively, using XGBoost [19]. However, a critical limitation identified by a recent systematic review and meta-analysis is that while accuracy and AUC are commonly reported [23], no included study assessed model calibration―the agreement between predicted probabilities and observed event rates. The pooled AUC for logistic regression (0.75) and ML (0.76) showed no statistically significant difference, challenging the assumption that ML universally outperforms traditional methods when sample sizes are modest [23].
Table 3. Common performance metrics for childhood obesity prediction models.
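The distinction between discrimination and calibration discussed above can be made concrete with a short sketch. AUC is computed as the probability that a randomly chosen child with obesity receives a higher predicted risk than a randomly chosen child without, while calibration is checked by comparing mean predicted risk with the observed event rate within risk bins. The outcomes and predicted probabilities are invented, and the two-bin grouping is an arbitrary choice for illustration.

```python
def auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(y_true, probs, n_bins=2):
    """Per non-empty bin: (mean predicted risk, observed event rate, count).
    Bins are equal-width intervals of predicted probability."""
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(y_true, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    return [
        (sum(p for p, _ in b) / len(b), sum(y for _, y in b) / len(b), len(b))
        for b in bins if b
    ]


# Invented outcomes (1 = obesity) and predicted risks.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

model_auc = auc(y_true, probs)
table = calibration_table(y_true, probs)
```

In this example the model is well calibrated within both bins (predicted and observed rates agree) yet discriminates only moderately, which illustrates why the two properties need to be reported separately.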
Validation strategies include internal validation (e.g., cross-validation, bootstrapping) and external validation on temporally or geographically distinct cohorts. Gupta et al. exemplified rigorous validation by testing their deep learning model across different time periods, geographic regions, and demographic subgroups [22]. The authors of the meta-analysis emphasized that many ML studies are at high risk of bias due to inadequate validation and recommend that future research prioritize calibration assessment and external validation in diverse populations [23].
Tree-based ensemble methods (especially XGBoost) currently demonstrate the strongest performance in large-scale pediatric obesity prediction, while deep learning offers unique advantages when rich EHR or sensor data are available. SVM remains a viable alternative for smaller datasets. Regardless of algorithm choice, rigorous evaluation, including discrimination, calibration, and external validation, is essential for clinical translation [23].
5. Application in Early Warning and Risk Stratification
Effective translation of predictive models into clinical and public health practice requires three interconnected steps: accurate identification of high-risk populations, design of precision prevention strategies, and integration into health management policies.
5.1. Identification of High-Risk Populations
Systematic reviews have confirmed that prediction models for childhood obesity achieve moderate to good discrimination, with a pooled C-index of 0.769 (95% CI: 0.754 - 0.785) for overweight and 0.835 (95% CI: 0.792 - 0.879) for obesity in training sets [6]. Importantly, most predictive factors are modifiable lifestyle behaviours (e.g., sleep duration, eating speed), making them actionable in routine screening. While machine learning (ML) methods are increasingly popular, a recent meta-analysis found no statistically significant difference in area under the curve (AUC) between ML (pooled AUC: 0.76) and logistic regression (pooled AUC: 0.75) for obesity risk prediction [23]. This suggests that simpler, more interpretable models may be equally effective for initial risk stratification when calibration and external validation are properly conducted.
5.2. Precision Prevention and Targeted Intervention
Once high-risk children are identified, interventions should be tailored to individual and contextual drivers. For example, longitudinal analyses of BMI trajectories among children with obesity show that those living in areas with a higher area deprivation index (ADI) have significantly greater odds of following an increasing BMI trajectory [24]. This finding supports targeting not only individual behaviours but also neighborhood-level social determinants―particularly in rural settings where concentrated disadvantage is more common. Predictive models can thus guide resource allocation, referring high-risk children from deprived areas to multi-component, family-centered interventions, while lower-risk children may benefit from universal primary prevention.
5.3. Health Management and Policy Implications
At the policy level, digital health strategies provide a framework for embedding risk prediction into routine child health surveillance. Comparative policy analyses highlight the World Health Organization’s Global Strategy on Digital Health 2020-2025 as a key blueprint, though many national plans still lack elements such as knowledge management and health equity promotion [25]. Integrating validated prediction models into electronic health records―coupled with clear protocols for referral and follow-up―could bridge the gap between risk identification and effective intervention. As summarised in Table 4, successful implementation requires alignment of predictive accuracy, actionable risk factors, and supportive digital health policies.
Early warning and risk stratification for childhood obesity are most effective when evidence-based prediction models are coupled with targeted interventions addressing both individual and ecological determinants, supported by coherent digital health policies.
Table 4. Key considerations for translating childhood obesity prediction models into practice.
6. Limitations and Challenges
Despite promising advances, the application of statistical models and machine learning (ML) in childhood obesity prediction faces several critical limitations that hinder clinical translation and generalizability.
6.1. Small Sample Size and Single-Center Data
A substantial proportion of prediction studies suffer from small, non-representative samples and single-center data collection. In a systematic review and meta-analysis comparing logistic regression and ML for obesity risk prediction, 75% of included studies were at high risk of bias―predominantly due to inadequate sample sizes, lack of external validation, and no calibration assessment [23]. Single-center datasets often capture population-specific demographic and environmental patterns, leading to models that fail to generalize across diverse ethnic, socioeconomic, and geographic contexts [17].
6.2. Lack of Long-Term Follow-Up Cohorts
Most prediction models focus on short-term outcomes (1 - 3 years), yet childhood obesity is a chronic condition whose metabolic consequences―such as type 2 diabetes and fatty liver disease―emerge over decades. While ML can identify key predictors for 1-, 3-, and 5-year weight status, the number of required features increased with prediction horizon (from 6 to 13 predictors) [19], underscoring the need for long-term longitudinal data. However, such cohorts remain scarce due to high costs, attrition, and extended follow-up periods. Consequently, few models have been externally validated for predicting adolescent or young adult metabolic outcomes using early childhood predictors.
6.3. Poor Interpretability of Complex Models
Deep learning and ensemble methods, despite superior predictive performance reported in some studies, operate as “black boxes” [26]. Clinicians and parents are unlikely to trust or act upon predictions without understanding the driving risk factors for an individual child. The lack of model transparency, especially in deep neural networks, remains a major barrier to clinical adoption [17]. While post-hoc explainability techniques (e.g., SHAP) are emerging, they are not yet standard practice in pediatric obesity research, and their reliability in high-stakes health decisions is debated.
7. Future Directions
Advancing the prediction of childhood obesity and related metabolic disorders requires parallel progress in four interconnected domains: multi-omics integration, large-scale longitudinal cohorts, personalized intelligent interventions, and federated multi-center collaboration. Each direction addresses specific limitations of current models and opens new avenues for precision prevention.
1) Multi-omics integration prediction: Integrating genomic, gut microbiome, and metabolomic data promises to uncover early biological pathways underlying obesity risk. Aparicio et al. constructed a quadripartite network linking SNPs in the FHIT gene (associated with obesity and type 2 diabetes) to microbial taxa, plasma metabolites, and BMI in children from 6 months to 8 years of age, identifying novel risk markers for insulin resistance [27]. Complementing this, Rafiq et al. employed DIABLO integrative analysis in a South Asian birth cohort, revealing that Akkermansia and GABA were negatively associated with early childhood overweight/obesity, while Lactobacillus and glutamic acid showed positive associations [28]. These multi-omics signatures enhance predictive accuracy beyond single-layer models.
2) Large-scale longitudinal cohort study: Robust prediction requires large, diverse, prospectively followed populations. Singh et al. leveraged the UK Millennium Cohort Study (over 10,000 children) to predict teenage obesity using earlier childhood measurements, achieving 77% sensitivity and specificity with easily obtainable features suitable for clinical and non-clinical settings [5]. Extending such cohort designs across multiple geographic and ethnic groups will improve generalizability.
3) Personalized intelligent intervention model: Beyond risk prediction, the next frontier is adaptive, individualized interventions. Explainable AI (XAI) methods are critical to make black-box models interpretable for clinicians and families. Khater et al. demonstrated a Random Forest model free of BMI parameters, achieving 86.5% accuracy for obesity prediction, using SHAP and partial dependence plots to reveal key lifestyle drivers such as meal frequency and technology usage [29]. Shen et al. applied SHAP values to an AdaBoost model (89.2% accuracy) to demystify decision-making, enabling causal insights into eating habits and physical condition [30]. These XAI techniques can power just-in-time adaptive interventions tailored to each child’s modifiable risk profile.
4) Federated learning and multi-center collaboration: Privacy concerns and population heterogeneity limit data sharing across institutions. FETA, a federated transfer learning framework, integrates heterogeneous data from multiple healthcare sites without sharing individual-level records [31]. Applied to eMERGE Network data for extreme obesity genetic risk prediction, FETA outperformed models trained on target-only or source-only data, reducing performance disparities in underrepresented populations [31]. This approach enables large-scale, privacy-preserving model development while accommodating population diversity.
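The basic principle behind federated model development, sharing model parameters rather than individual-level records, can be sketched with plain federated averaging. This is explicitly not the FETA transfer learning procedure [31], whose details are not described here; it is the simpler sample-size-weighted averaging of site-level coefficients, with invented site sizes and coefficient values and a hypothetical function name.

```python
def federated_average(site_updates):
    """Sample-size-weighted mean of per-site coefficient vectors.

    site_updates: list of (n_samples, coefficients) tuples. Only these
    summaries leave each site; individual-level records stay local.
    """
    total = sum(n for n, _ in site_updates)
    dim = len(site_updates[0][1])
    return [
        sum(n * coefs[k] for n, coefs in site_updates) / total
        for k in range(dim)
    ]


# Invented example: two sites report locally fitted coefficients.
site_updates = [
    (1000, [0.10, 0.80]),
    (3000, [0.20, 0.60]),
]
global_coefs = federated_average(site_updates)
```

In an iterative scheme the averaged coefficients would be sent back to the sites for further local training; weighting by sample size lets larger sites contribute more, which is one reason additional transfer learning steps are needed to protect performance in underrepresented populations.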
Combining multi-omics discovery with large-scale longitudinal validation, interpretable AI-driven personalization, and federated collaborative networks will transform childhood obesity prediction from population-level risk scores to equitable, precision-focused early warning systems.
8. Conclusion
Childhood obesity and its metabolic consequences pose a persistent global health challenge, but the growing toolkit of statistical models and machine learning algorithms offers unprecedented opportunities for early prediction and precision prevention. Traditional regression models, longitudinal methods, and SEM remain valuable for understanding etiological pathways and generating interpretable risk scores. Machine learning approaches, particularly random forests, XGBoost, and neural networks, can capture the non-linear and interactive effects inherent in obesity development and have achieved strong predictive accuracy in large datasets, although pooled evidence shows no statistically significant AUC advantage over logistic regression [23]. Future progress depends on overcoming current limitations: small, non-representative datasets; lack of long-term follow-up; model interpretability; and integration of multi-omics data. Collaborative, multi-center, and privacy-preserving frameworks such as federated learning will be essential to develop generalizable, clinically useful tools. Ultimately, the goal is not simply better prediction, but actionable risk stratification that leads to earlier, more effective, and personalized interventions, reversing the trajectory of childhood obesity before metabolic disorders take root.