Obesity is one of the major health challenges of our century. It affects adults, adolescents and even very young children, and is strongly linked to diseases such as type 2 diabetes, cardiovascular disease and some cancers. The latest estimates from the World Health Organization show that hundreds of millions of people worldwide are already living with overweight or obesity, and the numbers are still rising. This makes early identification of people at higher risk a priority: if we understand who is most vulnerable and why, we can intervene earlier with targeted prevention, lifestyle changes and medical support.
In this context, Shaban and colleagues (Shaban et al. Sci Rep 2025) recently developed ObeRisk, a new AI framework designed to predict an individual’s susceptibility to obesity from simple information on personal characteristics and lifestyle. The core idea was to apply machine learning to a large dataset of more than twenty thousand individuals, each described by demographic variables such as age and gender, physical measurements such as height and weight, and behavioural factors such as eating habits, physical activity, snacking, alcohol consumption and mode of transportation. By analysing how these factors combined and related to different levels of weight status, the system estimated the probability that a person belonged to categories such as insufficient weight, normal weight, overweight or different types of obesity.
To achieve this, the authors first cleaned and prepared the data. Missing values were filled in, extreme values (outliers) were removed, and all variables were converted into numerical form and normalised so that they could be processed correctly by the algorithms. This step was crucial because machine learning models are very sensitive to the quality of the data they receive. Once the data had been preprocessed, the next challenge was to decide which variables were really important. Not all the information collected was equally useful; some features added noise, made models more complex and could even reduce accuracy. For this reason, the authors proposed a new feature selection method called the Entropy-Controlled Quantum Bat Algorithm. Although the name sounds very technical, the principle is intuitive: it is an intelligent search strategy that explores many possible combinations of variables and seeks the subset that leads to the best prediction of obesity risk. It uses concepts from information theory, such as entropy, to measure the diversity and informativeness of the current candidate solutions, and ideas inspired by quantum mechanics to help the algorithm escape local optima rather than getting stuck in them. In practice, it balances exploration of new possibilities against exploitation of promising ones.
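The cleaning steps described above (filling missing values, removing outliers, normalising) can be illustrated with a minimal sketch for a single numerical column. The article does not say which techniques the authors used, so median imputation, the 1.5 × IQR outlier rule and min-max scaling are assumptions chosen for illustration, and the function names and sample values are hypothetical.

```python
from statistics import median, quantiles

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def iqr_bounds(values, k=1.5):
    """Return (low, high) limits; values outside them are treated as outliers.
    The 1.5 * IQR rule is an assumption, not the authors' stated method."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def preprocess(values):
    """Impute, drop outliers, then normalise one numerical column."""
    filled = impute_median(values)
    low, high = iqr_bounds(filled)
    kept = [v for v in filled if low <= v <= high]
    return min_max_scale(kept)

# Hypothetical body weights in kg: one missing value, one implausible entry.
weights_kg = [62.0, 70.5, None, 81.0, 75.0, 68.0, 400.0, 90.0]
scaled = preprocess(weights_kg)
print(scaled)
```

In a real dataset, an outlier would lead to dropping the whole record rather than a single value, and categorical answers (for example, mode of transportation) would first need to be encoded as numbers.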
Using this method, the system identified a set of key features with a clear clinical meaning. These included weight and age, which were directly linked to body mass index and long-term risk; gender, which influenced body fat distribution and hormonal factors; family history of overweight, which reflected genetic and shared environmental influences; the frequent consumption of high-calorie foods; the tendency to snack between meals; and the level of physical activity. Together, these variables described a person’s lifestyle and biological background in a way that was very informative for predicting obesity. Once the most relevant features were selected, they were passed to a group of different machine learning models, including logistic regression, tree-based gradient boosting methods such as LightGBM and XGBoost, neural networks, k-nearest neighbours and support vector machines. Instead of choosing a single “best” algorithm, ObeRisk combined them through a mechanism known as majority voting: each model produced its own prediction, and the final classification corresponded to the category that received the most votes. This ensemble strategy generally led to more robust and accurate results because errors made by one model were compensated for by others.
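The majority-voting step can be sketched in a few lines. This is an illustration of hard voting in general, not the authors' code: the model names, the category labels and the tie-breaking rule (the label seen first wins) are assumptions made for the example.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label chosen by the most models.
    On a tie, the label encountered first wins (an arbitrary choice here)."""
    label, _count = Counter(votes).most_common(1)[0]
    return label

def ensemble_predict(model_outputs):
    """model_outputs maps a model name to its list of predicted labels,
    one label per individual, all lists in the same order."""
    votes_per_individual = zip(*model_outputs.values())
    return [majority_vote(votes) for votes in votes_per_individual]

# Hypothetical predictions from three models for three individuals.
outputs = {
    "logistic_regression": ["normal", "overweight", "obesity_I"],
    "gradient_boosting":   ["normal", "obesity_I", "obesity_I"],
    "k_nearest_neighbours": ["overweight", "obesity_I", "obesity_I"],
}
final = ensemble_predict(outputs)
print(final)
```

For the first individual, two of the three models say "normal", so the single dissenting "overweight" vote is overruled; this is how an error made by one model can be outvoted by the others.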
The reported performance of ObeRisk was very high. In the experiments described in the article, the framework achieved an accuracy of about 97%, with similarly strong values for precision, sensitivity and F-measure. These numbers mean that, in the large majority of cases, the system correctly identified the weight category of individuals in the test dataset. The authors also showed that their new feature selection algorithm outperformed several other state-of-the-art methods and significantly improved the performance of even simple models, suggesting that choosing the right variables can be as important as choosing the learning algorithm itself.
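To make the four reported measures concrete, the sketch below computes accuracy, precision, sensitivity (recall) and F-measure for a multi-class problem. The article does not state how the authors averaged per-class scores, so macro-averaging (an unweighted mean over classes) is an assumption, and the labels are invented toy data.

```python
def macro_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, sensitivity and F-measure.
    Macro-averaging is an assumption made for this illustration."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f_scores = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_score = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f_scores.append(f_score)

    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "sensitivity": sum(recalls) / n,
        "f_measure": sum(f_scores) / n,
    }

# Toy example: six individuals, one misclassified.
y_true = ["normal", "normal", "overweight", "overweight", "obese", "obese"]
y_pred = ["normal", "overweight", "overweight", "overweight", "obese", "obese"]
scores = macro_metrics(y_true, y_pred)
print(scores)
```

Precision asks how many of the individuals assigned to a category truly belong to it, while sensitivity asks how many of the individuals truly in a category were found; the F-measure combines the two into a single number.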
However, beyond the technical results, it is important to reflect on the ethical implications of using AI to label individuals as “at risk” for obesity. Such labels could influence how people see themselves and how they are seen by others, contributing to stigma, discrimination or psychological distress if they are communicated without care. For this reason, the authors underlined the importance of transparency in how predictions were generated, the need to explain that risk is probabilistic and not deterministic, and the responsibility of healthcare professionals to communicate results in an empathetic and supportive way. They also stressed the importance of checking models for bias and involving diverse stakeholders in their development.
ObeRisk, which is not intended for clinical diagnosis, offers a concrete example of how AI could be integrated into medicine, not to replace clinicians, but to provide them with additional tools. A system like ObeRisk could serve as a scalable screening tool in clinical or public health settings, flagging individuals who might benefit from early nutritional counselling, physical-activity programmes or further metabolic evaluation. At the same time, it shows that high predictive performance must go hand in hand with careful design of data pipelines, advanced feature selection strategies and explicit attention to ethics and fairness. Finally, the study suggests that AI can play a meaningful role in obesity prevention by transforming routinely collected lifestyle and anthropometric data into actionable predictions, while reminding us that technology must remain firmly anchored to human values and professional responsibility.
References. Shaban, W.M., El-Din Moustafa, H. & El-Seddek, M.M. Machine learning framework for predicting susceptibility to obesity. Sci Rep 15, 35040 (2025). https://doi.org/10.1038/s41598-025-20505-9

