Classification and regression tree model for diabetes prediction
International Journal of Informatics and Communication Technology
Abstract
Diabetes mellitus is characterized by excessive blood glucose, which occurs when the pancreas fails to produce insulin properly. High blood glucose levels can cause chronic damage to organs, particularly the eyes and kidneys. Diabetes prediction models typically apply machine learning (ML) algorithms to combined data on glucose levels, patient health parameters, and other biomarkers. Prior research on diabetes prediction using algorithms such as support vector machine (SVM) and decision tree (DT) models reports an accuracy of approximately 70%, which is relatively modest. Therefore, this study proposes a classification and regression tree (CART) multiclassifier model to improve the accuracy of diabetes prediction across three classes: non-diabetic, pre-diabetic, and diabetic. The study involved data preprocessing, hyperparameter tuning, and evaluation of performance metrics. With the number of leaves per node set to 5, the maximum number of splits set to 10, and deviance as the split criterion, the model achieved 97% accuracy, with a precision of 98%, a recall of 97%, and an F1-score of 98%, showing that the proposed multiclassifier model can accurately predict diabetes. In conclusion, the proposed CART model achieves its highest accuracy in predicting diabetes classes with the best hyperparameter setting.
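To illustrate the split criterion named in the abstract, the following is a minimal sketch that assumes "deviance" refers to the cross-entropy impurity of a node, which is a common definition for classification trees. The function names `deviance` and `split_gain` and the toy labels are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def deviance(labels):
    """Cross-entropy impurity of a node: -sum(p * log2(p)) over the classes present.

    Assumption: this is the usual definition of deviance for classification trees;
    the abstract does not spell out the formula.
    """
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(parent, left, right):
    """Reduction in deviance when a parent node is split into two children."""
    n = len(parent)
    weighted = (len(left) / n) * deviance(left) + (len(right) / n) * deviance(right)
    return deviance(parent) - weighted


# Hypothetical toy labels for the three classes; not data from the study.
parent = ["non-diabetic"] * 4 + ["pre-diabetic"] * 4 + ["diabetic"] * 4
left = ["non-diabetic"] * 4                        # a pure child node
right = ["pre-diabetic"] * 4 + ["diabetic"] * 4    # still mixed between two classes

gain = split_gain(parent, left, right)
print(f"parent deviance: {deviance(parent):.3f}")
print(f"gain from split: {gain:.3f}")
```

Under this assumption, a tree learner would choose, at each node, the candidate split with the largest gain, and limits such as a maximum number of splits would stop the tree from growing further.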