Investigation Related to Performance of KNN, Logistic Regression and XGBoost on Diabetes Prediction
- DOI
- 10.2991/978-94-6463-300-9_70
- Keywords
- Machine Learning; Algorithms; Diabetes
- Abstract
This study builds diabetes prediction models with three machine learning algorithms, K Nearest Neighbors (KNN), Logistic Regression, and Extreme Gradient Boosting (XGBoost), and compares their accuracy. The goal is to identify an accurate algorithm for diabetes prediction, which would help doctors diagnose diabetes so that patients can receive appropriate treatment in time. Before the models are built, the dataset is pre-processed with standard scaling, and the Synthetic Minority Over-sampling Technique (SMOTE) is applied to balance the classes. Grid Search CV is then used to find the best parameters for each model. The results show that KNN achieves an accuracy of 82%, followed by XGBoost at 79.87% and Logistic Regression at 75.5%. An advantage of the KNN algorithm is that it only considers the distance between the training samples and the new sample to be predicted, without any other computation. In this study, KNN demonstrated the best performance among the three algorithms. Future work can expand the dataset and explore more parameters to achieve higher accuracy in diabetes prediction.
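The pre-processing and distance-based prediction described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it uses only the Python standard library as a stand-in for the usual library routines, the helper names (`standardize`, `smote_like`, `knn_predict`) are hypothetical, the two features and all values are made up for illustration, and `k` is fixed by hand rather than tuned with a grid search.

```python
import math
import random
from collections import Counter


def standardize(train, rows):
    """Scale each feature to zero mean and unit variance using training statistics."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a feature is constant to avoid division by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def nearest(point, candidates, k):
    """Return the k candidates closest to point by Euclidean distance."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling: interpolate between a minority
    sample and one of its nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = rng.choice(nearest(base, others, k))
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def knn_predict(X_train, y_train, x, k=3):
    """Majority vote among the k training samples nearest to x."""
    ranked = sorted(zip(X_train, y_train), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: [glucose, BMI]; label 1 = diabetic (minority class).
X = [[85, 22.0], [90, 24.5], [95, 23.0], [100, 26.0],
     [105, 25.0], [110, 27.5], [160, 33.0], [170, 35.5]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_scaled = standardize(X, X)
minority = [x for x, label in zip(X_scaled, y) if label == 1]
extra = smote_like(minority, n_new=4)   # 2 real + 4 synthetic = 6 per class
X_bal = X_scaled + extra
y_bal = y + [1] * len(extra)

query = standardize(X, [[165, 34.0]])[0]  # scale with training statistics
pred = knn_predict(X_bal, y_bal, query, k=3)
print(pred)
```

Prediction involves no fitted parameters: `knn_predict` only ranks the stored training samples by their distance to the query, which is the property the abstract attributes to KNN. Scaling matters here because an unscaled feature with a large numeric range would dominate the Euclidean distance.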
- Copyright
- © 2023 The Author(s)
- Open Access
- Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY - CONF
AU - Jiaguo Lin
PY - 2023
DA - 2023/11/27
TI - Investigation Related to Performance of KNN, Logistic Regression and XGBoost on Diabetes Prediction
BT - Proceedings of the 2023 International Conference on Image, Algorithms and Artificial Intelligence (ICIAAI 2023)
PB - Atlantis Press
SP - 670
EP - 676
SN - 2352-538X
UR - https://doi.org/10.2991/978-94-6463-300-9_70
DO - 10.2991/978-94-6463-300-9_70
ID - Lin2023
ER -