Diabetes Mellitus Prediction Using Ensemble Machine Learning Techniques
7 Pages
Posted: 9 Jul 2020. Last revised: 12 Nov 2020
Date Written: 2020
Abstract
The healthcare industry holds a vast and rapidly growing volume of patient data. Researchers have been using this data to help the industry improve how major diseases are handled, and to find ways of informing patients about symptoms early enough to avoid serious complications. Diabetes is one such disease, and its prevalence is growing at an alarming rate. It can cause severe damage, including blurred vision, myopia, burning extremities, and kidney and heart failure. It occurs when blood sugar levels exceed a certain threshold, or when the human body cannot produce enough insulin to regulate them. Patients affected by diabetes must therefore be informed so that proper treatment can be taken to control it, which makes early prediction and classification of diabetes important. This work uses machine learning algorithms to improve the accuracy of diabetes prediction. A dataset obtained as the output of the K-means clustering algorithm was fed to an ensemble model with principal component analysis and K-means clustering. The result was compared with that of random forest, support vector machine, decision tree, multilayer perceptron, and Naïve Bayes classifiers applied to the same dataset. All methods were run using 10-fold cross-validation. Our ensemble method produced only eight incorrectly classified instances, the lowest of all the methods compared. The experiments also showed that ensemble classifier models performed better than the base classifiers alone.
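The two evaluation ideas named in the abstract, k-fold cross-validation and combining base classifiers into an ensemble, can be illustrated with a minimal standard-library Python sketch. It is not the authors' implementation: the abstract does not state how the ensemble combines its members, so hard majority voting is an assumption here, and the function names (k_fold_indices, majority_vote, count_misclassified) and the toy labels are hypothetical.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Split the indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more item
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def majority_vote(votes):
    """Return the label predicted by most base classifiers (hard voting, assumed)."""
    return Counter(votes).most_common(1)[0][0]


def count_misclassified(y_true, y_pred):
    """Number of instances whose predicted label differs from the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t != p)


# Made-up labels for five patients (1 = diabetic, 0 = not), for illustration only.
y_true = [1, 0, 1, 1, 0]
base_predictions = [
    [1, 0, 1, 0, 0],  # hypothetical base classifier A
    [1, 1, 0, 1, 0],  # hypothetical base classifier B
    [0, 0, 1, 1, 1],  # hypothetical base classifier C
]

# One ensemble label per patient: vote across the classifiers' predictions.
ensemble_pred = [majority_vote(votes) for votes in zip(*base_predictions)]
base_errors = [count_misclassified(y_true, p) for p in base_predictions]
ensemble_errors = count_misclassified(y_true, ensemble_pred)

folds = k_fold_indices(25, k=10)  # each fold serves once as the held-out test set
```

In this toy case each base classifier misclassifies at least one patient while the vote misclassifies none, which is the kind of comparison the abstract reports when it counts incorrectly classified instances per method.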
Keywords: Diabetes, Machine learning, Ensemble, Dataset