Comparative Analysis of Naïve Bayes and K-Nearest Neighbor (KNN) Algorithms in Stroke Classification
DOI:
10.47709/cnahpc.v6i3.4395
Keywords:
Stroke Classification, Accuracy, Naive Bayes Algorithm, K-Nearest Neighbor (KNN)
Abstract
Stroke, also known as cerebrovascular disease, is a type of Non-Communicable Disease (NCD). Its symptoms arise when a blood vessel is blocked (ischemic stroke) or ruptures (hemorrhagic stroke), disrupting blood flow to the brain. This deprives brain cells of oxygen and nutrients, resulting in damage and potentially death. This research compares the Naive Bayes and K-Nearest Neighbor (K-NN) algorithms for classifying stroke. The research process involves data collection, data validation, data preprocessing, data reading, data transformation, data splitting, model implementation, classification evaluation, application of the Naive Bayes and K-NN algorithms, and comparative analysis of the results. The variables used in this study are gender, age, hypertension, heart disease, ever married, work type, residence type, avg glucose level, bmi, smoking status, and stroke. In the experiments conducted, the Naive Bayes algorithm achieved an average accuracy of 91.67%, while the K-NN algorithm achieved an average accuracy of 95.59%. The K-NN algorithm therefore has the higher average accuracy, exceeding Naive Bayes by 3.92 percentage points.
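The comparison described in the abstract (split the data, fit both classifiers, compare accuracy) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the tooling, the rows below are synthetic rather than the study's dataset, only two of the listed variables (age and avg glucose level) are imitated, and the helper names, the 80/20 split, and k = 5 are all assumptions made for illustration. The Gaussian variant of Naive Bayes is also an assumption.

```python
import math
import random
from collections import Counter

def make_toy_data(n=200, seed=0):
    """Synthetic ([age, avg_glucose_level], label) rows; NOT the study's data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        age = rng.gauss(68 if label else 42, 10)
        glucose = rng.gauss(150 if label else 100, 25)
        rows.append(([age, glucose], label))
    return rows

def train_test_split(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def fit_gaussian_nb(train):
    """Per class: prior probability plus per-feature mean and variance."""
    model = {}
    for label in {y for _, y in train}:
        xs = [x for x, y in train if y == label]
        means = [sum(col) / len(col) for col in zip(*xs)]
        variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                     for col, m in zip(zip(*xs), means)]
        model[label] = (len(xs) / len(train), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the largest log prior plus log likelihood."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

def predict_knn(train, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def accuracy(predict, test):
    """Fraction of test rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in test) / len(test)

train, test = train_test_split(make_toy_data())
nb_model = fit_gaussian_nb(train)
nb_acc = accuracy(lambda x: predict_gaussian_nb(nb_model, x), test)
knn_acc = accuracy(lambda x: predict_knn(train, x, k=5), test)
print(f"Naive Bayes accuracy: {nb_acc:.2%}")
print(f"K-NN accuracy:        {knn_acc:.2%}")

# Gap between the averages reported in the abstract, in percentage points.
reported_gap = round(95.59 - 91.67, 2)
print(f"Reported gap: {reported_gap} percentage points")
```

The accuracies printed for the toy data say nothing about the study's results; only the final line uses figures from the abstract. K-NN is distance-based, so on real data with mixed units the features would normally be scaled first, a step this sketch skips because the two synthetic features have comparable ranges.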
License
Copyright (c) 2024 Ida Bagus Ary Indra Iswara, Ida Bagus Gede Anandita, Maria Dahul
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.