Comparison of Decision Tree and Linear Regression Algorithms in the Case of Spread Prediction of COVID-19 in Indonesia

Authors

  • Darwin, Universitas Prima Indonesia
  • Dwiky Christian, Universitas Prima Indonesia
  • Wilson Chandra, Universitas Prima Indonesia
  • Marlince Nababan, Universitas Prima Indonesia

DOI:

10.47709/cnahpc.v4i1.1234

Keywords:

CART, COVID-19, Data Mining, Decision Tree, Linear Regression


Abstract

COVID-19 is a disease that was first discovered in Wuhan, China, and caused the 2019-2020 coronavirus pandemic. When it infects humans, the virus can cause respiratory tract infections similar to the flu. According to the Ministry of Health of the Republic of Indonesia, as of March 2021 Indonesia had 1,511,712 confirmed cases of COVID-19, with 40,858 deaths and 1,348,330 recoveries. With these figures, Indonesia was reported to have the highest number of confirmed cases in ASEAN. Several earlier studies have applied data mining techniques such as the Decision Tree or Linear Regression algorithm to other problems, for example to classify respiratory diseases and to predict pregnancy hypertension. In this study, we analyze COVID-19 cases in Indonesia and conduct an experiment in predicting new COVID-19 cases with the Decision Tree (CART) and Linear Regression algorithms. We then compare the two algorithms using the R² score to evaluate prediction performance. The analysis shows that DKI Jakarta province has the highest number of positive cases, recoveries, and deaths in Indonesia. The Decision Tree algorithm achieved an R² score of 95.69% (training) and 92.15% (testing), while the Linear Regression algorithm achieved 79.93% (training) and 77.25% (testing).
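
The R² score used for the comparison above is the coefficient of determination, 1 − SS_res / SS_tot. The following is a minimal sketch of that metric in plain Python, not the authors' code: the function name `r2_score` and the sample numbers are hypothetical and chosen only for illustration.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed new-case counts and two sets of predictions
# (illustrative values only, not data from the study).
observed = [100, 120, 150, 170, 210]
model_a = [105, 118, 148, 175, 204]
model_b = [110, 130, 150, 170, 190]

if __name__ == "__main__":
    print(f"model_a R2: {r2_score(observed, model_a):.4f}")
    print(f"model_b R2: {r2_score(observed, model_b):.4f}")
```

A score of 1 means the predictions match the observations exactly, and a score of 0 means the model does no better than always predicting the mean. A gap between the training and testing scores gives a rough indication of how well a model generalizes.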


Article History

Submitted Date: 2021-12-11
Accepted Date: 2021-12-11
Published Date: 2022-01-02

How to Cite

Darwin, D., Christian, D., Chandra, W., & Nababan, M. (2022). Comparison of Decision Tree and Linear Regression Algorithms in the Case of Spread Prediction of COVID-19 in Indonesia. Journal of Computer Networks, Architecture and High Performance Computing, 4(1), 1-12. https://doi.org/10.47709/cnahpc.v4i1.1234