Enhancing Multi-Layer Perceptron Performance with K-Means Clustering

Authors

  • Doughlas Pardede, Universitas Deli Sumatera
  • Aulia Ichsan, Universitas Deli Sumatera
  • Sugeng Riyadi, Universitas Deli Sumatera

DOI:

10.47709/cnahpc.v6i1.3600

Keywords:

Classification, K-Means, MLP, Overfitting, Preprocessing

Abstract

Machine learning plays a crucial role in identifying patterns within data, and classification is one of its most prominent applications. This study investigates Multilayer Perceptron (MLP) classification models and examines K-Means clustering as a preprocessing technique for improving their performance. Overfitting, a common challenge in MLP models, is addressed by applying K-Means clustering to reduce data complexity before training, with the aim of improving classification accuracy. The study begins with an overview of overfitting in MLP models and reviews established techniques for mitigating it, including regularization, dropout, early stopping, data augmentation, and ensemble methods, before describing the complementary role of K-Means clustering. Three datasets (Iris, Wine, and Breast Cancer Wisconsin) are used to evaluate K-Means as a preprocessing technique. Cross-validation results show significant improvements in accuracy, precision, recall, and F1 score when K-Means clustering is applied, compared with models trained without preprocessing. The findings indicate that K-Means clustering enhances the discriminative power of MLP classification models by organizing data into clusters based on similarity. These results underline the practical importance of appropriate preprocessing in improving classification performance. Future research could explore additional preprocessing methods and their impact on classification accuracy across diverse datasets.
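
The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the K-Means output is fed to the MLP, so the sketch assumes the distances from each sample to the cluster centroids are appended to the original features. The network size, the choice of one cluster per class, the 5-fold split, macro-averaged metrics, and the helper names `make_mlp`, `build_models`, and `evaluate` are likewise assumptions made for illustration, using scikit-learn.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer, load_iris, load_wine
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import FeatureUnion, make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler

METRICS = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]


def make_mlp(seed):
    # Hypothetical architecture; the abstract does not specify one.
    return MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=seed)


def build_models(n_clusters, seed=0):
    # Baseline: scaled features straight into the MLP.
    baseline = make_pipeline(StandardScaler(), make_mlp(seed))
    # Assumption: "K-Means preprocessing" means appending the distance to
    # each centroid (KMeans.transform) to the original features.
    augment = FeatureUnion([
        ("raw", FunctionTransformer()),  # identity: keeps the original features
        ("kmeans", KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)),
    ])
    clustered = make_pipeline(StandardScaler(), augment, StandardScaler(), make_mlp(seed))
    return {"mlp": baseline, "kmeans+mlp": clustered}


def evaluate(X, y, n_clusters, seed=0):
    # K-Means sits inside the pipeline, so it is refit on the training part
    # of every fold and never sees the held-out samples.
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    results = {}
    for name, model in build_models(n_clusters, seed).items():
        cv_scores = cross_validate(model, X, y, cv=cv, scoring=METRICS)
        results[name] = {m: float(cv_scores[f"test_{m}"].mean()) for m in METRICS}
    return results


if __name__ == "__main__":
    loaders = {
        "Iris": load_iris,
        "Wine": load_wine,
        "Breast Cancer Wisconsin": load_breast_cancer,
    }
    for dataset, load in loaders.items():
        X, y = load(return_X_y=True)
        n_classes = len(set(y))  # assumption: one cluster per class
        for name, scores in evaluate(X, y, n_classes).items():
            print(dataset, name, {m: round(v, 3) for m, v in scores.items()})
```

Placing K-Means inside the pipeline rather than clustering the full dataset up front is a deliberate choice in this sketch: it keeps test-fold samples from influencing the centroids, so the cross-validated scores of the two models remain comparable.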

Article History

Submitted Date: 2024-02-15
Accepted Date: 2024-02-15
Published Date: 2024-02-18

How to Cite

Pardede, D., Ichsan, A., & Riyadi, S. (2024). Enhancing Multi-Layer Perceptron Performance with K-Means Clustering. Journal of Computer Networks, Architecture and High Performance Computing, 6(1), 461-466. https://doi.org/10.47709/cnahpc.v6i1.3600