Evaluation of Indonesian Language Stemmer Algorithms: A Comparative Analysis
DOI:
https://doi.org/10.47709/brilliance.v5i1.5679
Keywords:
stemming, method, Bahasa Indonesia, algorithm, review
Abstract
Indonesian is a language with a large number of speakers and a diverse vocabulary. A central challenge in processing Indonesian text is its agglutinative morphology: words are built by attaching prefixes, suffixes, and other affixes to a root. This complexity makes it difficult for traditional stemming algorithms developed for European languages to handle Indonesian words accurately. This review examines several prominent stemming algorithms developed specifically for Indonesian, highlighting the contributions of Nazief and Adriani, Asian, Arifin and Setiono, and the Enhanced Confix Stripping (ECS) stemmer, with attention to their methodologies, efficacy, and applications. The results show that the ECS stemmer outperformed the other algorithms in both accuracy and efficiency: it stripped affixes more effectively and identified the root forms of words more accurately, which leads to improved text analysis and information retrieval. As language technology continues to evolve, ongoing research into these methods will be crucial for processing Indonesian texts accurately and effectively.
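To make the idea of affix stripping concrete, the following is a minimal sketch of a dictionary-backed stemmer. It is not the Nazief and Adriani or ECS algorithm; the root list, the affix lists, and the function name `stem` are illustrative assumptions chosen only to show the general approach of removing a suffix and a prefix and checking the result against a list of known roots.

```python
# Toy dictionary-backed affix stripper (illustrative only).
# ROOTS, PREFIXES and SUFFIXES are small hypothetical samples, not a real lexicon.
ROOTS = {"makan", "main", "baca", "ajar"}
PREFIXES = ("me", "di", "ber", "ter", "pe", "ke", "se")
SUFFIXES = ("kan", "an", "i", "lah", "kah", "nya", "ku", "mu")


def stem(word, roots=ROOTS):
    """Return a known root for `word`, or the lowercased word if none is found."""
    word = word.lower()
    if word in roots:
        return word

    # Candidate forms: the word itself, plus the word with one suffix removed.
    candidates = [word]
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            candidates.append(word[: -len(suffix)])

    for candidate in candidates:
        if candidate in roots:
            return candidate
        # Try removing one prefix and look the remainder up in the root list.
        for prefix in PREFIXES:
            if candidate.startswith(prefix) and candidate[len(prefix):] in roots:
                return candidate[len(prefix):]

    # No root found: return the word unchanged rather than guessing.
    return word


print(stem("dimakan"))   # makan
print(stem("bermain"))   # main
print(stem("makanan"))   # makan
print(stem("membaca"))   # membaca (not stemmed by this naive sketch)
```

The last call shows the limitation of naive stripping: in "membaca" the prefix surfaces as "mem-" rather than "me-", so a simple prefix list misses the root "baca". Handling such morphophonemic variation is one of the problems that stemmers designed for Indonesian set out to solve.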
References
Adriani, M., Asian, J., Nazief, B., Williams, H. E., & Tahaghoghi, S. M. M. (2007). Stemming Indonesian. Conferences in Research and Practice in Information Technology Series, 38(4), 307–314. https://doi.org/10.1145/1316457.1316459
Arif Siswandi, A., Permana, Y., & Emarilis, A. (2021). Stemming Analysis Indonesian Language News Text with Porter Algorithm. Journal of Physics: Conference Series, 1845(1). https://doi.org/10.1088/1742-6596/1845/1/012019
Arifin, A. Z., Adhi, P., Mahendra, K., & Ciptaningtyas, H. T. (2009). ENHANCED CONFIX STRIPPING STEMMER AND ANTS ALGORITHM FOR CLASSIFYING NEWS DOCUMENT IN INDONESIAN LANGUAGE.
Fitriana, L. A., Mustopa, A., Firdaus, M. R., & Dahlia, R. (2024). Application of the Finite State Automata (FSA) Method in Indonesian Stemming using the Nazief & Adriani Algorithm. Sistemasi: Jurnal Sistem Informasi, 13(3). http://sistemasi.ftik.unisi.ac.id
Kusuma Wardana, H., Swanita, I., & Yohanes, B. W. (2019). Sistem Pemeriksa Pola Kalimat Bahasa Indonesia berbasis Algoritme Left-Corner Parsing dengan Stemming. In JNTETI (Vol. 8, Issue 3).
Mulyana, I., Suhendra, A., Ernastuti, & Bheta Agus, W. (2019). Development of Indonesian stemming algorithms through modification of grouping, sequencing and removing of affixes based on morphophonemic. International Journal of Recent Technology and Engineering, 8(2 Special Issue 7), 179–184. https://doi.org/10.35940/ijrte.B1044.0782S719
Putra, S. J., Cahyanti, N. D., Ratnawati, S., Gunawan, M. N., & Sari, D. P. (2019). The Implementation of Indonesian Stemming System for Indonesian Translation of the Quran. EUDL Digital Library.
Rifai, W. A. (2019). MODIFIKASI ALGORITME STEMMING MENGGUNAKAN PENDEKATAN NON DETERMINISTIK UNTUK TEKS BAHASA INDONESIA. http://etd.repository.ugm.ac.id/
Rizki, A. S., Aristi, N. M., Ridha, N., Zulfahri, A. F., & Wibowo, D. A. (2023). Implementation of The Indonesian Language Stemming Algorithm in Twitter Data Preprocessing. Case Study: Twitter Wargabanua and Instakalsel. Fidelity: Jurnal Teknik Elektro, 5(3), 175–183. https://doi.org/10.52005/fidelity.v5i3.170
Rumaisa, F., Basiron, H., & Saaya, Z. (2019). Development of multilingual social media data corpus for sentiment classification. Journal of Advanced Research in Dynamical and Control Systems.
Suci, F. W., Hayatin, N., & Munarko, Y. (2022). IN-IDRIS: MODIFICATION OF IDRIS STEMMING ALGORITHM FOR INDONESIAN TEXT. IIUM Engineering Journal, 23(1), 82–94. https://doi.org/10.31436/IIUMEJ.V23I1.1783
Winarti, T., Kerami, D., Lussiana, E., & Sudiro, S. A. (2017). Improving Stemming Algorithm Using Morphological Rules. 7(5).
Wisuda Sardjono, M., Cahyanti, M., Mujahidin, M., & Arianty, R. (2018). PENDETEKSI KESAMAAN KATA UNTUK JUDUL PENULISAN BERBAHASA INDONESIA MENGGUNAKAN ALGORITMA STEMMING NAZIEF-ADRIANI.
Yudi, A., & Makmun, D. M. (n.d.). Optimasi Stemming Porter KBBI dan Cross Validation Naïve Bayes untuk Klasifikasi Topik Soal UN (Ujian Nasional) Bahasa Indonesia.
Zainal, A., Dan, A., & Setiono, A. N. (n.d.). Klasifikasi Dokumen Berita Kejadian Berbahasa Indonesia dengan Algoritma Single Pass Clustering.
License
Copyright (c) 2025 Fitrah Rumaisa

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.