Implementation of Data Mining for Speech Recognition Classification of Sundanese Dialect Using KNN Method with MFCC Feature Extraction
DOI:
10.47709/cnahpc.v6i3.4226
Keywords:
K-Nearest Neighbor (K-NN), Data Mining, MFCC, Classification, Sundanese
Abstract
Speech recognition technology for regional languages such as Sundanese, which has distinctive phonetic characteristics, is important for preserving and developing those languages. Regional-language speech recognition can support local, educational, and cultural-preservation applications. This study implements and evaluates the combination of Mel-Frequency Cepstral Coefficients (MFCC) and k-Nearest Neighbors (KNN) for classifying Sundanese dialect speech. Feature extraction uses MFCC, which converts voice data into numerical representations based on frequency characteristics; classification uses KNN, which assigns each sample to a class according to its similarity to the training data. The dataset consists of speech recordings of the Western and Southern Sundanese dialects. The results show that KNN classifies Sundanese dialect speech with an accuracy of 80.00%, indicating a good ability to distinguish the "Western" and "Southern" dialects. MFCC proved effective at extracting sound features for the KNN classifier, and the combination of the two methods gave satisfactory results for speech recognition classification of Sundanese dialects.
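The classification step described in the abstract can be illustrated with a minimal sketch. It assumes each recording has already been reduced to a fixed-length MFCC feature vector (for example by averaging the coefficients over time with an audio library), and it uses Euclidean distance with k = 3; the abstract does not state the distance metric, the value of k, or the tools used, so those choices, the function names, and the toy vectors below are all hypothetical and are not the study's data.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_vectors, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    nearest = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(predictions, truths):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predictions, truths))
    return correct / len(truths)


# Toy stand-ins for per-recording MFCC vectors (made-up numbers, not study data).
train_vectors = [
    [1.0, 1.1, 0.9], [1.2, 0.9, 1.0], [0.9, 1.0, 1.1],
    [3.0, 3.1, 2.9], [3.2, 2.9, 3.0], [2.9, 3.0, 3.1],
]
train_labels = ["western"] * 3 + ["southern"] * 3

test_vectors = [[1.1, 1.0, 1.0], [3.1, 3.0, 3.0]]
test_labels = ["western", "southern"]

predictions = [
    knn_predict(train_vectors, train_labels, v, k=3) for v in test_vectors
]
print(predictions, accuracy(predictions, test_labels))
```

The accuracy function here computes the same kind of figure as the 80.00% reported above: the share of test recordings whose predicted dialect matches the true one. An odd k is used in this sketch so that a two-class vote cannot tie.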
License
Copyright (c) 2024 Ery Shandy, Abdul Halim Anshor, Dodit Ardiatma
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.