Optimizing Sentiment Classification of Online Comments Using Multinomial Naïve Bayes with TF-IDF and N-gram Feature Extraction

Alfin Gerliandeva
Yulison Chrisnanto
Herdi Ashaury

Abstract

Naïve Bayes (NB) is a classification algorithm based on simple probability calculations and is well suited to text classification, including sentiment analysis. Multinomial Naïve Bayes (MNB) is a classic variant of NB. Its main weakness is the assumption that features are independent of one another. This study uses a dataset of comments and reviews from various online platforms. To address this weakness, the proposed method combines TF-IDF and N-gram feature extraction (1-gram to 5-gram) with Chi-Square feature selection, and handles dataset imbalance using SMOTE (oversampling and undersampling methods). The results show that pentagrams (5-grams) on data oversampled with SMOTE give the highest accuracy, 94%, and an Area Under Curve (AUC) of 100%.
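As a rough illustration of the core idea, the sketch below pairs word N-gram features with a Multinomial Naïve Bayes classifier using Laplace smoothing. It is not the authors' implementation: it uses raw N-gram counts instead of TF-IDF weights, omits Chi-Square selection and SMOTE, and the names `word_ngrams` and `TinyMultinomialNB` as well as the toy training sentences are hypothetical and chosen only for this example.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n_min=1, n_max=2):
    """Return all word n-grams of length n_min..n_max from a lowercased text."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


class TinyMultinomialNB:
    """Hypothetical minimal MNB over n-gram counts with add-one smoothing."""

    def fit(self, texts, labels, n_min=1, n_max=2):
        self.n_min, self.n_max = n_min, n_max
        self.class_docs = Counter(labels)            # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(word_ngrams(text, n_min, n_max))
        self.vocab = {g for c in self.feature_counts.values() for g in c}
        return self

    def log_scores(self, text):
        """Log prior plus summed log likelihoods for each class."""
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, n_docs in self.class_docs.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for gram in word_ngrams(text, self.n_min, self.n_max):
                if gram in self.vocab:  # skip n-grams never seen in training
                    score += math.log((counts[gram] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy data, invented purely for illustration (not from the study's dataset).
train_texts = [
    "great product fast delivery",
    "really great service",
    "bad product slow delivery",
    "really bad service",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = TinyMultinomialNB().fit(train_texts, train_labels, n_min=1, n_max=2)
print(model.predict("great delivery"))
print(model.predict("really bad service"))
```

Adding longer N-grams lets the classifier see short word sequences as single features, which partly offsets the independence assumption, although each N-gram is still treated as independent of the others.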

Article Details

Section
Informatics
