Optimizing Support Vector Machine Classification with SMOTE: Case Study of Alfagift Application User Reviews

Adhani Mulianti
Yulison Chrisnanto
Herdi Ashaury

Abstract

Support Vector Machine (SVM) is a supervised learning algorithm that assigns data to classes based on patterns learned during training. SVM supports several widely used kernels, one of which is the linear kernel. Its weaknesses are its sensitivity to parameter selection and its tendency to perform poorly on imbalanced datasets. This study aims to overcome these weaknesses with the proposed method, which combines a linear kernel, Word2Vec feature extraction with the Skip-gram model, and the SMOTE oversampling technique to handle class imbalance. The imbalanced dataset produced an accuracy of 90%, while the dataset balanced with SMOTE produced an accuracy of 92%, so SMOTE oversampling raised accuracy by 2 percentage points.
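To illustrate the oversampling step named in the abstract, the following is a minimal pure-Python sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy two-dimensional vectors (standing in for document embeddings) are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic vectors by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of base (excluding itself).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy vectors: 3 minority (negative) and 6 majority (positive) reviews.
negative = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
positive = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.7],
            [0.85, 0.75], [0.95, 0.9], [0.8, 0.8]]

# Oversample the minority class until both classes have the same size.
new_points = smote(negative, len(positive) - len(negative), k=2)
balanced_negative = negative + new_points
print(len(balanced_negative), len(positive))
```

In a typical workflow, oversampling of this kind is applied to the training split only, so that the test set keeps the original class distribution. Libraries such as imbalanced-learn provide a ready-made `SMOTE` class that would normally be used instead of a hand-written version.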

Article Details

Section
Informatics
