Optimizing Support Vector Machine Classification with SMOTE: Case Study of Alfagift Application User Reviews
Abstract
Support Vector Machine (SVM) is a supervised learning algorithm that assigns inputs to classes according to patterns learned during training. SVM supports several widely used kernels, one of which is the linear kernel. Its weaknesses are sensitivity to parameter selection and a tendency to perform poorly on imbalanced datasets. This study aims to address these weaknesses with the proposed method: a linear-kernel SVM with Word2Vec (Skip-gram model) feature extraction, combined with the SMOTE oversampling technique to handle class imbalance. The imbalanced dataset produced an accuracy of 90% and the dataset balanced with SMOTE produced an accuracy of 92%, so in this case study SMOTE oversampling increased accuracy by 2 percentage points.
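The abstract does not describe how the oversampling step was implemented, so the following is only a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between an existing minority sample and one of its k nearest minority neighbours. The function names `smote` and `balance`, the parameters `k` and `seed`, and the toy data are assumptions made for illustration and are not taken from the article. In the study's setting the feature vectors would be Word2Vec embeddings of the reviews, and a maintained library implementation would normally be used instead of hand-written code.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature vectors (lists of floats).
    k: number of nearest minority neighbours considered for each sample.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample minority_label until it matches the size of the largest class."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    target = max(counts.values())
    new_points = smote(minority, target - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy data: four "positive" vectors and two "negative" vectors.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.1, 0.1], [1.0, 1.0], [1.2, 0.9]]
y = ["positive"] * 4 + ["negative"] * 2
X_bal, y_bal = balance(X, y, minority_label="negative")
print(len(X_bal), y_bal.count("negative"))
```

Oversampling of this kind is usually applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy; whether the study did so is not stated in the abstract.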
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.