Comparison of Supervised Machine Learning Techniques in Predicting the Classification of Household Welfare Status

Main Article Content

Nofriani

Abstract

Poverty is a major problem for most countries around the world, including Indonesia. One approach to eradicating poverty is the equitable distribution of social assistance to target households, based on the Integrated Database of social assistance. This study compared five well-known supervised machine learning techniques, namely the Naïve Bayes Classifier, Support Vector Machines, K-Nearest Neighbor Classification, the C4.5 Algorithm, and the Random Forest Algorithm, in predicting household welfare status classification, using the Integrated Database as a case study. The main objective was to identify the best supervised machine learning approach for predicting the classification of household welfare status from the attributes in the Integrated Database. The results showed that the Random Forest Algorithm performed best among the five techniques compared.

Article Details

Section
Informatics
