Analysis On Internet Pattern of Youtube Browsing in Indonesia Using Web Crawling and Unsupervised Learning (Analisis Pola Minat Tayangan Youtube DI Indonesia dengan Web Crawling dan Supervised Learning)
Abstract
YouTube is a popular video-sharing website, particularly in Indonesia. The list of trending videos for each country is updated daily on YouTube’s Trending page. This trending-video data can be explored for insights such as patterns of viewing interest on YouTube. This research aims to collect and analyse the metadata of trending videos to produce a classifier model and statistics on trending YouTube videos in Indonesia. The data was collected from YouTube’s Trending page using the Scraper and Screaming Frog SEO Spider tools, every day for 10 consecutive days. The videos were then classified into video categories using rule-based classification with the J48 decision tree algorithm and a TF-IDF filter. The results show that videos about people, blogs, sports, news, politics, comedy, entertainment and music interest people in Indonesia the most.
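To illustrate the TF-IDF weighting mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the sample titles are hypothetical, the whitespace tokenizer is a simplification, and the weighting used here (relative term frequency multiplied by log(N/df)) is only one common TF-IDF variant, which may differ from the filter used in the study.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    This is an assumed, textbook variant of TF-IDF.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / df[term])
                for term, count in tf.items()
            }
        )
    return weights


# Hypothetical video titles, for illustration only.
titles = [
    "official music video",
    "football match highlights",
    "funny prank video",
]
weights = tf_idf(titles)
```

In this sketch, a term that appears in fewer titles (such as "music") receives a higher weight than one shared across titles (such as "video"), which is what makes the weighted terms useful as features for a category classifier such as a decision tree.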
Article Details
Jurnal IPTEK-KOM follows an open access policy. Authors must agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., posting it to an institutional repository or publishing it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) before and during the submission process, as this can lead to productive exchanges as well as earlier and greater citation of the published work (see The Effect of Open Access).