MACHINE LEARNING-BASED SENTIMENT CLASSIFICATION OF “WONDERLAND INDONESIA” VIEWERS ON THE YOUTUBE PLATFORM
DOI: https://doi.org/10.54314/jssr.v9i2.6270
Abstract: This study analyzes and classifies audience sentiment toward the Wonderland Indonesia video on the YouTube platform using the Naive Bayes and Support Vector Machine (SVM) algorithms. The dataset consists of 10,000 comments collected through scraping. The comments were preprocessed with data cleaning, case folding, tokenization, stopword removal, and stemming, using the Sastrawi library. Features were represented with the TF-IDF method, with a feature matrix of dimensions (9048, 5000). The class distribution is imbalanced: neutral (4,550) and positive (4,297) sentiments dominate, compared with only 201 negative comments. The experimental results show that SVM outperforms Naive Bayes, with an accuracy of 93% versus 86%. However, both models have difficulty identifying the negative class. Among the imbalance-handling techniques tested, SVM with class weighting improves detection of the minority class without reducing overall accuracy, whereas Naive Bayes with SMOTE improves recall but reduces overall performance. The findings indicate that SVM with class weighting is the best-performing model for sentiment classification. Additionally, the results show that the majority of comments are positive, suggesting that audiences respond favorably to the Wonderland Indonesia content.
Keywords: Sentiment Analysis, Naive Bayes, Support Vector Machine, YouTube, TF-IDF
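
The class weighting mentioned in the abstract can be illustrated with a short calculation. The sketch below is an assumption, not the authors' code: the abstract does not state which weighting scheme was used, so it applies the common "balanced" heuristic (the one behind scikit-learn's `class_weight="balanced"` option), where each class weight is n_samples / (n_classes * class_count). The function names are hypothetical; only the class counts come from the abstract.

```python
# Class counts as reported in the abstract (they sum to 9048 comments).
class_counts = {"neutral": 4550, "positive": 4297, "negative": 201}


def balanced_class_weights(counts):
    """Assumed scheme: weight = n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


def class_shares(counts):
    """Fraction of all comments that falls into each class."""
    n_samples = sum(counts.values())
    return {label: count / n_samples for label, count in counts.items()}


weights = balanced_class_weights(class_counts)
shares = class_shares(class_counts)

# Accuracy of a trivial model that always predicts the largest class.
majority_baseline_accuracy = max(shares.values())

for label in class_counts:
    print(f"{label:>8}: share={shares[label]:.2%}  weight={weights[label]:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline_accuracy:.2%}")
```

Under this assumed scheme, an error on a negative comment costs roughly 15 times the average, while errors on the two large classes are down-weighted. That is one way a weighted classifier can be pushed toward the minority class, and the small negative share also shows why overall accuracy alone says little about negative-class detection.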
Abstract (translated from the Indonesian Abstrak): The aim of this research is to analyze and classify viewer sentiment toward the Wonderland Indonesia video shown on the YouTube platform using the Naive Bayes and Support Vector Machine (SVM) algorithms. The data consist of 10,000 comments collected through scraping. The data were then processed further with the Sastrawi library for cleaning, case folding, tokenization, stopword removal, and stemming. The TF-IDF method with dimensions (9048, 5000) was used to represent the features. The data distribution shows class imbalance; neutral (4,550) and positive (4,297) sentiments dominate compared with negative (201). The test results show that SVM performs better, with 93% accuracy compared with 86% for Naive Bayes. However, both models struggle to detect the negative class. The use of imbalanced-data handling techniques shows that SVM with class weighting can improve detection of the minority class without reducing accuracy. Naive Bayes with SMOTE, on the other hand, improves recall but reduces overall performance. The results indicate that the best model for sentiment classification is SVM with class weighting. The results also show that most comments carry positive sentiment, indicating that the audience likes the Wonderland Indonesia content.
Keywords (translated from Kata Kunci): Sentiment Analysis, Naive Bayes, SVM, YouTube, TF-IDF
License
Copyright (c) 2026 Surya Darma, Linda Wahyuni, Nauval Alfarizi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




