MACHINE LEARNING-BASED CLASSIFICATION OF VIEWER SENTIMENT TOWARD “WONDERLAND INDONESIA” ON THE YOUTUBE PLATFORM

Authors

  • Surya Darma Universitas Potensi Utama
  • Linda Wahyuni Universitas Potensi Utama
  • Nauval Alfarizi Universitas Pembangunan Panca Budi

DOI:

https://doi.org/10.54314/jssr.v9i2.6270

Keywords:

Sentiment Analysis, Naive Bayes, Support Vector Machine, YouTube, TF-IDF

Abstract

This study analyzes and classifies audience sentiment toward the Wonderland Indonesia video on YouTube using the Naive Bayes and Support Vector Machine (SVM) algorithms. The dataset consists of 10,000 comments collected by scraping. The comments were preprocessed through data cleaning, case folding, tokenization, stopword removal, and stemming, using the Sastrawi library. Features were represented with TF-IDF, yielding a feature matrix of size (9048, 5000). The class distribution is imbalanced: neutral (4,550) and positive (4,297) comments far outnumber negative ones (201). In the experiments, SVM outperforms Naive Bayes, with an accuracy of 93% versus 86%, but both models struggle to identify the negative class. Among the imbalance-handling techniques tested, SVM with class weighting improves detection of the minority class without reducing overall accuracy, whereas Naive Bayes with SMOTE improves recall but reduces overall performance. SVM with class weighting is therefore the best-performing model for this sentiment classification task. Positive comments also far outnumber negative ones, suggesting that audiences respond favorably to the Wonderland Indonesia content.
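The class imbalance and the effect of class weighting can be illustrated with a short worked example. The abstract does not state which weighting scheme was used, so the sketch below assumes the common inverse-frequency heuristic, n_samples / (n_classes * class_count), which is the formula scikit-learn applies for `class_weight="balanced"`. The function name `balanced_class_weights` is hypothetical; only the three class counts come from the abstract.

```python
# Class counts as reported in the abstract.
class_counts = {"neutral": 4550, "positive": 4297, "negative": 201}


def balanced_class_weights(counts):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: the study's exact weighting scheme is not given in the
    abstract; this is the widely used "balanced" heuristic.
    """
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * n) for label, n in counts.items()}


total = sum(class_counts.values())
weights = balanced_class_weights(class_counts)

print(f"labelled comments: {total}")
for label, weight in weights.items():
    share = class_counts[label] / total
    print(f"{label:>8}: share={share:.1%}, weight={weight:.2f}")
```

The three counts sum to 9,048, which matches the first dimension of the reported TF-IDF matrix. Under this assumed scheme, a misclassified negative comment is penalized roughly 15 times more heavily than an average sample, which is one plausible reason weighting helps the minority class.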

Published

2026-04-30

Section

Article

How to Cite

KLASIFIKASI SENTIMEN BERBASIS MACHINE LEARNING TERHADAP PENONTON “WONDERLAND INDONESIA” PADA PLATFORM YOUTUBE. (2026). JOURNAL OF SCIENCE AND SOCIAL RESEARCH, 9(2), 2848-2856. https://doi.org/10.54314/jssr.v9i2.6270
