EVALUATION OF A HYBRID NAIVE BAYES-XGBOOST MODEL FOR CLASSIFYING NETIZEN SENTIMENT TOWARD THE FREE PALESTINE ISSUE ON PLATFORM X IN INDONESIA IN 2025
Abstract
Abstract: The Palestinian conflict has become a global issue that has triggered strong public responses, especially on social media platforms. This study evaluates the performance of a hybrid Naïve Bayes–XGBoost model in classifying netizen sentiment toward the Free Palestine issue on platform X (formerly Twitter) in 2025. Data were collected using the X (Twitter) API with keywords such as #freepalestine and #savegaza, then passed through a series of preprocessing stages and labeled for sentiment using a lexicon-based approach. The dataset was then split into 80% training data and 20% testing data to compare the performance of the baseline Naïve Bayes model and the hybrid model. The evaluation results show that the baseline Naïve Bayes model achieved an accuracy of 75.7% and an F1-score of 76%, while the hybrid Naïve Bayes–XGBoost model achieved a markedly higher accuracy of 95.5% and an F1-score of 96%. These findings indicate that integrating the two algorithms improves both accuracy and balance in sentiment classification, especially for unstructured and imbalanced data. This study recommends the use of hybrid models for public opinion analysis on social media and suggests further development using deep learning approaches.
Keywords: Sentiment Analysis, Free Palestine, Naïve Bayes, XGBoost, Hybrid Model
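The labeling and splitting steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the lexicon, the tokenizer, or the splitting procedure, so everything below is an assumption for illustration: the toy `LEXICON`, the whitespace tokenizer, the helper names `label_sentiment` and `train_test_split`, and the fixed random seed are all hypothetical.

```python
import random

# Hypothetical toy lexicon; the study's actual lexicon is not given in the abstract.
LEXICON = {"peace": 1, "support": 1, "hope": 1,
           "attack": -1, "suffering": -1, "war": -1}


def label_sentiment(text):
    """Sum the polarity of each known word and map the total to a label."""
    score = sum(LEXICON.get(token, 0) for token in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def train_test_split(items, test_ratio=0.2, seed=42):
    """Shuffle a copy of the items and hold out test_ratio of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up example posts, not data from the study.
posts = [
    "we support peace",
    "war and suffering continue",
    "hope for the children",
    "another attack last night",
    "the meeting is tomorrow",
    "support and hope",
    "war again",
    "peace now",
    "no news today",
    "suffering everywhere",
]
labeled = [(post, label_sentiment(post)) for post in posts]
train, test = train_test_split(labeled)
print(len(train), len(test))
```

With ten posts and a 20% test ratio, the split yields eight training items and two test items. A real lexicon-based labeler for Indonesian posts would also need the normalization and stemming steps that the abstract groups under preprocessing.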
Abstract (translated from the Indonesian Abstrak): The Palestinian conflict has become a global issue that has prompted a strong response from people around the world, particularly through social media. This study aims to evaluate the performance of a hybrid Naïve Bayes–XGBoost model in classifying netizen sentiment toward the Free Palestine issue on platform X (Twitter) in 2025. Data were collected using the X (Twitter) API with the keywords #freepalestine and #savegaza, then preprocessed and labeled using a lexicon-based approach. The data were then divided into training data (80%) and test data (20%) to compare the performance of the baseline Naïve Bayes model and the hybrid model. The evaluation results show that the baseline Naïve Bayes model produced an accuracy of 75.7% and an F1-score of 76%, whereas the hybrid Naïve Bayes–XGBoost model reached an accuracy of 95.5% and an F1-score of 96%. These findings show that integrating the two algorithms can improve the accuracy and balance of sentiment classification, particularly on unstructured and imbalanced data. This study recommends the use of hybrid models for public opinion analysis on social media, as well as further development using deep learning approaches.
Keywords: Sentiment, Free Palestine, Naïve Bayes, XGBoost, Hybrid Model
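The two metrics reported in the abstract, accuracy and F1-score, can be computed as in the following minimal sketch. The abstract does not state which F1 averaging scheme was used, so macro averaging is an assumption here, and the labels and predictions are made-up values rather than results from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """Harmonic mean of precision and recall for a single class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lb) for lb in labels) / len(labels)


# Made-up labels for illustration only.
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]

print(round(accuracy(y_true, y_pred), 3))
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every class equally, which is one reason F1 is often reported alongside accuracy for imbalanced data: a model that ignores a minority class can still score high accuracy while its macro F1 drops.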
DOI: https://doi.org/10.54314/jssr.v8i3.4103
Copyright (c) 2025 Richi Andrianto, Mustopa Husein Lubis, Urfi Utami, Asep Supriyanto