MACHINE LEARNING PIPELINE FOR DISEASE RISK PREDICTION FROM QUESTIONNAIRE DATA
DOI:
https://doi.org/10.54314/jssr.v9i2.6172
Keywords:
Naive Bayes, Disease Risk Prediction
Abstract
This study designs a system that predicts disease risk levels from questionnaire data using the Naive Bayes algorithm. Respondent data were collected via Google Forms and cover age, lifestyle, stress level, sleep quality, physical activity, and family health history. After preprocessing, the data were converted into a tabular format and divided into 80 training data points and 20 testing data points. The Naive Bayes algorithm was used to classify disease risk into low, moderate, and high categories. On the test data, the model produced predictions with a high level of accuracy. The trained model was then deployed in a web-based system so that users can determine their disease risk quickly and easily.
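The workflow described in the abstract (tabular questionnaire answers, an 80/20 split, Naive Bayes classification into low, moderate, and high risk) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which Naive Bayes variant or library was used, so the sketch assumes a categorical Naive Bayes with Laplace smoothing written with the Python standard library only. The feature encodings, the labelling rule, and all function names are hypothetical, and the respondents are synthetic rather than the study's data.

```python
import math
import random
from collections import Counter

def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    n_features = len(rows[0])
    # value_counts[c][j][v]: training rows of class c whose feature j equals v
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    feature_values = [set() for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[label][j][value] += 1
            feature_values[j].add(value)
    return {
        "class_counts": class_counts,
        "value_counts": value_counts,
        "feature_values": feature_values,
        "alpha": alpha,
        "n": len(labels),
    }

def predict(model, row):
    """Return the class with the highest log posterior for one respondent."""
    best_label, best_score = None, -math.inf
    for label, count in model["class_counts"].items():
        score = math.log(count / model["n"])  # log prior
        for j, value in enumerate(row):
            k = len(model["feature_values"][j])  # distinct values of feature j
            numerator = model["value_counts"][label][j][value] + model["alpha"]
            denominator = count + model["alpha"] * k
            score += math.log(numerator / denominator)  # log likelihood term
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def make_synthetic_respondents(n, seed=0):
    """Generate hypothetical questionnaire rows; NOT the study's data."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = (
            rng.choice(["<30", "30-50", ">50"]),   # age group
            rng.choice(["low", "high"]),           # stress level
            rng.choice(["good", "poor"]),          # sleep quality
            rng.choice(["active", "sedentary"]),   # physical activity
            rng.choice(["no", "yes"]),             # family health history
        )
        # Hypothetical labelling rule: count the risk factors present.
        points = sum([row[0] == ">50", row[1] == "high", row[2] == "poor",
                      row[3] == "sedentary", row[4] == "yes"])
        label = "low" if points <= 1 else "moderate" if points <= 3 else "high"
        rows.append(row)
        labels.append(label)
    return rows, labels

# 100 synthetic respondents: 80 for training, 20 for testing.
all_rows, all_labels = make_synthetic_respondents(100)
train_rows, train_labels = all_rows[:80], all_labels[:80]
test_rows, test_labels = all_rows[80:], all_labels[80:]

model = train_naive_bayes(train_rows, train_labels)
predictions = [predict(model, row) for row in test_rows]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
print(f"Test accuracy on synthetic data: {accuracy:.2f}")
```

The printed accuracy reflects only the synthetic labelling rule and says nothing about the accuracy reported in the study. In a web deployment like the one the abstract describes, a trained model of this kind would typically be loaded once and `predict` called on each submitted form.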
License
Copyright (c) 2026 Anggiat Roberto Sinaga, Siti Aisyah, Dheo Zakaria Harahap, Anindyia Sitorus Pane, Sendy Valiza A. Ginting

This article is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.




