ZERO-DAY SOCIAL ENGINEERING ATTACK DETECTION USING NLP AND OPEN-SET DEEP LEARNING

Authors

  • Sahren Sahren, Sekolah Tinggi Manajemen Informatika dan Komputer Royal
  • Ruri Ashari Dalimunthe
  • Bima Aditya

DOI:

https://doi.org/10.54314/jssr.v9i2.6120

Keywords:

NLP, OSIDS, Social Engineering Attack, TF-IDF Trigram, Zero-Day

Abstract

Text-based social engineering attacks are a growing cyber threat that conventional intrusion detection systems struggle to detect, especially when the variants are previously unobserved (zero-day). This study proposes a Natural Language Processing Open-Set Intrusion Detection System (NLP-OSIDS) framework that combines a Term Frequency-Inverse Document Frequency (TF-IDF) feature representation over unigrams through trigrams (1-3-grams) with an Open-Set Multilayer Perceptron that uses energy-based scoring, so that zero-day social engineering attacks can be detected without any training examples from that class. Experiments were conducted on the public dataset phishing_email.csv, which contains 82,486 samples combined from the Enron, SpamAssassin, Nazario, Ling, CEAS, and Nigerian Fraud datasets, using strict zero-day partitioning that follows open-set recognition evaluation standards. NLP-OSIDS achieved an AUROC of 0.7808, surpassing all closed-set baselines (AUROC = 0.500), with the lowest False Positive Rate, 0.0088. Its Zero-Day Detection Rate (ZD-DR) of only 0.077, however, indicates that adaptive threshold optimization is needed and is left as a direction for further research.
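The scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes word-level n-grams, uses the standard energy score E(x) = -T log Σ_k exp(z_k / T) over classifier logits, and picks the rejection threshold from known-class samples at a target false positive rate. The function names, the logits, and the threshold rule are all hypothetical and chosen only for illustration.

```python
import math

def word_ngrams(text, n_min=1, n_max=3):
    """Return word n-grams for n_min <= n <= n_max (word-level is an assumption)."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * log(sum_k exp(z_k / T)), computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp

def threshold_at_fpr(known_energies, target_fpr):
    """Choose a threshold so that at most target_fpr of known samples lie strictly above it."""
    ordered = sorted(known_energies)
    allowed = int(math.floor(target_fpr * len(ordered)))  # false positives tolerated
    return ordered[len(ordered) - 1 - allowed]

def flag_rate(energies, threshold):
    """Fraction of samples whose energy exceeds the threshold (flagged as unknown)."""
    return sum(e > threshold for e in energies) / len(energies)

# Hypothetical logits from a two-class closed-set classifier (not from the paper).
known_logits = [[6.0, 0.5], [0.2, 5.5], [4.8, 1.0], [1.2, 7.0]]
zero_day_logits = [[0.4, 0.6], [1.0, 0.9], [5.0, 4.0]]

known_energy = [energy_score(z) for z in known_logits]
zero_day_energy = [energy_score(z) for z in zero_day_logits]

threshold = threshold_at_fpr(known_energy, target_fpr=0.0)
false_positive_rate = flag_rate(known_energy, threshold)
zero_day_detection_rate = flag_rate(zero_day_energy, threshold)

print("n-grams:", word_ngrams("verify your account"))
print("FPR:", false_positive_rate, "ZD-DR:", round(zero_day_detection_rate, 3))
```

In this toy setup, confident known-class samples receive low energy, while samples with flat logits receive high energy and are flagged. A strict threshold keeps the false positive rate low but misses the zero-day sample whose logits still look confident, which is the same trade-off between FPR and ZD-DR that the abstract reports.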

Published

2026-05-01

Issue

Vol. 9 No. 2 (2026)

Section

Article

How to Cite

DETEKSI ZERO-DAY SOCIAL ENGINERING ATTACK MENGGUNAKAN NLP DAN OPEN-SET DEEP LEARNING [Zero-day social engineering attack detection using NLP and open-set deep learning]. (2026). JOURNAL OF SCIENCE AND SOCIAL RESEARCH, 9(2), 1900–1906. https://doi.org/10.54314/jssr.v9i2.6120