Keywords: machine learning, phishing, phishing link detection system, security operation center, explainable artificial intelligence, large language model
Phishing link detection system based on explainable AI technologies
UDC 004.056
DOI: 10.26102/2310-6018/2025.51.4.028
A set of models for analyzing the character strings of domain names in phishing link detection has been developed. The models are built as an ensemble of classifiers optimized for the target hardware platforms, which makes the analysis more efficient when the system is integrated into existing information security operation centers. Classifier models based on the CodeBERT transformer, fine-tuned on a prepared data set, are proposed. The models are optimized for CPU execution by translating them into compiled code, which increased their computational performance by 26 %. Testing on real data confirms high accuracy of malicious link detection on key metrics. Software with a microservice architecture has been developed for integration into the information system of the security operation center. Modules of a subsystem that explains each decision have been developed using explainable artificial intelligence methods: a query for a locally deployed large language model is composed from a description of the signs of malicious links, using zero-shot learning.
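The query composition for the explanation subsystem can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_zero_shot_prompt`, the `INDICATORS` list, and the prompt wording are all hypothetical, and the abstract does not enumerate the actual signs of malicious links. The sketch only shows the zero-shot idea, in which the prompt carries descriptions of the signs and no labeled example URLs.

```python
from urllib.parse import urlparse

# Hypothetical descriptions of signs of malicious links (assumption:
# the paper's actual list of signs is not given in the abstract).
INDICATORS = [
    "the host is a raw IP address instead of a domain name",
    "the domain imitates a well-known brand with substituted or extra characters",
    "the host contains an unusually large number of subdomains or hyphens",
    "the URL contains an '@' character or words such as 'login' or 'verify'",
]


def build_zero_shot_prompt(url: str, classifier_score: float) -> str:
    """Compose a zero-shot query: sign descriptions only, no labeled examples."""
    host = urlparse(url).hostname or ""
    lines = [
        "You are assisting a security operation center analyst.",
        f"A classifier scored this URL as phishing with probability {classifier_score:.2f}.",
        f"URL: {url}",
        f"Host: {host}",
        "Known signs of malicious links:",
    ]
    lines += [f"- {text}" for text in INDICATORS]
    lines.append(
        "Explain briefly which of these signs, if any, apply to the URL. "
        "Do not invent other evidence."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # The resulting text would be sent to the locally deployed model.
    print(build_zero_shot_prompt("http://paypa1-secure.example.com/login", 0.97))
```

Keeping the sign descriptions in a plain list separates them from the prompt template, so analysts could revise the signs without touching the code that talks to the model.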
For citation: Shaimardanov A.F., Vulfin A.M., Kirillova A.D., Minko A.V. Phishing link detection system based on explainable AI technologies. Modeling, Optimization and Information Technology. 2023;11(2). URL: https://moitvivt.ru/ru/journal/pdf?id=2066. DOI: 10.26102/2310-6018/2025.51.4.028.
Received 03.09.2025
Revised 15.10.2025
Accepted 27.10.2025
Published 30.06.2023