Language models and ontologies, security threats in distributed system

Donskikh N.I.

UDC 004.056.5
DOI: 10.26102/2310-6018/2024.46.3.016

Abstract

Research on large language models and natural language processing systems has intensified with the emergence of new, latent, and serious risks, such as disruption of the output generation process and automated malicious requests. Synergistic scenarios for large language models are being developed. The main hypothesis of this study is that a system can be protected, with a given probability, against the generation of prohibited content and its "mixing" with the user query, and that accounting for ontological properties and relations, for example through an ontology library, improves search quality in practical tasks. The study uses analysis and synthesis, modeling and forecasting, expert-heuristic methods, probability theory, and decision theory. The main results are: 1) an analysis of the problems of applying large language models to achieve stability of the system infrastructure, summarized in a table of key methods; 2) a language model of network infrastructure stability based on estimates of word distributions under mixing, using the Bayesian method; 3) a similar language model, proposed and studied on the basis of an expert-heuristic approach to assessing risks (uncertainties in the system), in particular an information-entropy approach. The research can be extended by making the models (hypotheses) more complex and increasing the "depth" of risk accounting.

Donskikh Nikita Igorevich

Financial University under the Government of the Russian Federation

Moscow, Russia

Keywords: large language models, resilience, risks, information security, governance

For citation: Donskikh N.I. Language models and ontologies, security threats in distributed system. Modeling, Optimization and Information Technology. 2024;12(3). (In Russ.). URL: https://moitvivt.ru/ru/journal/pdf?id=1634 DOI: 10.26102/2310-6018/2024.46.3.016

Received 17.07.2024

Revised 30.07.2024

Accepted 02.08.2024

Published 30.09.2024