Adaptive risk-oriented management of the operation of retail network facilities based on clustering and reinforcement learning
The scientific journal Modeling, Optimization and Information Technology
Online media
ISSN 2310-6018

Adaptive risk-based management of retail network facilities based on clusterization and training with reinforcements

Ustimov M.G.,  Prokhorova O.K.,  Zalozhnikh D.O. 

UDC 658.87:519.876.5
DOI: 10.26102/2310-6018/2025.51.4.066

Modern retail chains face elevated operational and energy risks, and this paper proposes a two-tier approach to managing facility operation in that setting. The study addresses the heterogeneity of risk profiles across network facilities, which calls for differentiated management strategies rather than uniform regulations. At the strategic level, facilities are clustered with Kohonen self-organizing maps (SOM) over a set of risk factors: geospatial parameters (distance from zones of operational tension), infrastructure indicators (proximity to critical infrastructure, power grid reliability), operational metrics (logistics stability, incident history), and socio-economic indicators. The cluster analysis identified four clearly differentiated categories of facilities: critical, high-risk, logistically vulnerable, and stable. At the tactical level, a dedicated reinforcement learning model was developed for each cluster to adapt operational policies in real time. Formalizing the task as a Markov decision process made it possible to optimize control actions (maintenance, energy management, redundancy) with respect to cluster-specific goals. A key feature of the methodology is the customization of reward functions: survivability is prioritized for critical facilities, energy efficiency for stable ones, and balanced strategies for the intermediate clusters. Experimental validation was performed on a synthetic dataset of 100 facilities using the Stable-Baselines3, Gymnasium, and scikit-learn libraries in a containerized Docker environment on WSL2.
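
The cluster-specific reward customization described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the reward formula nor any coefficients, so everything here is an assumption: the linear weighted form, the weight values, and the names `REWARD_WEIGHTS`, `StepOutcome`, and `cluster_reward` are all hypothetical. Only the four cluster categories and the stated priorities (survivability for critical facilities, energy efficiency for stable ones, a balance in between) come from the text.

```python
from dataclasses import dataclass

# Hypothetical (survivability, energy efficiency) weights per cluster.
# The values are illustrative assumptions, not figures from the paper.
REWARD_WEIGHTS = {
    "critical": (0.9, 0.1),
    "high_risk": (0.7, 0.3),
    "logistically_vulnerable": (0.5, 0.5),
    "stable": (0.2, 0.8),
}


@dataclass(frozen=True)
class StepOutcome:
    """Assumed per-step metrics of one facility, both scaled to [0, 1]."""
    survivability: float      # 1.0 = facility stayed fully operational
    energy_efficiency: float  # 1.0 = lowest energy use for the step


def cluster_reward(cluster: str, outcome: StepOutcome) -> float:
    """Weighted sum of the two objectives using the cluster's weights."""
    w_surv, w_energy = REWARD_WEIGHTS[cluster]
    return w_surv * outcome.survivability + w_energy * outcome.energy_efficiency


if __name__ == "__main__":
    # A step that keeps the facility running at the cost of high energy use.
    outcome = StepOutcome(survivability=1.0, energy_efficiency=0.0)
    for name in REWARD_WEIGHTS:
        print(name, cluster_reward(name, outcome))
```

Under these assumed weights, the same outcome is rewarded more in the critical cluster than in the stable one, so the two agents would be steered toward different policies. In a Gymnasium environment, a function of this kind would be called inside `step` to compute the reward for the cluster the environment represents.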

Ustimov Maxim Gennadievich

Voronezh Institute of High Technologies

Voronezh, Russian Federation

Prokhorova Olga Konstantinovna
Candidate of Economic Sciences

Voronezh Institute of High Technologies

Voronezh, Russian Federation

Zalozhnikh Daniil Olegovich

Voronezh Institute of High Technologies

Voronezh, Russian Federation

Keywords: operation management, reinforcement learning, risk-based approach, clustering, energy efficiency, survivability of facilities, retail chains, digital twin

For citation: Ustimov M.G., Prokhorova O.K., Zalozhnikh D.O. Adaptive risk-based management of retail network facilities based on clusterization and training with reinforcements. Modeling, Optimization and Information Technology. 2025;13(4). URL: https://moitvivt.ru/ru/journal/pdf?id=2142 DOI: 10.26102/2310-6018/2025.51.4.066 (In Russ.).

Received 27.11.2025

Revised 22.12.2025

Accepted 26.12.2025