Keywords: operation management, reinforcement learning, risk-based approach, clustering, energy efficiency, survivability of facilities, retail chains, digital twin
Adaptive risk-based management of retail network facilities based on clustering and reinforcement learning
UDC 658.87:519.876.5
DOI: 10.26102/2310-6018/2025.51.4.066
In the context of the elevated operational and energy risks typical of modern retail chains, a two-tier approach to facility operation management is proposed. The research addresses a key problem: the heterogeneity of risk profiles across network facilities, which calls for differentiated management strategies instead of unified regulations. At the strategic level, facilities are clustered with Kohonen self-organizing maps (SOM) over a composite set of risk factors, including geospatial parameters (distance from operational tension zones), infrastructural indicators (proximity to critical infrastructure, reliability of power grids), operational metrics (logistical stability, incident history), and socio-economic indicators. The cluster analysis identified four clearly differentiated categories of facilities: critical, high-risk, logistically vulnerable, and stable. At the tactical level, a dedicated reinforcement learning (RL) model was developed for each cluster to adapt operational policies in real time. Formalizing the task as a Markov decision process (MDP) made it possible to optimize control actions (maintenance, energy management, redundancy) with respect to cluster-specific goals. A key feature of the methodology is the customization of reward functions: priority is given to maximizing survivability for critical facilities, energy efficiency for stable ones, and balanced strategies for the intermediate clusters. Experimental validation was performed on a synthesized dataset of 100 facilities using modern machine learning libraries (Stable-Baselines3, Gymnasium, Scikit-learn) in a Docker containerized environment under WSL2.
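The strategic-level SOM clustering described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the feature names, the synthetic data, and the 2x2 map size (one node per risk category) are illustrative assumptions; only the idea of mapping 100 facilities onto four SOM nodes follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic risk features for 100 facilities (hypothetical stand-in for the
# paper's dataset): distance to tension zones, power-grid reliability,
# logistical stability, incident rate -- all scaled to [0, 1].
X = rng.random((100, 4))

# A 2x2 SOM: four nodes, matching the four risk categories in the study.
n_nodes, dim = 4, X.shape[1]
W = rng.random((n_nodes, dim))                          # node weight vectors
grid = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # node positions

for t in range(2000):
    x = X[rng.integers(len(X))]                         # random training sample
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))         # best-matching unit
    lr = 0.5 * np.exp(-t / 1000)                        # decaying learning rate
    sigma = 1.0 * np.exp(-t / 1000)                     # decaying neighborhood radius
    d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)          # grid distance to the BMU
    h = np.exp(-d2 / (2 * sigma ** 2))                  # Gaussian neighborhood
    W += lr * h[:, None] * (x - W)                      # pull nodes toward x

# Assign each facility to its nearest node, i.e. its risk cluster.
labels = np.argmin(((X[:, None, :] - W[None]) ** 2).sum(axis=2), axis=1)
```

In practice one would use a dedicated SOM library and the real risk features; the decaying learning rate and neighborhood radius are the standard Kohonen schedule.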
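The tactical-level reward customization can likewise be sketched in a few lines. The cluster names come from the abstract; the weight values and the function signature are purely illustrative assumptions, not the reward functions actually used in the study.

```python
# Hypothetical per-cluster reward weights (illustrative values only).
WEIGHTS = {
    "critical":                {"survivability": 0.7, "energy": 0.1, "cost": 0.2},
    "high_risk":               {"survivability": 0.5, "energy": 0.2, "cost": 0.3},
    "logistically_vulnerable": {"survivability": 0.4, "energy": 0.2, "cost": 0.4},
    "stable":                  {"survivability": 0.1, "energy": 0.7, "cost": 0.2},
}

def step_reward(cluster: str, survivability: float,
                energy_saving: float, op_cost: float) -> float:
    """Scalar reward for one MDP step: weighted gains minus weighted cost.

    Critical facilities weight survivability most heavily; stable
    facilities weight energy efficiency, per the paper's stated priorities.
    """
    w = WEIGHTS[cluster]
    return (w["survivability"] * survivability
            + w["energy"] * energy_saving
            - w["cost"] * op_cost)
```

In the full pipeline such a function would be called inside a Gymnasium environment's `step()` and optimized per cluster with a Stable-Baselines3 agent such as PPO.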
For citation: Ustimov M.G., Prokhorova O.K., Zalozhnikh D.O. Adaptive risk-based management of retail network facilities based on clustering and reinforcement learning. Modeling, Optimization and Information Technology. 2025;13(4). URL: https://moitvivt.ru/ru/journal/pdf?id=2142 DOI: 10.26102/2310-6018/2025.51.4.066 (In Russ.)
Received 27.11.2025
Revised 22.12.2025
Accepted 26.12.2025