Interpreted reinforcement learning to optimize the operational efficiency of enterprises in the context of digital transformation

Prokhorova O.K., Petrova E.S.

UDC 004.8:658.5.011.56
DOI: 10.26102/2310-6018/2025.50.3.001

Abstract

In the context of the digital transformation of education, MOOC platforms need to optimize their operational processes while maintaining the quality of education. Traditional approaches to resource management often fail to account for complex temporal patterns in user behavior and for individual learning characteristics. This paper proposes a solution based on interpretable reinforcement learning (RL) combined with the Shapley value method for analyzing the contribution of individual factors. The study shows how data on activity time, user IDs, learning goals, and other parameters can be used to train an RL agent that optimizes the allocation of platform resources. The approach makes it possible to quantify the contribution of each factor to operational efficiency, to identify hidden temporal patterns of user activity, and to personalize load management during peak periods. The article presents a mathematical justification of the method, a practical implementation in MATLAB, and test results that showed a reduction in operating costs together with an increase in user satisfaction. Special attention is paid to the interpretability of the RL agent's decisions, which is critically important in education. The work offers a ready-made methodology for deploying intelligent management systems in digital education, combining theoretical results with practical recommendations. The results open up new opportunities for improving the effectiveness of MOOC platforms amid growing competition in the educational technology market.
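The factor-contribution analysis mentioned in the abstract can be illustrated with a minimal sketch of exact Shapley value computation. This is not the authors' MATLAB implementation: the factor names and the `toy_efficiency` scoring function below are hypothetical stand-ins for an efficiency measure that, in the paper's setting, would come from the trained RL agent. For a factor i, the Shapley value averages its marginal contribution v(S ∪ {i}) − v(S) over all coalitions S of the remaining factors, weighted by |S|!(n − |S| − 1)!/n!.

```python
from itertools import combinations
from math import factorial


def shapley_values(factors, value):
    """Exact Shapley values for a set function `value` over `factors`.

    Enumerates every coalition, so it is only practical for a handful of factors.
    """
    n = len(factors)
    phi = {}
    for f in factors:
        others = [g for g in factors if g != f]
        total = 0.0
        for k in range(n):  # k = size of a coalition S that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when it joins coalition S
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_efficiency(coalition):
    """Hypothetical efficiency score (assumption, not taken from the paper)."""
    score = 0.0
    if "activity_time" in coalition:
        score += 4.0
    if "learning_goal" in coalition:
        score += 2.0
    if {"activity_time", "learning_goal"} <= coalition:
        score += 2.0  # interaction bonus when both factors are known
    return score  # "user_id" adds nothing in this toy model


FACTORS = ["activity_time", "user_id", "learning_goal"]
phi = shapley_values(FACTORS, toy_efficiency)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

In this toy model the interaction bonus is split equally between `activity_time` and `learning_goal`, `user_id` receives zero, and the contributions sum to the score of the full factor set, which is the efficiency property that makes Shapley values convenient for attributing an aggregate outcome to individual inputs.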

Prokhorova Olga Konstantinovna
Candidate of Economic Sciences

Voronezh Institute of High Technologies

Voronezh, Russian Federation

Petrova Elena Sergeevna

Voronezh State Technical University

Voronezh, Russian Federation

Keywords: reinforcement learning, Shapley value, operational efficiency, digital transformation, interpretable AI, business process optimization

For citation: Prokhorova O.K., Petrova E.S. Interpreted reinforcement learning to optimize the operational efficiency of enterprises in the context of digital transformation. Modeling, Optimization and Information Technology. 2025;13(3). URL: https://moitvivt.ru/ru/journal/pdf?id=1901 DOI: 10.26102/2310-6018/2025.50.3.001 (In Russ.).

Received 14.04.2025

Revised 23.05.2025

Accepted 24.06.2025