The scientific journal Modeling, Optimization and Information Technology (Моделирование, оптимизация и информационные технологии)
Online media
ISSN 2310-6018

Mathematical model of a universal control system for a walking robot based on reinforcement learning methods

Kashko V.V., Oleinikova S.A.

UDC 519.857.3
DOI: 10.26102/2310-6018/2024.44.1.025


Modern approaches to controlling walking robots with rotary links are disparate algorithms built either on a predefined locomotor program that is subsequently adapted, or on complex kinematic-dynamic models that require extensive knowledge of the system's dynamics and the environment, which is often unattainable in applied problems. Moreover, existing approaches are tightly coupled to a specific robot configuration, which prevents their reuse for robots with a different number or type of limbs. This article proposes a universal approach to controlling the motion of walking robots based on the reinforcement learning methodology. A mathematical model of the control system is formulated in terms of finite discrete Markov decision processes within the reinforcement learning framework. The objective is to build a universal, adaptive control system capable of finding, through continuous interaction, an optimal strategy for executing a locomotor program in a previously unknown environment. The scientifically novel result is a mathematical model of this system that describes its operation by means of Markov chains; unlike existing analogues, it provides a unified description of the robot that does not depend on its configuration.
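The control scheme described above — a finite discrete Markov decision process whose optimal strategy is found through continuous interaction with an initially unknown environment — can be illustrated with a minimal tabular Q-learning sketch. This is only an illustration of the general technique: the "gait phase" state space, the reward values, and all hyperparameters below are hypothetical and are not taken from the paper.

```python
import random

N_PHASES = 4            # discrete phases of an abstract locomotor cycle
ACTIONS = [0, 1]        # 0 = hold the current phase, 1 = advance to the next

def step(state, action):
    """Toy environment: advancing the gait phase in order is rewarded."""
    if action == 1:
        next_state = (state + 1) % N_PHASES
        reward = 1.0        # correct phase transition
    else:
        next_state = state
        reward = -0.1       # stalling is penalised
    return next_state, reward

alpha, gamma, eps = 0.1, 0.9, 0.1   # learning rate, discount, exploration
Q = [[0.0 for _ in ACTIONS] for _ in range(N_PHASES)]

random.seed(0)
state = 0
for _ in range(5000):
    # epsilon-greedy action selection over the tabular Q-function
    if random.random() < eps:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # standard Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
    Q[state][action] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][action])
    state = next_state

# greedy policy extracted from the learned Q-table
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_PHASES)]
print(policy)
```

After training, the greedy policy advances the phase in every state, i.e. the agent recovers the cyclic locomotor program purely from interaction, without a kinematic-dynamic model — the property the article's universal control system relies on.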

1. Paulo J., Asdadi A., Peixoto P., Amorim P. Human gait pattern changes detection system: A multimodal vision-based and novelty detection learning approach. Biocybernetics and Biomedical Engineering. 2017;37(4):701–717.

2. Shimmyo S., Sato T., Ohnishi K. Biped walking pattern generation by using preview control based on three-mass model. IEEE Transactions on Industrial Electronics. 2012;60(11):5137–5147. DOI: 10.1109/TIE.2012.2221111.

3. Smith L., Kew J., Li T., Luu L., Peng X., Ha S., Tan J., Levine S. Learning and Adapting Agile Locomotion Skills by Transferring Experience. Robotics: Science and Systems XIX. 2023. DOI: 10.15607/RSS.2023.XIX.051 (accessed on 11.02.2024).

4. Braun D. J., Mitchell J. E., Goldfarb M. Actuated dynamic walking in a seven-link biped robot. IEEE/ASME Transactions on Mechatronics. 2010;17(1):147–156. DOI: 10.1109/TMECH.2010.2090891.

5. Bebek O., Erbatur K. A gait adaptation scheme for biped walking robots. The 8th IEEE International Workshop on Advanced Motion Control. 2004;409–414. DOI: 10.1109/AMC.2004.1297904.

6. Arakawa T., Fukuda T. Natural motion trajectory generation of biped locomotion robot using genetic algorithm through energy optimization. 1996 IEEE International Conference on Systems, Man and Cybernetics. Information Intelligence and Systems (Cat. No.96CH35929). 1996;2:1495–1500. DOI: 10.1109/ICSMC.1996.571368.

7. Luu T.P., Lim H.B., Hoon K.H., Qu X., Low K. H. Subject-specific gait parameters prediction for robotic gait rehabilitation via generalized regression neural network. 2011 IEEE International Conference on Robotics and Biomimetics. 2011;914–919. DOI: 10.1109/ROBIO.2011.6181404.

8. Ouyang W., Chi H., Pang J., Liang W., Ren Q. Adaptive Locomotion Control of a Hexapod Robot via Bio-Inspired Learning. Front Neurorobot. 2021;15:627157. DOI: 10.3389/fnbot.2021.627157.

9. Hrdlicka I., Kutilek P. Reinforcement learning in control systems for walking hexapod robots. Cybernetic Letters. 2005;3:1–13.

10. Fu H., Tang K., Li P., Zhang W., Wang X., Deng G., Wang T., Chen C. Deep Reinforcement Learning for Multi-contact Motion Planning of Hexapod Robots. Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence. 2021:2381–2388. DOI: 10.24963/ijcai.2021/328.

11. Geng T., Porr B., Wörgötter F. Fast biped walking with a sensor-driven neuronal controller and real-time online learning. The International Journal of Robotics Research. 2006;25(3):243–259.

12. Schilling M., Konen K., Ohl F.W., Korthals T. Decentralized Deep Reinforcement Learning for a Distributed and Adaptive Locomotion Controller of a Hexapod Robot. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA; 2020. p. 5335–5342. DOI: 10.1109/IROS45743.2020.9341754.

13. Tien Y., Yang C., Hooman S. Reinforcement learning and convolutional neural network system for firefighting rescue robot. MATEC Web of Conferences. 2018;161. DOI: 10.1051/matecconf/201816103028.

14. Sutton R., Barto A. Reinforcement Learning: An Introduction. Second edition. Moscow, DMK Press; 2020. 552 p. (In Russ.).

Kashko Vasily Vasilievich

Voronezh State Technical University

Voronezh, Russia

Oleinikova Svetlana Alexandrovna
Doctor of Technical Sciences, professor


Voronezh State Technical University

Voronezh, Russia

Keywords: control system, reinforcement learning, Markov decision processes, neural networks, walking robot, artificial intelligence

For citation: Kashko V.V., Oleinikova S.A. Mathematical model of a universal control system for a walking robot based on reinforcement learning methods. Modeling, Optimization and Information Technology. 2024;12(1). URL: https://moitvivt.ru/ru/journal/pdf?id=1520 DOI: 10.26102/2310-6018/2024.44.1.025 (In Russ.).


Full text in PDF

Received 15.02.2024

Revised 18.03.2024

Accepted 21.03.2024

Published 31.03.2024