Method of human emotion recognition through analysis of body motor activity in a video stream using neural networks
The scientific journal Modeling, Optimization and Information Technology
ISSN 2310-6018

Uzdiaev M.Y., Dudarenko D.M., Mironov V.N.

UDC 004.032.26
DOI: 10.26102/2310-6018/2021.32.1.004

This paper applies several neural network models to the problem of recognizing human emotions from body motor activity in video stream frames, without complex preprocessing of those frames. It considers three-dimensional convolutional neural networks, Inception 3D (I3D) and Residual 3D (R3D), as well as convolutional-recurrent architectures that combine a ResNet convolutional network with LSTM and GRU recurrent networks (ResNet + LSTM, ResNet + GRU). These models require no preliminary processing of images or the video stream and can potentially achieve high emotion recognition accuracy. Based on these architectures, a method for recognizing human emotions from body motor activity in a video stream is proposed. The paper discusses the architectural features of the models, the way each model processes video frames, and the recognition results in terms of the following quality metrics: the proportion of correctly recognized instances (accuracy), precision, and recall. Evaluation of the proposed I3D, R3D, ResNet + LSTM, and ResNet + GRU models on the FABO data set showed high-quality emotion recognition based on body motor activity. The R3D model achieved the best accuracy, 91%. The other proposed models, I3D, ResNet + LSTM, and ResNet + GRU, achieved 88%, 80%, and 80% accuracy, respectively. According to these experimental results, the three-dimensional convolutional models I3D and R3D are therefore the preferable choice for recognizing a person's emotional state from motor activity, judged by the full set of classification quality metrics.
In contrast to most existing solutions, the proposed models recognize emotions from RGB video frames without resource-intensive preprocessing and can do so in real time with high accuracy.
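
The three quality metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how precision and recall are aggregated across emotion classes, so macro-averaging is an assumption here. The function name `classification_metrics` and the emotion labels in the usage note are hypothetical and are not taken from the paper or the FABO data set.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy plus macro-averaged precision and recall.

    Macro-averaging is an assumption; the abstract does not specify it.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # instances per actual class
    pred_count = Counter(y_pred)   # instances per predicted class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    # Accuracy: proportion of correctly recognized instances.
    accuracy = sum(correct.values()) / len(y_true)

    # Per-class precision and recall, with 0.0 when the denominator is empty.
    precision = [correct[c] / pred_count[c] if pred_count[c] else 0.0 for c in labels]
    recall = [correct[c] / true_count[c] if true_count[c] else 0.0 for c in labels]

    return {
        "accuracy": accuracy,
        "precision": sum(precision) / len(labels),
        "recall": sum(recall) / len(labels),
    }
```

For example, `classification_metrics(["anger", "anger", "fear", "happiness"], ["anger", "fear", "fear", "happiness"])` scores three of four clips as correct, so accuracy is 0.75, while precision and recall are averaged over the three classes.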

Uzdiaev Mikhail Yurievich

Email: m.y.uzdiaev@gmail.com


St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

Saint-Petersburg, Russian Federation

Dudarenko Dmitry Mikhailovich


St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

Saint-Petersburg, Russian Federation

Mironov Viktor Nikolaevich

St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences

Saint-Petersburg, Russian Federation

Keywords: neural network model, emotion recognition, convolutional neural networks, machine learning, image processing, video stream

For citation: Uzdiaev M.Y., Dudarenko D.M., Mironov V.N. Method of human emotion recognition through analysis of body motor activity in a video stream using neural networks. Modeling, Optimization and Information Technology. 2021;9(1). Available from: https://moitvivt.ru/ru/journal/pdf?id=929 DOI: 10.26102/2310-6018/2021.32.1.004 (In Russ.).


Revised 15.02.2021

Published 31.03.2021