Emotion analysis on video data using on-premise and cloud-based artificial intelligence solutions

Agamirov L.V., Agamirov V.L., Vestyak V., Toutova N.V., Bazunov S., Zelyanik Y.

UDC 004.89
DOI: 10.26102/2310-6018/2025.50.3.032

Abstract

The study is motivated by the growing need for accurate and interpretable emotion recognition from video data, which is crucial for human-centered technologies in education, medicine, and human–computer interaction. The article aims to identify the differences between, and the application prospects of, the on-premise DeepFace solution and the cloud-based GPT-4o (OpenAI) model when analyzing short video clips of emotional expressions. Methodologically, the study is an empirical comparative analysis: a moving average was used to smooth the time series of emotion scores and to assess their stability and cognitive interpretability. The results showed that DeepFace provides stable local processing and high resistance to artifacts, while GPT-4o demonstrates a capacity for complex semantic interpretation and high sensitivity to context. The effectiveness of a hybrid approach that combines computational autonomy with interpretative flexibility is substantiated. Combining local and cloud solutions thus opens up prospects for more accurate, adaptive, and scalable affective analysis systems. The article is of practical value to specialists in affective computing, interface design, and cognitive technologies.
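
The smoothing step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the window length or whether the average is trailing or centered, so the trailing window, the default of 5 frames, the function name `moving_average`, and the sample scores below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import deque
from typing import Iterable, List


def moving_average(scores: Iterable[float], window: int = 5) -> List[float]:
    """Trailing moving average over per-frame emotion scores.

    Assumption: the first frames are averaged over however many
    values are available, so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    recent = deque(maxlen=window)  # holds at most `window` latest scores
    smoothed = []
    for score in scores:
        recent.append(score)       # oldest score is dropped automatically
        smoothed.append(sum(recent) / len(recent))
    return smoothed


# Hypothetical per-frame "happy" scores (percent) with a one-frame spike.
happy = [10.0, 12.0, 90.0, 11.0, 13.0, 12.0]
print(moving_average(happy, window=3))
```

A wider window suppresses single-frame spikes more strongly but also delays the response to genuine changes in expression, so the window length is a trade-off between stability and responsiveness.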

Agamirov Levon Vladimirovich
Doctor of Engineering Sciences, Professor

National Research University "MPEI"
Moscow Technical University of Communications and Informatics, Moscow Aviation Institute

Moscow, Russian Federation

Agamirov Vladimir Levonovich
Candidate of Engineering Sciences

Moscow Technical University of Communications and Informatics
Moscow Aviation Institute

Moscow, Russian Federation

Vestyak Vladimir
Doctor of Physical and Mathematical Sciences, Docent

Moscow Aviation Institute

Moscow, Russian Federation

Toutova Natalia Vladimirovna
Candidate of Engineering Sciences, Docent

Moscow Technical University of Communications and Informatics

Moscow, Russian Federation

Bazunov Sergei

BRICS University

Moscow, Russian Federation

Zelyanik Yulia

Moscow Aviation Institute

Moscow, Russian Federation

Keywords: affective computing, emotion recognition, video data analysis, DeepFace, GPT-4o language model, hybrid analysis system, semantic text analysis, multimodal interaction, neural network interpretability, cognitive technologies

For citation: Agamirov L.V., Agamirov V.L., Vestyak V., Toutova N.V., Bazunov S., Zelyanik Y. Emotion analysis on video data using on-premise and cloud-based artificial intelligence solutions. Modeling, Optimization and Information Technology. 2025;13(3). URL: https://moitvivt.ru/ru/journal/pdf?id=1982 DOI: 10.26102/2310-6018/2025.50.3.032 (In Russ.).

Received 02.06.2025

Revised 29.07.2025

Accepted 06.08.2025

Published 30.09.2025