The scientific journal Modeling, Optimization and Information Technology
Online media
ISSN 2310-6018

Architecture of a distributed multimodal medical data analysis system based on variational semantic alignment

Pozharsky R.V., Ryndin A.A.

UDC 004.89:616-073
DOI: 10.26102/2310-6018/2026.56.5.010

The article presents the architecture of a distributed system for intelligent analysis of multimodal medical data (DICOM imaging studies and text clinical reports) that combines variational inference methods with modern MLOps engineering practices. The key problem addressed is the integration of these heterogeneous data under real-world time and computational constraints. The main scientific contribution is the formalization and implementation of a new semantic alignment criterion conditioned on unobserved, clinically significant latent factors. This criterion, maximized through the Evidence Lower Bound (ELBO) of variational inference, ensures deep integration of the modalities grounded in a common pathophysiological basis rather than in superficial correlations. On the practical side, a fault-tolerant distributed infrastructure based on Docker, Apache Spark, MinIO, and MLflow has been developed and deployed; it covers the complete data lifecycle, from storage and distributed processing to experiment tracking. For adaptive load management, a reinforcement learning-based controller is proposed and implemented; it formalizes patient routing between a fast pipeline (deterministic algorithms) and a deep pipeline (full ViT+BERT models) as a partially observable Markov decision process (POMDP). The architectural framework and the mathematical model of variational semantic alignment are presented. Experiments on synthetic data confirmed the correctness of the software implementation in the WSL2/Docker environment and the efficient interaction of the Spark and MinIO components. The next stage of the research is to scale the system to the full MIMIC-CXR dataset for clinical validation of the proposed hypotheses.
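
For orientation, the kind of objective the abstract refers to can be sketched with the textbook Evidence Lower Bound for two modalities that share a latent variable. This is only an illustration and not necessarily the authors' criterion, which the abstract does not state. The notation is assumed here: $x_v$ for the image modality, $x_t$ for the report text, $z$ for the unobserved latent factors, $q_\phi$ for the variational posterior, and $p_\theta$ for the generative model, with the two modalities taken to be conditionally independent given $z$.

```latex
% Illustrative sketch only: a generic two-modality ELBO, not the article's own criterion.
% Assumed notation: x_v = image, x_t = report text, z = shared latent factors.
% Assumption: x_v and x_t are conditionally independent given z.
\begin{align}
\log p_\theta(x_v, x_t)
  \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x_v, x_t)}
    \bigl[\log p_\theta(x_v \mid z) + \log p_\theta(x_t \mid z)\bigr]
  \;-\;
  \mathrm{KL}\bigl(q_\phi(z \mid x_v, x_t) \,\|\, p(z)\bigr)
\end{align}
```

Under these assumptions, maximizing the right-hand side pushes both modalities to be explained by the same latent factors, which is one common way to read alignment through a shared cause rather than through direct correlation between the modalities.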

Pozharsky Roman Vitalievich

Voronezh Institute of High Technologies

Voronezh, Russian Federation

Ryndin Alexandr Alexeevich
Doctor of Engineering Sciences, Professor

Voronezh Institute of High Technologies

Voronezh, Russian Federation

Keywords: multimodal analysis, variational inference, semantic alignment, distributed computing, reinforcement learning, medical data, DICOM, MLOps

For citation: Pozharsky R.V., Ryndin A.A. Architecture of a distributed multimodal medical data analysis system based on variational semantic alignment. Modeling, Optimization and Information Technology. 2026;14(5). URL: https://moitvivt.ru/ru/journal/article?id=2229 DOI: 10.26102/2310-6018/2026.56.5.010 (In Russ.).

© Pozharsky R.V., Ryndin A.A. This article is published under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0)

Received 15.02.2026

Revised 17.04.2026

Accepted 11.05.2026