The scientific journal Modeling, Optimization and Information Technology
Online media
ISSN 2310-6018

Modeling and optimization of data collection process for artificial intelligence in medicine

Ivaschenko A.V., Terekhin M.A., Poretskova G.Y., Zhdanovich G.E., Melnikov D.A., Radaev D.E.

UDC 004.89
DOI: 10.26102/2310-6018/2026.55.4.020

Developing artificial intelligence technologies in medicine requires a systematic approach to collecting and processing structured datasets for training, testing, and validating machine learning models. This paper addresses the problem with simulation modeling based on queueing theory. Such modeling requires estimating the planned throughput of each data collection point while ensuring a sufficient number of patients, the availability and reliability of their medical information, and compliance with legal requirements on personal data protection and medical ethics. The proposed approach was studied by analyzing biomedical data collection processes designed to train artificial intelligence models for remote diagnostic methods. The empirical part of the study was conducted at biomedical signal collection points over a six-month period, with a total sample of 574 patients. A simulation model was developed to optimize the data collection process. According to the simulation, the average data collection intensity was 7.28 patients per day, with significant variability in the workload. During optimization, the data collection process was parallelized, which raised productivity by reducing the time spent on questionnaires and temperature measurements and by increasing patient throughput. Optimization increased the workload from 4.67 to 12.12 patients per day. The proposed approach makes it possible to validate the architecture of the organizational and technological data collection process before scaling, and it minimizes the risk of missing schedule deadlines for generating medical datasets.
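The arithmetic behind these figures can be illustrated with a minimal Python sketch. It is not the authors' simulation model. It reads the reported 4.67 and 12.12 patients per day as constant daily collection rates and reuses the sample size of 574 as a collection target, both of which are interpretations made only for illustration. The stage durations and shift length in the second part are hypothetical values, chosen to show why running stages in parallel moves the limit from the sum of stage times to the slowest stage.

```python
import math

# Figures reported in the abstract.
RATE_BEFORE = 4.67   # patients per day before optimization
RATE_AFTER = 12.12   # patients per day after optimization
SAMPLE_SIZE = 574    # total number of patients in the study


def days_to_collect(target_patients, patients_per_day):
    """Working days needed to reach a target at a constant daily rate."""
    if patients_per_day <= 0:
        raise ValueError("patients_per_day must be positive")
    return math.ceil(target_patients / patients_per_day)


def serial_capacity(stage_minutes, shift_minutes):
    """Patients per shift when every stage is done one after another."""
    return shift_minutes / sum(stage_minutes)


def parallel_capacity(stage_minutes, shift_minutes):
    """Patients per shift when stages run concurrently on different patients.

    In steady state the slowest stage is the bottleneck.
    """
    return shift_minutes / max(stage_minutes)


# Part 1: arithmetic on the reported rates (assumes a constant rate).
speedup = RATE_AFTER / RATE_BEFORE
days_before = days_to_collect(SAMPLE_SIZE, RATE_BEFORE)
days_after = days_to_collect(SAMPLE_SIZE, RATE_AFTER)

# Part 2: HYPOTHETICAL stage durations in minutes (not from the paper):
# questionnaire, temperature measurement, biomedical signal recording.
HYPOTHETICAL_STAGES = [10.0, 5.0, 15.0]
HYPOTHETICAL_SHIFT_MINUTES = 120.0
serial = serial_capacity(HYPOTHETICAL_STAGES, HYPOTHETICAL_SHIFT_MINUTES)
parallel = parallel_capacity(HYPOTHETICAL_STAGES, HYPOTHETICAL_SHIFT_MINUTES)

print(f"speedup: {speedup:.2f}x")
print(f"days for {SAMPLE_SIZE} patients: {days_before} -> {days_after}")
print(f"hypothetical capacity per shift: {serial:.1f} -> {parallel:.1f}")
```

Under the constant-rate assumption, the reported change corresponds to roughly a 2.6-fold increase, or about 123 versus 48 working days for a target of 574 patients. A queueing simulation such as the one described in the paper would additionally account for random arrivals and the workload variability noted above, which this sketch ignores.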

Ivaschenko Anton Vladimirovich
Doctor of Engineering Sciences, Professor

Samara State Medical University

Samara, Russian Federation

Terekhin Mikhail Aleksandrovich

Samara State Medical University

Samara, Russian Federation

Poretskova Galina Yuryevna
Doctor of Medical Sciences, Docent

Samara State Medical University

Samara, Russian Federation

Zhdanovich German Eduardovich

Volga State Transport University

Samara, Russian Federation

Melnikov Denis Alexeyevich

Penza State Technological University

Penza, Russian Federation

Radaev Dmitry Evgenievich

Penza State Technological University

Penza, Russian Federation

Keywords: medical dataset, simulation modeling, queueing theory, digital twin, throughput, artificial intelligence

For citation: Ivaschenko A.V., Terekhin M.A., Poretskova G.Y., Zhdanovich G.E., Melnikov D.A., Radaev D.E. Modeling and optimization of data collection process for artificial intelligence in medicine. Modeling, Optimization and Information Technology. 2026;14(4). URL: https://moitvivt.ru/ru/journal/article?id=2232 DOI: 10.26102/2310-6018/2026.55.4.020 (In Russ.).

© Ivaschenko A.V., Terekhin M.A., Poretskova G.Y., Zhdanovich G.E., Melnikov D.A., Radaev D.E. This article is published under the Creative Commons Attribution-NonCommercial 4.0 International license (CC BY-NC 4.0).
Received 16.02.2026

Revised 14.04.2026

Accepted 21.04.2026

Published 30.04.2026