Keywords: machine learning, medical diagnostics, atherosclerosis risk prediction, recurrent neural network LSTM, SHAP
Development of models for predicting atherosclerosis risk using machine learning methods
UDC 004.8
DOI: 10.26102/2310-6018/2021.33.2.023
Atherosclerosis is one of the most common and life-threatening diseases and can develop at an early age. In its initial stages atherosclerosis is difficult to detect; therefore, timely diagnosis requires approaches such as machine learning methods. This study develops models and algorithms for calculating the risk of developing atherosclerosis of the main arteries from the initial clinical characteristics of patients. A sample from the international MIMIC-III database was used as the training dataset. Because this data is structured as time series, recurrent deep neural networks of the LSTM architecture were applied. While solving the problem of predicting atherosclerosis, SHAP was used to identify the significant features most strongly associated with the risk of developing the disease. The neural network model trained on MIMIC-III data was compared with a model for calculating the risk of atherosclerosis developed on a regional dataset, obtained by examining patients in the Voronezh region as part of the general medical examination program. The quality of the developed models was assessed using sensitivity, specificity and ROC-AUC. The study identified similarities and differences between the developed models, concerning both the features included in the initial datasets and the predictors associated with a high risk of atherosclerosis.
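As an illustration of the quality indicators named above, the following minimal sketch computes sensitivity, specificity and ROC-AUC for a binary risk classifier. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are assumptions made for the example, and ROC-AUC is computed through its pairwise-ranking interpretation rather than with any particular library.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels 0/1.

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of diseased patients detected
    specificity = tn / (tn + fp)  # share of healthy patients cleared
    return sensitivity, specificity


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = atherosclerosis present, 0 = absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]  # predicted risk

# Assumed decision threshold of 0.5 for turning risk into a class label.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} ROC-AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is one reason the three indicators are usually reported together.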
1. Johnson A., Pollard T., Shen L. MIMIC-III, a freely accessible critical care database. Sci Data. 2016;3:160035. DOI: 10.1038/sdata.2016.35.
2. Harutyunyan H., Khachatrian H., Kale D. C. et al. Multitask learning and benchmarking with clinical time series data. Sci Data. 2019;6(96). DOI: 10.1038/s41597-019-0103-9.
3. Komorowski M., Celi L., Badawi O., Gordon A. and Faisal A. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care. Nature Medicine. 2018;24(11):1716-1720. DOI: 10.1038/s41591-018-0213-5.
4. Khokhlov R., Gaydashev A. and Akhmedzhanov N. Predictors of atherosclerotic lesions of limb arteries according to cardioangiological screening of the adult population. Rational Pharmacotherapy in Cardiology. 2015;11(5):470-476. DOI: 10.20996/1819-6446-2015-11-5-470-476.
5. Khokhlov R., Ostroushko N., Gaydashev A., Kirsanov D. and Akhmedzhanov N. Multi-channel volume sphygmography in cardioangiological screening of the adult population. Rational Pharmacotherapy in Cardiology. 2015;11(4):371-379. DOI: 10.20996/1819-6446-2015-11-4-371-379.
6. Demchenko M., Kashirina I. The development of the atherosclerosis diagnostic models under conditions of unbalanced classes. Journal of Physics: Conference Series. 2020;1479:012026. DOI: 10.1088/1742-6596/1479/1/012026.
7. Hochreiter S., Schmidhuber J. Long short-term memory. Neural Computation. 1997;9(8):1735-1780. DOI: 10.1162/neco.1997.9.8.1735.
8. Chollet F. Deep learning with Python. SPb: Piter, 2018:400.
9. Géron A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. Sebastopol, CA: O’Reilly Media, Inc.; 2019.
10. Shapley L.S. Notes on the n-Person Game -- II: The Value of an n-Person Game. Santa Monica, CA: RAND Corporation, 1951.
11. Ferreira A. Interpreting recurrent neural networks on multivariate time series. Available at: https://towardsdatascience.com/interpreting-recurrent-neural-networks-on-multivariate-time-series-ebec0edb8f5a (accessed 04.04.2021).
12. Molnar C. Interpretable machine learning. A Guide for Making Black Box Models Explainable. Available at: https://christophm.github.io/interpretable-ml-book/ (accessed 04.04.2021).
For citation: Kashirina I.L., Firyulina M.A., Demchenko M.V. Development of models for predicting atherosclerosis risk using machine learning methods. Modeling, Optimization and Information Technology. 2021;9(2). URL: https://moitvivt.ru/ru/journal/pdf?id=993 DOI: 10.26102/2310-6018/2021.33.2.023 (In Russ.).
Accepted 30.07.2021
Published 30.06.2021