Keywords: machine learning, survival analysis, temporal quilting, Bayesian optimization, myocardial infarction
Application of "temporal quilting" method for survival analysis after myocardial infarction
UDC 314.48
DOI: 10.26102/2310-6018/2021.35.4.028
The importance of survival analysis in medical problems has led to the development of a variety of approaches to modeling the survival function. Models built with different machine learning methods have strengths and weaknesses in terms of discriminative performance and calibration, but no single model is best suited to all datasets or even to all time horizons within a single dataset. The research is relevant because basic models and ensemble approaches do not always make it possible to build an adequate survival model for different time horizons. This article therefore outlines the application of a new approach that combines various basic models to create a reliable survival function, allows fine tuning, and discriminates well across different time horizons. In the course of the study, six basic models for analyzing survival after myocardial infarction were described: semiparametric models (Cox proportional hazards model, Cox proportional hazards model with ridge regression), parametric models (logistic normal distribution model, logistic exponential distribution model, Weibull distribution model) and an ensemble model (random forest). The principal approach to solving this problem is an improved method, temporal quilting. In this study, this approach is compared with the basic methods in terms of accuracy and model calibration. The results show that the "temporal quilting" model is the most efficient, while the random forest model appears to be the least efficient. Since the enhanced approach automatically finds an approximation of the best-suited survival model, it enables clinicians to reduce the time spent searching for one specific survival model for each dataset and for each relevant time horizon.
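The idea of combining basic models over different time horizons can be illustrated with a minimal sketch. It is not the implementation used in the study: the function name `quilt_survival`, the chaining rule (a weighted mix of each base model's conditional survival within a horizon) and the hand-picked weights are all assumptions made for illustration. In the approach described above the weights would be found automatically, for example by Bayesian optimization, rather than fixed by hand.

```python
from bisect import bisect_right
from math import exp


def quilt_survival(times, base_survival, boundaries, weights):
    """Stitch base survival curves into one curve, horizon by horizon.

    times         : increasing time grid; curves are assumed to start at S = 1
    base_survival : list of M curves, each a list of S_m(t) on the grid
    boundaries    : sorted cut points that split the grid into K horizons
    weights       : K rows of M non-negative weights, each row summing to 1
    """
    quilted = []
    anchor_value = 1.0                        # quilted survival at horizon start
    anchor_base = [1.0] * len(base_survival)  # base survival at horizon start
    current = bisect_right(boundaries, times[0])
    for j, t in enumerate(times):
        k = bisect_right(boundaries, t)       # horizon that contains time t
        if k != current:
            # Entering a new horizon: continue from the previous grid point.
            anchor_value = quilted[j - 1]
            anchor_base = [curve[j - 1] for curve in base_survival]
            current = k
        # Survival of each base model conditional on reaching the horizon start.
        conditional = [curve[j] / a for curve, a in zip(base_survival, anchor_base)]
        mixed = sum(w * c for w, c in zip(weights[current], conditional))
        quilted.append(anchor_value * mixed)
    return quilted


# Hypothetical example: two exponential base models, one cut point at t = 2.
times = [0, 1, 2, 3, 4]
slow = [exp(-0.1 * t) for t in times]
fast = [exp(-0.3 * t) for t in times]
curve = quilt_survival(times, [slow, fast], boundaries=[2.0],
                       weights=[[1.0, 0.0], [0.0, 1.0]])
```

Because each conditional survival term is non-increasing and the weights are non-negative, the stitched curve stays a valid, non-increasing survival function while different base models dominate in different horizons.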
For citation: Firyulina M.A. Application of "temporal quilting" method for survival analysis after myocardial infarction. Modeling, Optimization and Information Technology. 2021;9(4). URL: https://moitvivt.ru/ru/journal/pdf?id=1080 DOI: 10.26102/2310-6018/2021.35.4.028
Received 10.11.2021
Revised 22.12.2021
Accepted 29.12.2021
Published 31.12.2021