Keywords: anomaly, outlier, time series, three-sigma rule, R language
Detecting anomalies in multidimensional time series using the R package
UDC 004.4
DOI: 10.26102/2310-6018/2021.34.3.001
This article addresses the task of finding anomalies in data when implementing predictive analytics systems. Predictive analytics has become very popular over the past few years. It helps banks approve loans or identify suspicious account activity, email providers filter spam, and retailers predict the likelihood of a purchase in order to attract customers. But predictive analytics is quite complex, so its implementation is fraught with difficulties. When companies take the traditional approach to predictive analytics (that is, treat it like any other type of analytics), they often face obstacles. This is why the field needs tools for detecting anomalies in data. Such tools should help identify outlying values so that they can be linked to the factors that cause them and recognized in the future. This article describes an R package that detects anomalies in multidimensional time series. The package can detect anomalies using three methods: the n-sigma method, the CUSUM method, and the fourth-order central moment method. It also searches for complex anomalies, which, because they are found in multidimensional data, are a direct indicator of errors in the system.
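As a rough illustration of the first of these methods, the n-sigma rule can be sketched in a few lines of base R. This is not the package's code or API: the helper `n_sigma_flags`, the toy data, and the rule that a time step counts as a "complex" anomaly when more than one variable is flagged are all assumptions made for the example.

```r
# Hypothetical helper (not the package's API): flag values lying farther
# than n standard deviations from the mean of the series.
n_sigma_flags <- function(x, n = 3) {
  abs(x - mean(x)) > n * sd(x)
}

# Deterministic toy data: two variables observed at 200 time steps.
t <- seq_len(200)
series <- data.frame(
  temperature = 20 + sin(t / 10),
  pressure    = 100 + 5 * cos(t / 7)
)

# Inject an outlier into both variables at the same time step.
series$temperature[50] <- 40
series$pressure[50]    <- 200

# Logical matrix: one row per time step, one column per variable.
flags <- sapply(series, n_sigma_flags, n = 3)

# Assumption: a time step is "complex" if more than one variable is flagged.
complex_steps <- which(rowSums(flags) > 1)
print(complex_steps)
```

With `n = 3` this is the three-sigma rule named in the keywords. A larger `n` makes the detector more conservative.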
1. Antonenko S.V. The effectiveness of machine learning-based banking security systems. Materials of the II International Scientific and Practical Conference "Trends and Prospects for the Development of the Banking System in Modern Economic Conditions" Bryansk, December 17-18, 2019 - Bryansk State University named after Academician I.G. Petrovsky, 2020; 97-102.
2. Time series (Time series data) [Electronic resource]. Loginom. Available at: https://wiki.loginom.ru/articles/time-series.html (accessed 12.03.2021).
3. Rafiqul I., Naznin S., Mohammad Ali M., Prohollad S., Bushra R. A Comprehensive Survey of Time Series Anomaly Detection in Online Social Network Data. International Journal of Computer Applications. 2017;180(3):13-22.
4. Dictionary of Economics and Mathematics - Multidimensional time series [Electronic resource]. Academic. Available at: https://economic_mathematics.academic.ru/2615/Многомерные_временные_ряды (accessed 12.03.2021).
5. Van Cuong S., Shcherbakov M.V. A data-driven method for remaining useful life prediction of multiple-component systems. Caspian J.: Control and High Technologies. 2019;1:33–44.
6. Chandola V., Banerjee A., Kumar V. Anomaly detection: A survey. ACM Computing Surveys. 2009;41(3).
7. Effective Approaches for Time Series Anomaly Detection [Electronic resource]. Towards Data Science. Available at: https://towardsdatascience.com/effective-approaches-for-time-series-anomaly-detection-9485b40077f1 (accessed 12.03.2021).
8. Anomaly Detection [Electronic resource]. Academic. Available at: https://dyakonov.org/2017/04/19/поиск-аномалий-anomaly-detection/ (accessed 12.03.2021).
9. He Q., Zheng Y.J., Zhang C.L., Wang H.Y. MTAD-TF: Multivariate Time Series Anomaly Detection Using the Combination of Temporal Pattern and Feature Pattern. Complexity. 2020;2020. DOI: https://doi.org/10.1155/2020/8846608.
10. Ephimov A.I. Methods for using neural networks to assess and enhance the photorealism of virtual reality. IVD. 2019;3(54). Available at: https://cyberleninka.ru/article/n/metody-primeneniya-neyronnyh-setey-dlya-otsenki-i-povysheniya-fotorealistichnosti-virtualnoy-realnosti (accessed 12.03.2021).
For citation: Rayushkin E.S., Shcherbakov M.V., Kazakov I.D., Kolesnikova V.O. Detecting anomalies in multidimensional time series using the R package. Modeling, Optimization and Information Technology. 2021;9(3). URL: https://moitvivt.ru/ru/journal/pdf?id=948 DOI: 10.26102/2310-6018/2021.34.3.001 (In Russ).
Received 14.03.2021
Revised 20.08.2021
Accepted 31.08.2021
Published 30.09.2021