Keywords: monitoring system, time series, IT service, resource-service model, service management system, AIOps, big data
AIOps system software module architecture for collecting and processing IT infrastructure time series
UDC 004.042
DOI: 10.26102/2310-6018/2023.42.3.027
The article examines how an AIOps system for IT infrastructure monitoring collects time series data and processes it in real time. The study is relevant because large enterprises and organizations with highly digitalized production processes show growing interest in systems of this class. Collecting such data has several distinctive features. First, software modules must be designed for a significant load (collecting and processing about 10 million metrics per minute). Second, data is seldom collected directly from end devices; other monitoring systems serve as the sources instead. The current state of the IT infrastructure must also be taken into account: it is highly dynamic owing to the development and widespread adoption of hardware virtualization, application containerization and automated configuration management. Based on a comparison of the approaches to collecting and processing time series data implemented in various monitoring tools, the paper concludes that applying and developing the Prometheus approach in AIOps monitoring systems is promising. The authors propose their own adaptation and development of this approach. Its distinctive features are a multi-status model of thresholds with a lifetime and the indirect establishment of links between objects in the resource-service model and the collected metric information. Together they help to implement the metric collection and processing functionality that enterprises require from an AIOps monitoring system under the high load and dynamism of modern IT infrastructure.
In conclusion, the results of preliminary testing of the developed software module are presented, and the possibility of using the proposed approach to control the degree of monitoring object coverage is highlighted. The described architecture is currently used in the commercial software product "MONQ" and is being tested at several key Russian enterprises.
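The two distinctive features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the threshold semantics or the linking mechanism, so everything below is an assumption made for illustration. In particular, the names `Threshold`, `Sample`, `evaluate` and `matches`, the status names, the reading of "lifetime" as the period during which a status stays valid without a fresh sample, and Prometheus-style label matching as the "indirect" link are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    status: str        # hypothetical status name, e.g. "warning" or "critical"
    limit: float       # the status applies when the value is >= limit
    lifetime_s: float  # assumed meaning: seconds the status stays valid


@dataclass(frozen=True)
class Sample:
    labels: dict       # Prometheus-style label set attached to the metric
    value: float
    timestamp: float   # seconds


def evaluate(sample, thresholds, now, default="ok", expired="unknown"):
    """Return the status of the highest breached threshold.

    If the sample is older than that threshold's lifetime, the status is
    treated as expired (an assumed interpretation of "lifetime").
    """
    breached = [t for t in thresholds if sample.value >= t.limit]
    if not breached:
        return default
    top = max(breached, key=lambda t: t.limit)
    if now - sample.timestamp > top.lifetime_s:
        return expired
    return top.status


def matches(selector, labels):
    """Indirect link: an object of the resource-service model stores only a
    label selector, and any metric whose labels satisfy it is attached."""
    return all(labels.get(key) == value for key, value in selector.items())


thresholds = [
    Threshold("warning", 80.0, 300.0),
    Threshold("critical", 90.0, 120.0),
]
sample = Sample({"host": "db-1", "job": "node"}, 92.0, 1000.0)

status_fresh = evaluate(sample, thresholds, now=1060.0)
status_stale = evaluate(sample, thresholds, now=1200.0)
linked = matches({"host": "db-1"}, sample.labels)
```

Under these assumptions the object linked through the selector `{"host": "db-1"}` never references a concrete metric, so metrics that appear or disappear with containers and virtual machines need no changes to the model itself.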
For citation: Kamenev A.S., Sakharov Y.S. AIOps system software module architecture for collecting and processing IT infrastructure time series. Modeling, Optimization and Information Technology. 2023;11(3). URL: https://moitvivt.ru/ru/journal/pdf?id=1409 DOI: 10.26102/2310-6018/2023.42.3.027 (In Russ.).
Received 03.08.2023
Revised 29.08.2023
Accepted 27.09.2023
Published 30.09.2023