Keywords: number series, anomalies, outliers, filtering, hampel
Quick search for anomalies in number series using the modified Hampel method
UDC 004.942 + 519.246.8
DOI: 10.26102/2310-6018/2023.43.4.030
The article discusses and formally introduces the concepts of a number series anomaly and an anomaly filter function. The research is relevant because there is no unified approach to defining an anomaly, even though anomalies play a key role in solving many practical problems. The robustness of the chosen statistical outlier estimator is assessed using breakdown points and sliding windows. The method for filtering outliers from a number series is based on a combination of the median and the median absolute deviation. A modification of the Hampel method for detecting outliers in a sample is proposed for a wide range of IT automation tasks. Functions that filter a number series for anomalies and determine the index of the first anomalous element are implemented in Python. As an example, a Jupyter Notebook script was developed to solve the problem of quickly finding anomalies in stock prices with the modified Hampel method. To obtain a sample with outliers, the author's library is used to generate test stock data. The experimental results confirm that the proposed algorithms can clearly filter anomalies across different values of the adjustable parameters. The advantages and disadvantages of the method are noted. The Hampel filter is easy to optimize and parallelize. The results are of practical use for automating the detection of anomalies in number series.
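The combination of median, median absolute deviation (MAD) and sliding window mentioned above can be illustrated with a minimal sketch of the classical Hampel filter. This is not the authors' modified method, whose details are not given in this abstract. The function names `hampel_flags` and `first_anomaly_index`, the default window of 5 elements, the threshold of 3 scaled MADs and the sample prices are all assumptions made for illustration.

```python
from statistics import median

# Scale factor that makes the MAD comparable to the standard deviation
# for normally distributed data (a standard constant, not from the article).
MAD_SCALE = 1.4826


def hampel_flags(series, window=5, n_sigmas=3.0):
    """Flag each element that lies too far from the median of its window.

    Hypothetical helper: a centered window of about `window` elements is
    used, truncated at the edges of the series.
    """
    half = window // 2
    flags = []
    for i, value in enumerate(series):
        win = series[max(0, i - half): i + half + 1]
        med = median(win)
        mad = median(abs(x - med) for x in win)
        # Strict comparison: in a constant window (MAD == 0) nothing is flagged.
        flags.append(abs(value - med) > n_sigmas * MAD_SCALE * mad)
    return flags


def first_anomaly_index(series, window=5, n_sigmas=3.0):
    """Return the index of the first flagged element, or -1 if there is none."""
    flags = hampel_flags(series, window, n_sigmas)
    return flags.index(True) if True in flags else -1


# Illustrative data only: a flat price series with one spike.
prices = [10.0, 10.1, 9.9, 10.2, 50.0, 10.0, 10.1, 9.9, 10.0]
print(first_anomaly_index(prices))  # prints 4
```

Because the median and the MAD both tolerate a large share of contaminated points in a window, a single spike barely shifts the threshold, which is the property that breakdown-point analysis measures. Each window is evaluated independently of the others, so the loop could in principle be split across workers.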
For citation: Gilmullin T.M., Gilmullin M.F. Quick search for anomalies in number series using the modified Hampel method. Modeling, Optimization and Information Technology. 2023;11(4). URL: https://moitvivt.ru/ru/journal/pdf?id=1482 DOI: 10.26102/2310-6018/2023.43.4.030 (In Russ.).
Received 04.12.2023
Revised 08.12.2023
Accepted 20.12.2023
Published 31.12.2023