Using machine learning methods to predict the detection of crimes based on primary accounting documents

Bulgakov D.Y.

UDC 004.8
DOI: 10.26102/2310-6018/2021.33.2.030

Whether crimes are solved is one of the key indicators of the performance of law enforcement agencies. Despite improvements in investigation methods, the crime detection rate in the Russian Federation remains at 51%–56%. The article describes a method for constructing a mathematical model, a digital double of a registered crime. The source data for the model is an array of primary accounting documents covering about 341 thousand crimes committed in Primorsky Krai over 11 years, from 2010 to 2020. The model makes it possible to: forecast, with 88% confidence, whether a crime will be solved, using the formalized information contained in the primary accounting documents (statistical cards of Form No. 1 “On the detected crime”); audit unsolved crimes of previous years to identify those with a high probability of detection; and identify the features in the statistical cards that most strongly affect the detection forecast. The model is based on the machine learning algorithm “gradient boosting over decision trees”, as implemented in CatBoost, the open-source library from Yandex. The accuracy of the model was confirmed by preparing and verifying a forecast of investigation outcomes in January–June 2021 for 16,408 crimes committed in Primorsky Krai.
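As a rough illustration of the idea behind “gradient boosting over decision trees” and feature importance, the sketch below boosts one-split trees (stumps) on a tiny synthetic data set using only the Python standard library. It is not the article's model and does not use CatBoost: the field names (`suspect_known`, `witness_present`, `committed_at_night`), the toy labels, and the helper functions are all assumptions made for the example.

```python
import math

# Hypothetical binary card fields (assumed names, not taken from the article).
FEATURES = ["suspect_known", "witness_present", "committed_at_night"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_toy_cards():
    """Build 40 synthetic cards; the labels are invented for illustration."""
    X, y = [], []
    for known in (0, 1):
        for witness in (0, 1):
            for night in (0, 1):
                for k in range(5):
                    X.append((known, witness, night))
                    solved = known == 1 or (witness == 1 and k < 2)
                    y.append(1 if solved else 0)
    return X, y


def fit_stump(X, residuals, j):
    """Least-squares stump on binary feature j: one leaf value per side."""
    left = [r for x, r in zip(X, residuals) if x[j] == 0]
    right = [r for x, r in zip(X, residuals) if x[j] == 1]
    v0 = sum(left) / len(left) if left else 0.0
    v1 = sum(right) / len(right) if right else 0.0
    # Reduction in squared error compared with predicting 0 for every row.
    gain = v0 * v0 * len(left) + v1 * v1 * len(right)
    return v0, v1, gain


def train(X, y, rounds=50, lr=0.3):
    """Boost stumps on the log-loss gradient (label minus predicted probability)."""
    base = sum(y) / len(y)
    f0 = math.log(base / (1.0 - base))  # initial log-odds of "solved"
    scores = [f0] * len(X)
    stumps = []
    importance = [0.0] * len(X[0])
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        # Pick the feature whose stump explains the residuals best.
        j, (v0, v1, gain) = max(
            ((j, fit_stump(X, residuals, j)) for j in range(len(X[0]))),
            key=lambda item: item[1][2],
        )
        stumps.append((j, lr * v0, lr * v1))
        importance[j] += gain  # accumulate split gain per feature
        scores = [s + (lr * v1 if x[j] else lr * v0) for x, s in zip(X, scores)]
    return f0, stumps, importance


def predict_proba(model, x):
    """Probability that a card with feature tuple x ends up solved."""
    f0, stumps, _ = model
    return sigmoid(f0 + sum(v1 if x[j] else v0 for j, v0, v1 in stumps))


X, y = make_toy_cards()
model = train(X, y)
_, _, importance = model
for name, score in zip(FEATURES, importance):
    print(f"{name}: {score:.2f}")
print("P(solved | suspect known):", round(predict_proba(model, (1, 0, 0)), 3))
```

CatBoost itself differs from this toy in important ways (for example, it handles categorical fields natively and builds deeper trees), so the sketch only conveys the additive, residual-fitting principle and one simple way to rank features by accumulated split gain.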

Bulgakov Dmitry Yurevich

Email: dbulgakov7@yandex.ru


Federal government institution “The Main Informational Analytic Centre of the Ministry of Internal Affairs of the Russian Federation”
Management Academy of the Ministry of the Interior of Russia

Moscow, Russian Federation

Keywords: digital double, predictive model, crime, statistical cards, machine learning, artificial intelligence, CatBoost, gradient boosting, decision trees, feature importance

For citation: Bulgakov D.Y. Using machine learning methods to predict the detection of crimes based on primary accounting documents. Modeling, Optimization and Information Technology. 2021;9(2). Available from: https://moitvivt.ru/ru/journal/pdf?id=1010 DOI: 10.26102/2310-6018/2021.33.2.030 (In Russ).


Accepted 11.08.2021

Published 15.08.2021