Keywords: object, class, knowledge base, outliers, informative weight
CONSTRUCTION OF A LOGICAL ALGORITHM FOR DETECTING EMISSIONS INTO A DISTURBABLE DATA
UDC 519.7
DOI: 10.26102/2310-6018/2018.23.4.011
The paper proposes a logical approach to data quality analysis for machine-learning problems. When a machine-learning algorithm is developed, part of the initial data of the problem is collected into a training sample. As a rule, the quality of this data is far from ideal, which is an acute problem in the construction of trainable recognition systems. Because the recognition model is built by sequentially presenting the initial data, errors in that data can significantly distort the final model and, in turn, degrade the results of the recognition algorithms. Data that introduce such distortions are called outliers. Outliers are caused by equipment interference, incorrect interpretation by an expert, noise, and so on. This raises the task of analyzing data to identify outliers and reduce their influence on the formation (training) of the working model. At the same time, it is important to distinguish the individual features of recognized objects from anomalous data. This work proposes logical methods of data analysis that allow data to be classified. The classifier is constructed as a function that is a logical combination of production rules. It solves several problems: it builds all possible classes, reveals the individual characteristics of the objects in the data set, and identifies suspicious objects and their features. Based on the output of the constructed classifier, the suspicious objects it identifies can be examined further for membership in the set of outliers, taking the obtained estimate into account. The proposed approach makes it possible not only to form a training sample for each class, but also to identify outliers, that is, objects that cannot serve as reference examples (standards) in the training sample.
The method proposed in this paper can serve as the basis for a procedure that improves the informational quality of a training sample in the subject area under study.
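The idea of flagging suspicious objects in a labelled training sample can be illustrated with a minimal sketch. This is not the logical algorithm of the paper: it is a simple frequency-based stand-in, assuming that each object is a tuple of discrete feature values with a class label, so that every object can be read as a production rule "feature values imply class". The names `support_scores`, `suspicious` and the `threshold` parameter are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def support_scores(samples):
    """Score each object by how well its feature values support its own label.

    samples: non-empty list of (features, label) pairs, where features is a
    non-empty tuple of discrete values. Returns one score in [0, 1] per object.
    Assumption for this sketch: the weight of a feature value for a class is
    the share of objects carrying that value that belong to the class.
    """
    value_total = Counter()     # (feature index, value) -> number of objects
    value_in_class = Counter()  # (feature index, value, label) -> number of objects
    for features, label in samples:
        for j, v in enumerate(features):
            value_total[(j, v)] += 1
            value_in_class[(j, v, label)] += 1

    scores = []
    for features, label in samples:
        weights = [
            value_in_class[(j, v, label)] / value_total[(j, v)]
            for j, v in enumerate(features)
        ]
        scores.append(sum(weights) / len(weights))
    return scores


def suspicious(samples, threshold=0.5):
    """Return indices of objects whose support score is below the threshold."""
    return [i for i, s in enumerate(support_scores(samples)) if s < threshold]


# Toy training sample: the last object has class "a" features but label "b".
samples = [
    ((1, 1, 0), "a"),
    ((1, 1, 0), "a"),
    ((1, 1, 1), "a"),
    ((0, 0, 1), "b"),
    ((0, 0, 1), "b"),
    ((1, 1, 0), "b"),
]
print(suspicious(samples))  # → [5]
```

In this sketch a flagged object is only a candidate: as the abstract notes, suspicious objects still have to be examined further before they are treated as outliers, because a low score may also reflect a genuine individual feature of the object.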
For citation: Lutikova L.A. CONSTRUCTION OF A LOGICAL ALGORITHM FOR DETECTING EMISSIONS INTO A DISTURBABLE DATA. Modeling, Optimization and Information Technology. 2018;6(4). URL: https://moit.vivt.ru/wp-content/uploads/2018/10/Lyutikova_4_18_1.pdf DOI: 10.26102/2310-6018/2018.23.4.011 (In Russ).
Published 31.12.2018