Method of training image classifiers using additional labels

Petrova I.S.

UDC 004.93'11
DOI: 10.26102/2310-6018/2025.49.2.041

Abstract

This paper develops a method for training image classifiers that accounts for relationships between classes, represented as additional labels. We analyze the loss functions used in classification and the approaches to incorporating additional labels into them. Based on this analysis, we build the method on a triplet loss with a flexible margin, derived from the original triplet loss. The flexible margin adjusts the distances between image embeddings according to the degree of difference between their classes, which makes it possible to model several levels of similarity between classes: category, group, and subgroup. In addition, we develop a triplet mining strategy that prevents the model’s weights from collapsing to zero, which would trap training in a trivial solution. The method is validated on two tasks: product classification and gastrointestinal disease classification. Applying it increased classification accuracy by 9% in the disease recognition task and by 6% in the product recognition task, and reduced the number of severe classification errors. The image embedding space formed by the triplet loss also supports clustering and recognition of new classes without additional model training.
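The idea of a margin that depends on how far apart two classes are can be illustrated with a minimal sketch. The abstract does not give the paper's actual margin formula, so everything specific here is an assumption made for illustration: the label layout `(category, group, subgroup)`, the helper names, the linear rule `margin = base_margin * degree`, the value `base_margin=0.2`, and the example product labels.

```python
import math


def difference_degree(label_a, label_b):
    """Degree of difference between two hierarchical labels.

    Labels are assumed to be tuples ordered from the coarsest level to the
    finest, e.g. (category, group, subgroup). The result is 0 for identical
    labels, 1 if only the subgroup differs, 2 if the group differs, and 3 if
    the category differs.
    """
    for level, (a, b) in enumerate(zip(label_a, label_b)):
        if a != b:
            return len(label_a) - level
    return 0


def euclidean(u, v):
    """Euclidean distance between two embeddings given as sequences of floats."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def flexible_margin_triplet_loss(anchor, positive, negative,
                                 anchor_label, negative_label,
                                 base_margin=0.2):
    """Triplet loss whose margin grows with the class difference degree.

    Assumed rule (not taken from the paper): margin = base_margin * degree.
    """
    margin = base_margin * difference_degree(anchor_label, negative_label)
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical product labels: (category, group, subgroup).
milk = ("food", "dairy", "milk")
kefir = ("food", "dairy", "kefir")
hammer = ("tools", "hand", "hammer")

anchor, positive, negative = (0.0, 0.0), (0.0, 1.0), (1.0, 0.0)
loss_close = flexible_margin_triplet_loss(anchor, positive, negative, milk, kefir)
loss_far = flexible_margin_triplet_loss(anchor, positive, negative, milk, hammer)
```

With the same three embeddings, the negative from a different category is penalized more than the negative that differs only at the subgroup level, so under this assumed rule training pushes unrelated classes further apart than sibling classes.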

Petrova Iana Sergeevna

Bauman Moscow State Technical University

Moscow, Russian Federation

Keywords: loss function, classification, computer vision, triplets, labels, vector space

For citation: Petrova I.S. Method of training image classifiers using additional labels. Modeling, Optimization and Information Technology. 2025;13(2). (In Russ.). URL: https://moitvivt.ru/ru/journal/pdf?id=1928 DOI: 10.26102/2310-6018/2025.49.2.041

Received 27.04.2025

Revised 25.05.2025

Accepted 07.06.2025

Published 30.06.2025