Keywords: deep learning, transformer, object detection, recognition of weather conditions on video, filtering of weather conditions, filtering of noise in the image, neural networks, technological operations
Features of applying deep learning methods to detect small objects in video in rainy conditions
UDC 004.896
DOI: 10.26102/2310-6018/2024.46.3.019
This paper discusses methods for detecting small objects in video when recognizing manual labor operations that are performed outdoors and affected by weather conditions. It considers approaches to improving the accuracy of detecting such objects in adverse weather, such as rain, and explores a two-stage approach. In the first stage, computer vision and deep learning methods, such as convolutional neural networks, are used to identify and classify weather conditions in video. In the second stage, when adverse weather conditions are detected, various deep learning methods for filtering weather effects out of the video are studied. The main focus is on assessing how different filtering methods affect the accuracy of small object detection. The paper considers the applicability of this approach to detecting small tools in video data when recognizing manual labor operations performed during railway track repair and maintenance. The results can be useful for studying labor processes that take place outdoors and for algorithms that recognize manual labor operations in video data.
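The two-stage approach described above can be illustrated with a minimal sketch. This is not the authors' implementation: the names `TwoStagePipeline`, `classify_weather`, `derain`, `detect`, and the `ADVERSE` label set are hypothetical, and the three callables stand in for whatever weather classifier, deraining model, and object detector are actually used.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-ins: a frame as a 2D list of pixel intensities,
# a detection as an (x1, y1, x2, y2) bounding box.
Frame = List[List[float]]
Box = Tuple[int, int, int, int]

# Assumed set of weather labels that trigger filtering.
ADVERSE = frozenset({"rain"})


@dataclass
class TwoStagePipeline:
    """Stage 1 classifies the weather; stage 2 filters only adverse frames."""

    classify_weather: Callable[[Frame], str]  # e.g. a CNN classifier (assumed)
    derain: Callable[[Frame], Frame]          # e.g. a deraining network (assumed)
    detect: Callable[[Frame], List[Box]]      # small-object detector (assumed)

    def process(self, frame: Frame) -> Tuple[str, bool, List[Box]]:
        # Stage 1: recognize the weather condition in the frame.
        label = self.classify_weather(frame)
        # Stage 2: apply the weather filter only when conditions are adverse.
        filtered = label in ADVERSE
        if filtered:
            frame = self.derain(frame)
        # Run detection on the (possibly filtered) frame.
        return label, filtered, self.detect(frame)
```

Keeping the filter behind the weather check means clear-weather frames reach the detector unchanged, so any effect of a given filtering method on detection accuracy can be measured on adverse frames alone by swapping the `derain` callable.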
For citation: Shtekhin S.E., Stadnik A.V. Features of applying deep learning methods to detect small objects in video in rainy conditions. Modeling, Optimization and Information Technology. 2024;12(3). URL: https://moitvivt.ru/ru/journal/pdf?id=1640 DOI: 10.26102/2310-6018/2024.46.3.019 (In Russ.).
Received 07.08.2024
Revised 19.08.2024
Accepted 28.08.2024
Published 30.09.2024