JOINT OPTIMIZATION FOR OBJECT DETECTION IN FOGGY WEATHER CONDITIONS

Trung-Hieu Le; Quoc-Viet Hoang; Minh-Quy Nguyen

Trung-Hieu Le, National Taipei University of Technology (NTUT), Taiwan
Quoc-Viet Hoang, Hung Yen University of Technology and Education
Minh-Quy Nguyen, Hung Yen University of Technology and Education

Keywords: Object detection, DFO-Net, CNN, foggy images, defogging subnet, detection subnet

Abstract

Object detection using deep convolutional neural networks (CNNs) has been widely studied and has achieved impressive results in recent years. However, object detection in the presence of fog is far from solved because of poor visibility. In this paper, a novel CNN-based object detection model, named DFO-Net, is introduced to address the problem of detecting objects in foggy weather conditions. DFO-Net comprises two subnets: a defogging subnet and a detection subnet. The defogging subnet is responsible for producing clean features from foggy images and sharing them with the detection subnet. The detection subnet takes these features as input and performs object classification and object localization. DFO-Net is trained end-to-end to jointly optimize the visibility enhancement and object detection tasks. Experimental results on the FOD dataset indicate that the proposed DFO-Net obtains 48.85% mAP, outperforming many current state-of-the-art object detection methods.
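To make the idea of joint optimization concrete, the following is a minimal sketch of how two task losses might be combined into a single training objective. The abstract does not give the loss formulation, so the weighted-sum form, the use of mean squared error for the visibility-enhancement term, and every name here (`mse`, `joint_loss`, `restoration_weight`) are illustrative assumptions rather than the DFO-Net definition.

```python
# Hypothetical sketch of a jointly optimized objective. The function names,
# the MSE restoration term, and the weighted-sum form are assumptions made
# for illustration; they are not taken from the paper.

def mse(pred, target):
    """Mean squared error between two equal-length sequences of numbers."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def joint_loss(restoration_loss, detection_loss, restoration_weight=1.0):
    """Weighted sum of the two task losses, to be minimized together."""
    return detection_loss + restoration_weight * restoration_loss


# Toy values standing in for real network outputs.
restored_pixels = [0.5, 0.25, 1.0]   # output of a defogging branch (assumed)
clean_pixels = [0.5, 0.75, 1.0]      # fog-free reference (assumed)
restoration = mse(restored_pixels, clean_pixels)  # visibility-enhancement term
detection = 2.0  # placeholder for a classification + localization loss
total = joint_loss(restoration, detection, restoration_weight=0.5)
```

In an end-to-end setup, a single backward pass through `total` would update both subnets, so the shared features are shaped by the detection error as well as by the restoration error. The weight balancing the two terms is a hyperparameter in this sketch.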
