Object Detection in Densely Packed Scenes via Semi-Supervised Learning with Dual Consistency

Object Detection in Densely Packed Scenes via Semi-Supervised Learning with Dual Consistency

Chao Ye, Huaidong Zhang, Xuemiao Xu, Weiwei Cai, Jing Qin, Kup-Sze Choi

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence
Main Track. Pages 1245-1251. https://doi.org/10.24963/ijcai.2021/172

Deep neural networks have been shown to be very powerful tools for object detection in various scenes. Their remarkable performance, however, heavily depends on the availability of a large number of high quality labeled data, which are time-consuming and costly to acquire for scenes with densely packed objects. We present a novel semi-supervised approach to addressing this problem, which is designed based on a common teacher-student model, integrated with a novel intersection-over-union (IoU) aware consistency loss and a new proposal consistency loss. The IoU-aware consistency loss evaluates the IoU over the prediction pairs of the teacher model and the student model, which enforces the prediction of the student model to approach closely to that of the teacher model. The IoU-aware consistency loss also reweights the importance of different prediction pairs to suppress the low-confident pairs. The proposal consistency loss ensures proposal consistency between the two models, making it possible to involve the region proposal network in the training process with unlabeled data. We also construct a new dataset, namely RebarDSC, containing 2,125 rebar images annotated with 350,348 bounding boxes in total (164.9 annotations per image average), to evaluate the proposed method. Extensive experiments are conducted over both the RebarDSC dataset and the famous large public dataset SKU-110K. Experimental results corroborate that the proposed method is able to improve the object detection performance in densely packed scenes, consistently outperforming state-of-the-art approaches. Dataset is available in https://github.com/Armin1337/RebarDSC.
Keywords:
Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation
Machine Learning: Deep Learning
Machine Learning: Semi-Supervised Learning