HCR-Net: A Hybrid of Classification and Regression Network for Object Pose Estimation

Zairan Wang, Weiming Li, Yueying Kao, Dongqing Zou, Qiang Wang, Minsu Ahn, Sunghoon Hong

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 1014-1020. https://doi.org/10.24963/ijcai.2018/141

Object pose estimation from a single image is a fundamental and challenging problem in computer vision and robotics. Current methods generally treat pose estimation as either a classification or a regression problem. However, regression-based methods usually suffer from imbalanced training data, while classification methods have difficulty discriminating between nearby poses. In this paper, we propose a hybrid CNN model, which we call HCR-Net, that integrates a classification network and a regression network to deal with these issues. Our model is inspired by the observation that regression methods achieve better accuracy on homogeneously distributed datasets, while classification methods are more effective for coarse quantization of the poses even if the dataset is not well balanced. The classification and regression methods thus essentially complement each other. We therefore integrate both into a single neural network in a hybrid fashion and train it end-to-end with two novel loss functions. As a result, our method surpasses the state-of-the-art methods, even with imbalanced training data and much less data augmentation. Experimental results on the challenging Pascal3D+ database demonstrate that our method significantly outperforms the state of the art, achieving improvements of up to 4% and 6% on the ACC and AVP metrics, respectively.
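
The complementary roles of coarse classification and fine regression described above can be illustrated with a generic bin-plus-residual encoding of a single angle. This is only a minimal sketch under stated assumptions, not the HCR-Net architecture or its loss functions, which the abstract does not specify. The bin count `NUM_BINS` and the helper names `encode_angle`, `decode_angle`, and `angular_error` are hypothetical and chosen for illustration.

```python
# Illustrative only: a generic coarse-bin + fine-residual angle encoding.
# NUM_BINS is a hypothetical value, not taken from the paper.
NUM_BINS = 24
BIN_WIDTH = 360.0 / NUM_BINS


def encode_angle(angle_deg):
    """Split an angle into a coarse bin index (a classification target)
    and a normalized residual inside that bin (a regression target)."""
    angle = angle_deg % 360.0
    if angle >= 360.0:  # guard against float rounding of tiny negative inputs
        angle = 0.0
    bin_index = int(angle // BIN_WIDTH)
    bin_center = (bin_index + 0.5) * BIN_WIDTH
    residual = (angle - bin_center) / BIN_WIDTH  # lies in [-0.5, 0.5)
    return bin_index, residual


def decode_angle(bin_index, residual):
    """Recombine a bin index and a residual into an angle in [0, 360)."""
    bin_center = (bin_index + 0.5) * BIN_WIDTH
    return (bin_center + residual * BIN_WIDTH) % 360.0


def angular_error(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)


if __name__ == "__main__":
    b, r = encode_angle(97.0)
    print(b, r, decode_angle(b, r))
```

In such a scheme, a classifier would only need to pick the right bin, which tolerates an unevenly populated pose distribution, while a regressor would refine the estimate within the bin, where the target range is small and bounded.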
Keywords:
Computer Vision: 2D and 3D Computer Vision
Computer Vision: Perception
Computer Vision: Computer Vision