High Resolution Feature Recovering for Accelerating Urban Scene Parsing

Rui Zhang, Sheng Tang, Luoqi Liu, Yongdong Zhang, Jintao Li, Shuicheng Yan

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence
Main track. Pages 1156-1162. https://doi.org/10.24963/ijcai.2018/161

Accuracy and speed are equally important in urban scene parsing. Most existing methods focus on improving parsing accuracy and ignore the low inference speed caused by large input images and high-resolution feature maps. To tackle this issue, we propose a High Resolution Feature Recovering (HRFR) framework to accelerate a given parsing network. A Super-Resolution Recovering module recovers the features of large original-sized images from the features of down-sampled input. Our framework therefore combines the advantages of (1) the fast speed of networks with down-sampled input and (2) the high accuracy of networks with large original-sized input. Additionally, we employ auxiliary intermediate supervision and boundary region re-weighting to facilitate the optimization of the network. Extensive experiments on two challenging datasets, Cityscapes and CamVid, demonstrate the effectiveness of the proposed HRFR framework, which accelerates scene parsing inference by about 3.0x with 1/2 down-sampled input and negligible accuracy reduction.
Keywords:
Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation
Computer Vision: Computer Vision
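
Illustrative sketch (not from the paper): the data flow described in the abstract can be mimicked with a toy example in plain Python. Every name below (downsample_half, backbone, recover_high_res, boundary_weights) is hypothetical. Nearest-neighbour up-sampling stands in for the learned Super-Resolution Recovering module only to show the shapes involved. The boundary rule (a pixel counts as boundary if any 4-neighbour has a different label) and the weight value 2.0 are assumptions, since the abstract does not specify either.

```python
# Toy illustration only: nested lists stand in for tensors, and every
# function here is a hypothetical stand-in, not the paper's implementation.


def downsample_half(image):
    """Average each 2x2 block, halving height and width (1/2 down-sampling)."""
    h, w = len(image), len(image[0])
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def backbone(features):
    """Stand-in for the parsing network's feature extractor (identity copy)."""
    return [row[:] for row in features]


def recover_high_res(features):
    """Stand-in for the learned recovering module: 2x nearest-neighbour up-sampling."""
    out = []
    for row in features:
        wide = [value for value in row for _ in range(2)]  # repeat each column
        out.append(wide)
        out.append(wide[:])  # repeat each row
    return out


def boundary_weights(labels, boundary_weight=2.0):
    """Per-pixel loss weights: boundary_weight where a 4-neighbour label differs, else 1.0."""
    h, w = len(labels), len(labels[0])
    weights = [[1.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] != labels[y][x]:
                    weights[y][x] = boundary_weight
                    break
    return weights


# A 4x4 "image" is down-sampled to 2x2, processed, then recovered to 4x4.
image = [[float(4 * y + x) for x in range(4)] for y in range(4)]
small = downsample_half(image)
recovered = recover_high_res(backbone(small))

# Ratio of pixels the backbone would process at full size vs. half size.
pixel_ratio = (len(image) * len(image[0])) / (len(small) * len(small[0]))

# Two classes split by a vertical edge; pixels touching the edge get more weight.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
weights = boundary_weights(labels)
```

In this toy setting the backbone sees 4 times fewer pixels at 1/2 resolution. If cost were simply proportional to pixel count, that factor would be the most the down-sampled backbone alone could save; the abstract reports an overall speedup of about 3.0x.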