Theoretical Investigation of Generalization Bound for Residual Networks

Hao Chen, Zhanfeng Mo, Zhouwang Yang, Xiao Wang

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence
Main track. Pages 2081-2087. https://doi.org/10.24963/ijcai.2019/288

This paper presents a framework for norm-based capacity control with respect to an lp,q-norm in weight-normalized Residual Neural Networks (ResNets). We first formulate the representation of each residual block. For the regression problem, we analyze the Rademacher complexity of the ResNet family and establish a tighter generalization upper bound for weight-normalized ResNets in a more general setting. Using lp,q-norm weight normalization with 1/p + 1/q >= 1, we obtain a width-independent capacity control that depends on the depth only through a square-root term. Several comparisons suggest that our bound is tighter than previous results. Parallel results for Deep Neural Networks (DNNs) and Convolutional Neural Networks (CNNs) are obtained by introducing lp,q-norm weight normalization for DNNs and lp,q-norm kernel normalization for CNNs. Numerical experiments also verify that ResNet structures contribute to better generalization properties.
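As a brief illustrative sketch (not taken from the paper), the lp,q-norm mentioned above can be read, under the common convention, as the lq-norm of the row-wise lp-norms of a weight matrix; the snippet below assumes this convention and uses hypothetical helper names (lpq_norm, normalize_weight) to show what lp,q-norm weight normalization might look like in practice.

    import numpy as np

    def lpq_norm(W, p=2.0, q=1.0):
        # l_{p,q}-norm: l_p-norm of each row (incoming weights of one unit),
        # then the l_q-norm of the resulting vector. This follows a common
        # convention; the paper's exact definition may differ.
        row_norms = np.sum(np.abs(W) ** p, axis=1) ** (1.0 / p)
        return np.sum(row_norms ** q) ** (1.0 / q)

    def normalize_weight(W, p=2.0, q=1.0, c=1.0):
        # Weight normalization: rescale W so its l_{p,q}-norm is at most c.
        norm = lpq_norm(W, p, q)
        return W if norm <= c else W * (c / norm)

    # Example: p=2, q=1 gives 1/p + 1/q = 1.5 >= 1, the regime discussed above.
    W = np.random.randn(64, 128)
    W_hat = normalize_weight(W, p=2.0, q=1.0, c=1.0)
    print(lpq_norm(W_hat, 2.0, 1.0))  # <= 1.0 up to floating-point error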
Keywords:
Machine Learning: Deep Learning
Machine Learning: Probabilistic Machine Learning
Machine Learning Applications: Networks