Fine-grained Analysis of Stability and Generalization for Stochastic Bilevel Optimization
Xuelin Zhang, Hong Chen, Bin Gu, Tieliang Gong, Feng Zheng
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence
Main Track. Pages 5508-5516.
https://doi.org/10.24963/ijcai.2024/609
Stochastic bilevel optimization (SBO) has recently been integrated into many machine learning paradigms, including hyperparameter optimization, meta-learning, and reinforcement learning. Along with this wide range of applications, there have been abundant studies concerning the computational behavior of SBO. However, the generalization guarantees of SBO methods are far less understood from the lens of statistical learning theory. In this paper, we provide a systematic generalization analysis of first-order gradient-based bilevel optimization methods. Firstly, we establish quantitative connections between the on-average argument stability and the generalization gap of SBO methods. Then, we derive upper bounds on the on-average argument stability for single-timescale stochastic gradient descent (SGD) and two-timescale SGD, where three settings (nonconvex-nonconvex (NC-NC), convex-convex (C-C), and strongly-convex-strongly-convex (SC-SC)) are considered respectively. Experimental analysis validates our theoretical findings.
Compared with previous algorithmic stability analyses, our results do not require re-initialization of the inner-level parameters before each iteration and are suitable for more general objective functions.
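To make the setting concrete, below is a minimal sketch (not the authors' code) of first-order two-timescale SGD on a toy stochastic bilevel problem: the inner iterate is carried across iterations rather than re-initialized, matching the setting analyzed in the paper. The quadratic objectives, noise model, and step sizes are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))
A = A @ A.T / d + np.eye(d)  # make the inner objective strongly convex (SC setting)

def inner_grad(x, y, xi):
    # stochastic gradient of an assumed inner objective G(x, y; xi) = 0.5*y'Ay - (x + xi)'y
    return A @ y - (x + xi)

def outer_grad(x, y, zeta):
    # first-order stochastic gradient of an assumed outer objective F(x, y; zeta) = 0.5*||x - y - zeta||^2,
    # treating y as fixed (no second-order hypergradient), as in first-order bilevel methods
    return x - y - zeta

x, y = np.zeros(d), np.zeros(d)
alpha, beta = 1e-2, 1e-1  # two timescales: outer step alpha smaller than inner step beta
for t in range(2000):
    xi, zeta = 0.1 * rng.normal(size=d), 0.1 * rng.normal(size=d)
    y = y - beta * inner_grad(x, y, xi)      # inner update; y is NOT re-initialized each iteration
    x = x - alpha * outer_grad(x, y, zeta)   # outer update uses the current inner iterate

print("outer iterate:", x[:3], "inner iterate:", y[:3])
```

Setting `beta = alpha` in this sketch recovers the single-timescale variant discussed alongside the two-timescale one.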
Keywords:
Machine Learning: ML: Learning theory