Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

Haifang Li; Yingce Xia; Wensheng Zhang

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence

Main track. Pages 2390-2396. https://doi.org/10.24963/ijcai.2018/331

Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard in terms of both computational efficiency and approximation quality. We propose a new algorithm, LSTD(lambda)-RP, which combines random projections with eligibility traces to tackle these two challenges. We carry out a theoretical analysis of LSTD(lambda)-RP and provide meaningful upper bounds on the estimation error, approximation error, and total generalization error. These results demonstrate that LSTD(lambda)-RP benefits from both random projections and eligibility traces, and that it can achieve better performance than the prior LSTD-RP and LSTD(lambda) algorithms.
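To make the idea concrete, the following is a minimal sketch of how LSTD with eligibility traces might be run on randomly projected features. It is not taken from the paper: the function name `lstd_lambda_rp`, the Gaussian projection with entries drawn from N(0, 1/d), the single-trajectory input format, and the use of a pseudo-inverse to solve the linear system are all assumptions made for illustration.

```python
import numpy as np

def lstd_lambda_rp(features, rewards, gamma, lam, d, rng):
    """Illustrative LSTD(lambda) on randomly projected features.

    features: (n, D) array, one high-dimensional feature vector per visited state
    rewards:  (n - 1,) array, reward observed on each transition
    gamma:    discount factor; lam: trace-decay parameter lambda
    d:        target dimension of the random projection (assumed d << D)
    rng:      numpy.random.Generator used to draw the projection
    """
    n, D = features.shape
    # Assumption: Gaussian random projection with i.i.d. N(0, 1/d) entries.
    P = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))
    proj = features @ P.T                 # (n, d) low-dimensional features

    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)                       # eligibility trace
    for t in range(n - 1):
        z = gamma * lam * z + proj[t]     # decay old trace, add current feature
        A += np.outer(z, proj[t] - gamma * proj[t + 1])
        b += z * rewards[t]

    # Assumption: pseudo-inverse, so a singular A does not raise an error.
    theta = np.linalg.pinv(A) @ b
    return theta, P
```

Under these assumptions, the value estimate for a state with feature vector `phi` would be `theta @ (P @ phi)`. Setting `lam=0` reduces the trace to the current projected feature, which corresponds to plain LSTD on the projected features.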

Keywords: Machine Learning: Learning Theory; Machine Learning: Reinforcement Learning