Playing Repeated Network Interdiction Games with Semi-Bandit Feedback

Playing Repeated Network Interdiction Games with Semi-Bandit Feedback

Qingyu Guo, Bo An, Long Tran-Thanh

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence
Main track. Pages 3682-3690. https://doi.org/10.24963/ijcai.2017/515

We study repeated network interdiction games with no prior knowledge of the adversary and the environment, which can model many real world network security domains. Existing works often require plenty of available information for the defender and neglect the frequent interactions between both players, which are unrealistic and impractical, and thus, are not suitable for our settings. As such, we provide the first defender strategy, that enjoys nice theoretical and practical performance guarantees, by applying the adversarial online learning approach. In particular, we model the repeated network interdiction game with no prior knowledge as an online linear optimization problem, for which a novel and efficient online learning algorithm, SBGA, is proposed, which exploits the unique semi-bandit feedback in network security domains. We prove that SBGA achieves sublinear regret against adaptive adversary, compared with both the best fixed strategy in hindsight and a near optimal adaptive strategy. Extensive experiments also show that SBGA significantly outperforms existing approaches with fast convergence rate.
Keywords:
Multidisciplinary Topics and Applications: AI&Security and Privacy
Agent-based and Multi-agent Systems: Noncooperative Games