Robust Stochastic Multi-Armed Bandits with Historical Data

Sarah Boufelja Yacobi; Djallel Bouneffouf

doi:10.1145/3543873.3587653

Publication

WWW 2023

Conference paper

Abstract

We consider the problem of stochastic contextual multi-armed bandits (CMABs) initialised with historical data. Initialisation with historical data is a form of data-driven regularisation that should, in theory, accelerate the convergence of CMABs. In practice, however, we have little to no control over the process that generated such data, which may exhibit pathologies that impede the convergence and stability of the algorithm. In this paper, we focus on two main challenges: selection bias and data corruption. We propose two new algorithms to address these issues: LinUCB with historical data and offline balancing (OB-HLinUCB) and Robust LinUCB with corrupted historical data (R-HLinUCB). We derive their theoretical regret bounds and discuss their computational performance on real-world datasets.
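To make the idea of initialising a bandit with historical data concrete, the following is a minimal sketch of a plain disjoint LinUCB whose per-arm statistics are seeded from logged (arm, context, reward) triples before any online interaction. It is not the paper's OB-HLinUCB or R-HLinUCB: it performs no offline balancing and no corruption handling, and the class name `WarmStartLinUCB`, its methods, the parameters `alpha` and `ridge`, and the toy history are all assumptions introduced for illustration.

```python
import math


class WarmStartLinUCB:
    """Disjoint LinUCB whose per-arm statistics can be seeded with logged data.

    Illustrative sketch only; names and defaults are hypothetical.
    """

    def __init__(self, n_arms, dim, alpha=1.0, ridge=1.0):
        self.alpha = alpha  # exploration weight on the confidence width
        # Per arm: inverse of A = ridge * I + sum(x x^T), and b = sum(r * x).
        self.A_inv = [
            [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
             for i in range(dim)]
            for _ in range(n_arms)
        ]
        self.b = [[0.0] * dim for _ in range(n_arms)]

    @staticmethod
    def _matvec(M, v):
        return [sum(m * x for m, x in zip(row, v)) for row in M]

    def update(self, arm, x, reward):
        """Sherman-Morrison rank-one update of A^{-1}, then accumulate b."""
        A_inv = self.A_inv[arm]
        u = self._matvec(A_inv, x)  # A^{-1} x (A is symmetric)
        denom = 1.0 + sum(ui * xi for ui, xi in zip(u, x))
        for i in range(len(x)):
            for j in range(len(x)):
                A_inv[i][j] -= u[i] * u[j] / denom
            self.b[arm][i] += reward * x[i]

    def warm_start(self, history):
        """Replay logged (arm, context, reward) triples as ordinary updates."""
        for arm, x, reward in history:
            self.update(arm, x, reward)

    def ucb(self, arm, x):
        theta = self._matvec(self.A_inv[arm], self.b[arm])  # ridge estimate
        u = self._matvec(self.A_inv[arm], x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        width = math.sqrt(max(sum(ui * xi for ui, xi in zip(u, x)), 0.0))
        return mean + self.alpha * width

    def select(self, x):
        scores = [self.ucb(a, x) for a in range(len(self.b))]
        return max(range(len(scores)), key=scores.__getitem__)


# Toy logged data (made up): arm 1 paid off on context [1, 0], arm 0 did not.
history = [(0, [1.0, 0.0], 0.0)] * 20 + [(1, [1.0, 0.0], 1.0)] * 20
bandit = WarmStartLinUCB(n_arms=2, dim=2)
bandit.warm_start(history)
chosen_arm = bandit.select([1.0, 0.0])
```

Replaying the log shrinks each arm's confidence width before the first online round, which is the acceleration the abstract refers to. It also means that any selection bias or corruption in the log is absorbed directly into the estimates, which is the failure mode the two proposed algorithms are designed to address.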

Date

30 Apr 2023

Authors

IBM-affiliated at time of publication
