## The variance of Exp3

In an earlier post we analyzed an algorithm called Exp3 for $k$-armed adversarial bandits for which the expected regret is bounded by \begin{align*} R_n = \max_{a \in [k]} \E\left[\sum_{t=1}^n y_{tA_t} - y_{ta}\right] \leq \sqrt{2n k \log(k)}\,. \end{align*} The setting of …

## First order bounds for k-armed adversarial bandits

To revive the content on this blog a little, we have decided to highlight some of the new topics covered in the book that we are excited about and that were not previously covered on the blog. In this post …