We mentioned at some point that these notes would become a book. We’re happy to say this project is nearing completion. We hope to publish a reasonable-quality draft in the next month or so.
Of course, the book contains all the content of the blog in a polished and extended form. There are also many new chapters. Some highlights are combinatorial bandits, non-stationary bandits, ranking, Bayesian methods (including Thompson sampling) and pure exploration. We also have two chapters that peek beyond the world of bandits at partial monitoring and learning in Markov decision processes.
Once the draft is complete, we would love to have your feedback.