Bandits Dueling on Partially Ordered Sets

Abstract : We address the problem of dueling bandits defined on partially ordered sets, or posets. In this setting, arms may not be comparable, and there may be several (incomparable) optimal arms. We propose an algorithm, UnchainedBandits, that efficiently finds the set of optimal arms (the Pareto front) of any poset, under a minimal set of assumptions, even when pairs of comparable arms cannot be distinguished a priori from pairs of incomparable arms. This means that UnchainedBandits does not require information about comparability and can be used with limited knowledge of the poset. To achieve this, the algorithm relies on the concept of decoys, which stems from social psychology. We also provide theoretical guarantees on both the regret incurred and the number of comparisons required by UnchainedBandits, and we report compelling empirical results.
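
As an illustration of the notion of a Pareto front used in the abstract, the following minimal sketch computes the maximal elements of a small poset. It is not UnchainedBandits: it assumes the partial order is fully known and noise-free, whereas the paper's setting only provides noisy pairwise comparisons with unknown comparability. The function names `pareto_front` and `dominates`, and the coordinate-wise order on score vectors, are hypothetical choices made for this example only.

```python
from typing import Callable, Hashable, Iterable, List


def pareto_front(arms: Iterable[Hashable],
                 dominates: Callable[[Hashable, Hashable], bool]) -> List[Hashable]:
    """Return the arms that no other arm strictly dominates (maximal elements)."""
    arms = list(arms)
    return [a for a in arms
            if not any(dominates(b, a) for b in arms if b != a)]


# Hypothetical poset: arms are 2-D score vectors ordered coordinate-wise.
# Two vectors are incomparable when each one is better on some coordinate.
def dominates(b, a):
    return all(x >= y for x, y in zip(b, a)) and b != a


arms = [(3, 1), (1, 3), (2, 2), (1, 1), (2, 1)]
front = pareto_front(arms, dominates)
print(front)  # (1, 1) and (2, 1) are both dominated by (3, 1)
```

In this toy poset the three remaining arms are pairwise incomparable, which matches the abstract's remark that several incomparable optimal arms may coexist.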

https://hal.archives-ouvertes.fr/hal-01774844
Contributor : Liva Ralaivola
Submitted on : Wednesday, May 16, 2018 - 7:27:51 PM
Last modification on : Thursday, April 4, 2019 - 10:18:05 AM
Long-term archiving on : Tuesday, September 25, 2018 - 4:43:06 PM

File

6808-bandits-dueling-on-partia...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01774844, version 1

Citation

Julien Audiffren, Liva Ralaivola. Bandits Dueling on Partially Ordered Sets. Neural Information Processing Systems, NIPS 2017, Dec 2017, Long Beach, CA, United States. ⟨hal-01774844⟩
