Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states [PDF]
We initiate the study of tradeoffs between exploration and exploitation in online learning of properties of quantum states. Given sequential oracle access to an unknown quantum state, in each round, we are tasked to choose an observable from a set of ...
Josep Lumbreras +2 more
doaj +1 more source
Multi-armed bandits with dependent arms
We study a variant of the classical multi-armed bandit problem (MABP), which we call Multi-Armed Bandits with dependent arms. More specifically, multiple arms are grouped together to form a cluster, and the reward distributions of arms belonging to the same cluster are known functions of an unknown parameter that is a characteristic of the cluster ...
Rahul Singh +3 more
openaire +3 more sources
Multi-Armed Bandits and Quantum Channel Oracles [PDF]
Multi-armed bandits are one of the theoretical pillars of reinforcement learning. Recently, the investigation of quantum algorithms for multi-armed bandit problems was started, and it was found that a quadratic speed-up (in query complexity) is possible ...
Simon Buchholz +2 more
doaj +1 more source
Satisficing in Multi-Armed Bandit Problems [PDF]
To appear in IEEE Transactions on Automatic ...
Paul Reverdy +2 more
openaire +2 more sources
QoS-Aware Multi-armed Bandits [PDF]
Motivated by runtime verification of QoS requirements in self-adaptive and self-organizing systems that are able to reconfigure their structure and behavior in response to runtime data, we propose a QoS-aware variant of Thompson sampling for multi-armed bandits.
Lenz Belzner, Thomas Gabor
openaire +2 more sources
Budgeted Combinatorial Multi-Armed Bandits [PDF]
We consider a budgeted combinatorial multi-armed bandit setting where, in every round, the algorithm selects a super-arm consisting of one or more arms. The goal is to minimize the total expected regret after all rounds within a limited budget. Existing techniques in this literature either fix the budget per round or fix the number of arms pulled in ...
Debojit Das +2 more
openaire +2 more sources
Hedging using reinforcement learning: Contextual k-armed bandit versus Q-learning
The construction of replication strategies for contingent claims in the presence of risk and market friction is a key problem of financial engineering. In real markets, continuous replication, such as in the model of Black, Scholes and Merton (BSM), is ...
Loris Cannelli +3 more
doaj +1 more source
Multi-armed bandits for performance marketing
This paper deals with the problem of optimising bids and budgets of a set of digital advertising campaigns. We improve on the current state of the art by introducing support for multi-ad group marketing campaigns and developing a highly data-efficient parametric contextual bandit.
Gigli, M, Stella, F
openaire +1 more source
On Kernelized Multi-armed Bandits
We consider the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown. We provide two new Gaussian process-based algorithms for continuous bandit optimization: Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), and derive corresponding regret bounds. Specifically, ...
Sayak Ray Chowdhury, Aditya Gopalan
openaire +3 more sources
Correlation-Aware Collaborative Adaptive Window Algorithm for Multi-Armed Bandits [PDF]
The Multi-Armed Bandit (MAB) problem is central to reinforcement learning, where it addresses the trade-off between exploration and exploitation. However, traditional MAB algorithms often encounter difficulties in non-stationary environments with ...
Xu Yaoxin
doaj +1 more source

