Multi-armed bandits - Open Access .click


Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states [PDF]

Quantum, 2022
We initiate the study of tradeoffs between exploration and exploitation in online learning of properties of quantum states. Given sequential oracle access to an unknown quantum state, in each round, we are tasked to choose an observable from a set of ...
Josep Lumbreras, Erkka Haapasalo, Marco Tomamichel +2 more
doaj +1 more source

Multi-armed bandits with dependent arms

Machine Learning, 2023
We study a variant of the classical multi-armed bandit problem (MABP), which we call Multi-Armed Bandits with dependent arms. More specifically, multiple arms are grouped together to form a cluster, and the reward distributions of arms belonging to the same cluster are known functions of an unknown parameter that is a characteristic of the cluster ...
Rahul Singh +3 more
openaire +3 more sources

Multi-Armed Bandits and Quantum Channel Oracles [PDF]

Quantum
Multi-armed bandits are one of the theoretical pillars of reinforcement learning. Recently, the investigation of quantum algorithms for multi-armed bandit problems was started, and it was found that a quadratic speed-up (in query complexity) is possible ...
Simon Buchholz, Jonas M. Kübler, Bernhard Schölkopf +2 more
doaj +1 more source

Satisficing in Multi-Armed Bandit Problems [PDF]

IEEE Transactions on Automatic Control, 2017
To appear in IEEE Transactions on Automatic ...
Paul Reverdy +2 more
openaire +2 more sources

QoS-Aware Multi-armed Bandits [PDF]

2016 IEEE 1st International Workshops on Foundations and Applications of Self* Systems (FAS*W), 2016
Motivated by runtime verification of QoS requirements in self-adaptive and self-organizing systems that are able to reconfigure their structure and behavior in response to runtime data, we propose a QoS-aware variant of Thompson sampling for multi-armed bandits.
Lenz Belzner, Thomas Gabor
openaire +2 more sources
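
For readers unfamiliar with the Thompson sampling mentioned in the entry above, the following is a minimal sketch of the standard Beta-Bernoulli version, not the QoS-aware variant proposed in the paper. The function name `thompson_sampling`, the arm means, and the round count are illustrative assumptions that do not come from the listed work.

```python
import random


def thompson_sampling(true_means, rounds, seed=0):
    """Plain Beta-Bernoulli Thompson sampling (illustrative sketch).

    true_means: hypothetical success probability of each arm.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards of 1 per arm
    failures = [0] * k   # observed rewards of 0 per arm
    pulls = [0] * k

    for _ in range(rounds):
        # Draw one plausible mean per arm from its Beta(s + 1, f + 1) posterior.
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda i: samples[i])
        # Simulate a Bernoulli reward from the (assumed) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    # Hypothetical three-armed problem; the last arm is the best one.
    print(thompson_sampling([0.2, 0.5, 0.8], rounds=2000))
```

Sampling from the posterior, rather than always playing the arm with the highest empirical mean, keeps uncertain arms in play early on and concentrates pulls on the best arm as evidence accumulates.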

Budgeted Combinatorial Multi-Armed Bandits [PDF]

International Joint Conference on Autonomous Agents and Multiagent Systems, 2022
We consider a budgeted combinatorial multi-armed bandit setting where, in every round, the algorithm selects a super-arm consisting of one or more arms. The goal is to minimize the total expected regret after all rounds within a limited budget. Existing techniques in this literature either fix the budget per round or fix the number of arms pulled in ...
Debojit Das, Shweta Jain, Sujit Gujar +2 more
openaire +2 more sources

Hedging using reinforcement learning: Contextual k-armed bandit versus Q-learning

Journal of Finance and Data Science, 2023
The construction of replication strategies for contingent claims in the presence of risk and market friction is a key problem of financial engineering. In real markets, continuous replication, such as in the model of Black, Scholes and Merton (BSM), is ...
Loris Cannelli +3 more
doaj +1 more source

Multi-armed bandits for performance marketing

International Journal of Data Science and Analytics, 2023
This paper deals with the problem of optimising bids and budgets of a set of digital advertising campaigns. We improve on the current state of the art by introducing support for multi-ad group marketing campaigns and developing a highly data-efficient parametric contextual bandit.
Gigli, M, Stella, F
openaire +1 more source

On Kernelized Multi-armed Bandits

CoRR, 2017
We consider the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown. We provide two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), and derive corresponding regret bounds. Specifically, ...
Sayak Ray Chowdhury, Aditya Gopalan
openaire +3 more sources

Correlation-Aware Collaborative Adaptive Window Algorithm for Multi-Armed Bandits [PDF]

ITM Web of Conferences
The Multi-Armed Bandit (MAB) problem is central to reinforcement learning, where it addresses the trade-off between exploration and exploitation. However, traditional MAB algorithms often encounter difficulties in non-stationary environments with ...
Xu Yaoxin
doaj +1 more source
