Some of the following articles may not be open access.
On a restless multi-armed bandit problem with non-identical arms
2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2011
We consider the following learning problem motivated by opportunistic spectrum access in cognitive radio networks. There are N independent Gilbert-Elliott channels with possibly non-identical transition matrices. It is desired to have an online policy to maximize the long-term expected discounted reward from accessing one channel at each time ...
Naumaan Nayyar +2 more
openaire
Slow fading channel selection: A restless multi-armed bandit formulation
2012 International Symposium on Wireless Communication Systems (ISWCS), 2012
We deal with a multi-access wireless network in which transmitters dynamically select a frequency band to communicate on. The slow fading channel attenuations follow an autoregressive model. In the single user case, we formulate this selection problem as a restless multi-armed bandit problem and we propose two strategies to dynamically select a band at
Avrachenkov, Konstantin +2 more
openaire
Logarithmic weak regret of non-Bayesian restless multi-armed bandit
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
We consider the restless multi-armed bandit (RMAB) problem with unknown dynamics. At each time, a player chooses K out of N (N > K) arms to play. The state of each arm determines the reward when the arm is played and transitions according to Markovian rules regardless of whether the arm is engaged or passive.
Haoyang Liu, Keqin Liu, Qing Zhao
openaire
Towards Zero Shot Learning in Restless Multi-armed Bandits
International Joint Conference on Autonomous Agents and Multiagent Systems
Restless multi-arm bandits (RMABs), a class of resource allocation problems with broad application in areas such as healthcare, online advertising, and anti-poaching, have recently been studied from a multi-agent reinforcement learning perspective. Prior RMAB research suffers from several limitations, e.g., it fails to adequately address continuous ...
Yunfan Zhao +7 more
openaire
Learning in Restless Multi-Armed Bandits using Adaptive Arm Sequencing Rules
2018 IEEE International Symposium on Information Theory (ISIT), 2018
We consider a class of restless multi-armed bandit (RMAB) problems with unknown arm dynamics. At each time, a player chooses an arm out of $N$ arms to play, referred to as an active arm, and receives a random reward from a finite set of reward states. The reward state of the active arm transits according to an unknown Markovian dynamic.
Tomer Gafni, Kobi Cohen
openaire
Time-Constrained Restless Multi-Armed Bandits with Applications to City Service Scheduling
International Joint Conference on Autonomous Agents and Multiagent Systems
Municipalities maintain critical infrastructure through inspections, both proactive and in response to complaints. For example, the Chicago Department of Public Health (CDPH) periodically inspects 7000 food establishments to maintain the safety of food bought, sold, or prepared for public consumption. Restless multi-armed bandits (RMABs) appear to be a
Yi Mao, Andrew Perrault
openaire
Restless Multi-Armed Bandit in Opportunistic Scheduling
2021
Kehao Wang, Lin Chen
openaire
Application of multi-armed bandits to dose-finding clinical designs
Artificial Intelligence in Medicine, 2023
Masahiro Kojima
exaly
The Perils of Misspecified Priors and Optional Stopping in Multi-Armed Bandits
Frontiers in Artificial Intelligence, 2021
Markus Loecher
exaly
MAB-OS: Multi-Armed Bandits Metaheuristic Optimizer Selection
Applied Soft Computing Journal, 2022
Kazem Meidani +2 more
exaly

