Restless bandits - Open Access .click

When are Kalman-filter restless bandits indexable?

2015
To appear in NIPS ...
Christopher R. Dance, Tomi Silander

Restless bandits, partial conservation laws and indexability [PDF]

Advances in Applied Probability, 2000
We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both ...

Global Rewards in Restless Multi-Armed Bandits

Advances in Neural Information Processing Systems 37
Restless multi-armed bandits (RMAB) extend multi-armed bandits so pulling an arm impacts future states. Despite the success of RMABs, a key limiting assumption is the separability of rewards into a sum across arms. We address this deficiency by proposing restless-multi-armed bandit with global rewards (RMAB-G), a generalization of RMABs to global non ...
Naveen Raman, Zheyuan Ryan Shi, Fei Fang +2 more

Characterization and Computation of Restless Bandit Marginal Productivity Indices [PDF]

Proceedings of the 2nd International ICST Conference on Performance Evaluation Methodologies and Tools, 2007
The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as restless bandits, i.e.

Multi-Armed Bandits in Brain-Computer Interfaces. [PDF]

Front Hum Neurosci, 2022
Heskebeck F, Bergeling C, Bernhardsson B. +2 more

Faster Q-Learning Algorithms for Restless Bandits

2024 IEEE 8th International Conference on Information and Communication Technology (CICT)
We study the Whittle index learning algorithm for restless multi-armed bandits (RMAB). We first present the Q-learning algorithm and its variants: speedy Q-learning (SQL), generalized speedy Q-learning (GSQL), and phase Q-learning (PhaseQL). We also discuss exploration policies: $\epsilon$-greedy and upper confidence bound (UCB).
Parvish Kakarapalli +2 more

Relay Selection in Wireless Networks as Restless Bandits

IEEE Networking Letters
We consider a wireless network in which a source node needs to transmit a large file to a destination node. The direct wireless link between the source and the destination is assumed to be blocked. Multiple candidate relays are available to forward packets from the source to the destination.
Mandar R. Nalavade, Ravindra S. Tomar, Gaurav S. Kasbekar +2 more

Signal detection models as contextual bandits. [PDF]

R Soc Open Sci, 2023
Sherratt TN, O'Neill E.

Reliability of Decision-Making and Reinforcement Learning Computational Parameters. [PDF]

Comput Psychiatr, 2023
Mkrtchian A, Valton V, Roiser JP.

A Modularized Framework for Piecewise-Stationary Restless Bandits

CoRR
arXiv admin note: text overlap with arXiv:2410 ...
Kuan-Ta Li +3 more

fos: computer and information sciences
machine learning cs.lg
computer science - machine learning

fos: mathematics
markov and semi-markov decision processes
whittle index

index policies
optimization and control math.oc
indexability