When are Kalman-filter restless bandits indexable?
To appear in NIPS ...
Christopher R. Dance, Tomi Silander
Restless bandits, partial conservation laws and indexability
We show that if performance measures in a general stochastic scheduling problem satisfy partial conservation laws (PCL), which extend the generalized conservation laws (GCL) introduced by Bertsimas and Niño-Mora (1996), then the problem is solved optimally by a priority-index policy under a range of admissible linear performance objectives, with both ...
Global Rewards in Restless Multi-Armed Bandits
Restless multi-armed bandits (RMAB) extend multi-armed bandits so pulling an arm impacts future states. Despite the success of RMABs, a key limiting assumption is the separability of rewards into a sum across arms. We address this deficiency by proposing restless-multi-armed bandit with global rewards (RMAB-G), a generalization of RMABs to global non ...
Naveen Raman +2 more
Characterization and Computation of Restless Bandit Marginal Productivity Indices
The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the optimal dynamic priority allocation to multiple stochastic projects, modeled as restless bandits, i.e.
Multi-Armed Bandits in Brain-Computer Interfaces.
Heskebeck F +2 more
Faster Q-Learning Algorithms for Restless Bandits
We study the Whittle index learning algorithm for restless multi-armed bandits (RMAB). We first present the Q-learning algorithm and its variants: speedy Q-learning (SQL), generalized speedy Q-learning (GSQL), and phase Q-learning (PhaseQL). We also discuss exploration policies: $ε$-greedy and upper confidence bound (UCB).
Parvish Kakarapalli +2 more
Relay Selection in Wireless Networks as Restless Bandits
We consider a wireless network in which a source node needs to transmit a large file to a destination node. The direct wireless link between the source and the destination is assumed to be blocked. Multiple candidate relays are available to forward packets from the source to the destination.
Mandar R. Nalavade +2 more
Signal detection models as contextual bandits.
Sherratt TN, O'Neill E.
Reliability of Decision-Making and Reinforcement Learning Computational Parameters.
Mkrtchian A, Valton V, Roiser JP.
A Modularized Framework for Piecewise-Stationary Restless Bandits
arXiv admin note: text overlap with arXiv:2410 ...
Kuan-Ta Li +3 more

