A sensing policy based on confidence bounds and a restless multi-armed bandit model [PDF]
In Proceedings of the 46th Asilomar Conference ...
Jan Oksanen +2 more
openaire +3 more sources
The non-Bayesian restless multi-armed bandit: A case of near-logarithmic regret [PDF]
In the classic Bayesian restless multi-armed bandit (RMAB) problem, there are $N$ arms, with rewards on all arms evolving at each time as Markov chains with known parameters. A player seeks to activate $K \geq 1$ arms at each time in order to maximize the expected total reward obtained over multiple plays. RMAB is a challenging problem that is known to
Wenhan Dai +3 more
openaire +2 more sources
On the Whittle Index for Restless Multiarmed Hidden Markov Bandits [PDF]
We consider a restless multi-armed bandit in which each arm can be in one of two states. When an arm is sampled, the state of the arm is not available to the sampler. Instead, a binary signal with a known randomness that depends on the state of the arm is available. No signal is available if the arm is not sampled.
Rahul Meshram +2 more
openaire +5 more sources
Dynamic resource allocation in a multi-product make-to-stock production system [PDF]
We consider optimal policies for a production facility in which several (K) products are made to stock in order to satisfy exogenous demand for each. The single machine version of this problem in which the facility manufactures at most one product at a ...
K. D. Glazebrook +3 more
core +1 more source
Equitable Restless Multi-Armed Bandits: A General Framework Inspired By Digital Health
Restless multi-armed bandits (RMABs) are a popular framework for algorithmic decision making in sequential settings with limited resources. RMABs are increasingly being used for sensitive decisions such as in public health, treatment scheduling, anti-poaching, and -- the motivation for this work -- digital health.
Jackson A. Killian +5 more
openaire +2 more sources
Fairness of Exposure in Online Restless Multi-armed Bandits
Restless multi-armed bandits (RMABs) generalize multi-armed bandits: each arm exhibits Markovian behavior and transitions according to its own transition dynamics. Solutions to RMABs exist for both offline and online cases. However, they do not consider the distribution of pulls among the arms.
Archit Sood +2 more
openaire +3 more sources
An Asymptotically Optimal Heuristic for General Non-Stationary Finite-Horizon Restless Multi-Armed Multi-Action Bandits [PDF]
We propose an asymptotically optimal heuristic, which we term Randomized Assignment Control (RAC), for restless multi-armed bandit problems with discrete time and finite states.
Zayas-Caban, Gabriel +5 more
core +1 more source
This paper designs and evaluates novel centralized sampling and scheduling policies for minimizing the average age of incorrect information (AoII) for a multi-hop network with interference constraints.
Nibin Raj, Vineeth Bala Sukumaran
doaj +1 more source
An index heuristic for transshipment decisions in multi-location inventory systems based on a pairwise decomposition [PDF]
In multi-location inventory systems, transshipments are often used to improve customer service and reduce cost. Determining optimal transshipment policies for such systems involves a complex optimisation problem that is only tractable for systems with ...
Archibald, T. +5 more
core +1 more source
Optimistic Whittle Index Policy: Online Learning for Restless Bandits
Restless multi-armed bandits (RMABs) extend multi-armed bandits to allow for stateful arms, where the state of each arm evolves restlessly with different transitions depending on whether that arm is pulled.
Taneja, Aparna +3 more
core +1 more source

