On the asymptotic optimality of greedy index heuristics for multi-action restless bandits [PDF]
The class of restless bandits proposed by Whittle (1988) has long been known to be intractable. This paper presents an optimality result which extends that of Weber and Weiss (1990) for restless bandits to a more general setting in which individual ...
Glazebrook, Kevin +2 more
core +1 more source
Two-stage index computation for bandits with switching penalties II: switching delays [PDF]
This paper addresses the multi-armed bandit problem with switching penalties including both costs and delays, extending results of the companion paper [J. Niño-Mora.
José Niño-Mora
core
Characterization and computation of restless bandit marginal productivity indices [PDF]
The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the ...
José Niño-Mora
core
The problem of rested and restless multi-armed bandits with constrained availability (RMAB-CA) of arms is considered. The states of arms evolve in Markovian manner and the exact states are hidden from the decision maker. First, some structural results on
DESAI, UB +4 more
core +1 more source
Stochastic Rising Bandits [PDF]
This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential selection techniques able to learn online using only the feedback given by the chosen option (a.k.a. arm).
Francesco Trovò +3 more
core
A Hidden Markov Restless Multi-armed Bandit Model for Playout Recommendation Systems [PDF]
We consider a restless multi-armed bandit (RMAB) in which there are two types of arms, say A and B. Each arm can be in one of two states, say $0$ or $1$. Playing a type A arm brings it to state $0$ with probability one, and not playing it induces state transitions with arm-dependent probabilities.
Rahul Meshram +2 more
openaire +2 more sources
Two-stage index computation for bandits with switching penalties I: switching costs [PDF]
This paper addresses the multi-armed bandit problem with switching costs. Asawa and Teneketzis (1996) introduced an index that partly characterizes optimal policies, attaching to each bandit state a "continuation index" (its Gittins index) and a ...
José Niño-Mora
core
Online Restless Multi-Armed Bandits with Long-Term Fairness Constraints
Restless multi-armed bandits (RMAB) have been widely used to model sequential decision-making problems with constraints. The decision maker (DM) aims to maximize the expected total reward over an infinite horizon under an “instantaneous activation constraint” that at most B arms can be activated at any decision epoch, where the state of each arm ...
Shufan Wang, Guojun Xiong, Jian Li
openaire +2 more sources
The multi-armed bandit problem and one of its most interesting extensions, the restless bandits problem, are frequently encountered in various stochastic control problems.
Jerome Le Ny +5 more
core +1 more source
Restless bandit marginal productivity indices II: multiproject case and scheduling a multiclass make-to-order/-stock M/G/1 queue [PDF]
This paper develops a framework based on convex optimization and economic ideas to formulate and solve approximately a rich class of dynamic and stochastic resource allocation problems, fitting in a generic discrete-state multi-project restless bandit ...
José Niño-Mora
core

