We introduce robustness in restless multi-armed bandits (RMABs), a popular model for constrained resource allocation among independent stochastic processes (arms).
Biswas, Arpita +3 more
core
Characterization and computation of restless bandit marginal productivity indices [PDF]
The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the ...
Niño-Mora, José +2 more
core +1 more source
Multi-Armed Bandits in Brain-Computer Interfaces. [PDF]
Heskebeck F +2 more
europepmc +1 more source
Optimal Control of Fluid Restless Multi-armed Bandits: A Machine Learning Approach
We present a novel machine learning framework for the optimal control of fluid restless multi-armed bandit problems (FRMABPs) with state equations that are either affine or quadratic in the state variables. By establishing fundamental properties of FRMABPs, we develop an efficient numerical algorithm that generates a comprehensive training set by ...
Dimitris Bertsimas +2 more
openaire +2 more sources
Signal detection models as contextual bandits. [PDF]
Sherratt TN, O'Neill E.
europepmc +1 more source
Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits
Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving resource-constrained maximization problems. However, the formulation can be inappropriate for settings where the limiting constraint is a reward threshold rather than a budget.
R. Teal Witter, Lisa Hellerstein
openaire +2 more sources
Dynamic routing of customers with general delay costs [PDF]
We consider a network of parallel service stations each modelled as a single server queue. Each station serves its own dedicated customers as well as generic customers who are routed from a central controller.
Glazebrook, K D +7 more
core +1 more source
Reliability of Decision-Making and Reinforcement Learning Computational Parameters. [PDF]
Mkrtchian A, Valton V, Roiser JP.
europepmc +1 more source
RESTLESS BANDIT MARGINAL PRODUCTIVITY INDICES I: SINGLE-PROJECT CASE AND OPTIMAL CONTROL OF A MAKE-TO-STOCK M/G/1 QUEUE [PDF]
This paper develops a framework based on convex optimization and economic ideas to formulate and solve by an index policy the problem of optimal dynamic effort allocation to a generic discrete-state restless bandit (i.e. binary-action: work/rest) project,
José Niño-Mora
core
Attenuated Directed Exploration during Reinforcement Learning in Gambling Disorder. [PDF]
Wiehler A, Chakroun K, Peters J.
europepmc +1 more source

