Restless multi-armed bandits - Open Access .click


Restless and Uncertain: Robust Policies for Restless Bandits via Deep Multi-Agent Reinforcement Learning

2022
We introduce robustness in restless multi-armed bandits (RMABs), a popular model for constrained resource allocation among independent stochastic processes (arms).
Biswas, Arpita +3 more
core

Characterization and computation of restless bandit marginal productivity indices [PDF]

2007
The Whittle index [P. Whittle (1988). Restless bandits: Activity allocation in a changing world. J. Appl. Probab. 25A, 287-298] yields a practical scheduling rule for the versatile yet intractable multi-armed restless bandit problem, involving the ...
Niño-Mora, José +2 more
core +1 more source

Multi-Armed Bandits in Brain-Computer Interfaces. [PDF]

Front Hum Neurosci, 2022
Heskebeck F, Bergeling C, Bernhardsson B. +2 more
europepmc +1 more source

Optimal Control of Fluid Restless Multi-armed Bandits: A Machine Learning Approach

Machine Learning
We present a novel machine learning framework for the optimal control of fluid restless multi-armed bandit problems (FRMABPs) with state equations that are either affine or quadratic in the state variables. By establishing fundamental properties of FRMABPs, we develop an efficient numerical algorithm that generates a comprehensive training set by ...
Dimitris Bertsimas, Cheol Woo Kim, José Niño-Mora +2 more
openaire +2 more sources

Signal detection models as contextual bandits. [PDF]

R Soc Open Sci, 2023
Sherratt TN, O'Neill E.
europepmc +1 more source

Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits

CoRR
Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving resource-constrained maximization problems. However, the formulation can be inappropriate for settings where the limiting constraint is a reward threshold rather than a budget.
R. Teal Witter, Lisa Hellerstein
openaire +2 more sources

Dynamic routing of customers with general delay costs [PDF]

2009
We consider a network of parallel service stations each modelled as a single server queue. Each station serves its own dedicated customers as well as generic customers who are routed from a central controller.
Glazebrook, K D +7 more
core +1 more source

Reliability of Decision-Making and Reinforcement Learning Computational Parameters. [PDF]

Comput Psychiatr, 2023
Mkrtchian A, Valton V, Roiser JP.
europepmc +1 more source

RESTLESS BANDIT MARGINAL PRODUCTIVITY INDICES I: SINGLE-PROJECT CASE AND OPTIMAL CONTROL OF A MAKE-TO-STOCK M/G/1 QUEUE [PDF]

This paper develops a framework based on convex optimization and economic ideas to formulate and solve by an index policy the problem of optimal dynamic effort allocation to a generic discrete-state restless bandit (i.e. binary-action: work/rest) project,
José Niño-Mora
core

Attenuated Directed Exploration during Reinforcement Learning in Gambling Disorder. [PDF]

J Neurosci, 2021
Wiehler A, Chakroun K, Peters J.
europepmc +1 more source
