Gittins index - Open Access .click


A (2/3)n³ Fast-Pivoting Algorithm for the Gittins Index and Optimal Stopping of a Markov Chain

2007
This paper presents a new fast-pivoting algorithm that computes the n Gittins index values of an n-state bandit, in the discounted and undiscounted cases, by performing (2/3)n³ + O(n²) arithmetic operations, thus attaining better complexity than previous ...
José Niño-Mora
core

Marginal productivity index policies for scheduling a multiclass delay-/loss-sensitive queue [PDF]

2005
We address the problem of scheduling a multiclass M/M/1 queue with a finite dedicated buffer for each class. Some classes are delay-sensitive, modeling real-time traffic (e.g.
Niño-Mora, José, José Niño-Mora, Niño Mora, José +2 more
core

On the Behavior of Proposers in Ultimatum Games [PDF]

We demonstrate that one should not expect convergence of the proposals to the subgame perfect Nash equilibrium offer in standard ultimatum games. First, imposing strict experimental control of the behavior of the receiving players and focusing on the ...
Nicolaas J. Vriend, Thomas Brenner
core +2 more sources

Q-Learning for Bandit Problems

1995
Multi-armed bandits may be viewed as decompositionally-structured Markov decision processes (MDP's) with potentially very large state sets. A particularly elegant methodology for computing optimal policies was developed over twenty years ago by Gittins ...
Michael O. Duff
core +1 more source

The Learning Component of Dynamic Allocation Indices

1992
*This article is free to read on the publisher's website* For a multiarmed bandit problem with exponential discounting, the optimal allocation rule is defined by a dynamic allocation index defined for each arm on its space.
Gittins, J., Wang, Y-G., Wang, Y. G.
core +1 more source

On the role of Gittins index in singular stochastic control: semi-explicit solutions via the Wiener-Hopf factorisation

2014
This paper examines a class of singular stochastic control problems with convex objective functions. In Section 2, we use tools from convex analysis to derive necessary and sufficient first order conditions for this class of optimisation problems. The main result of this paper is Theorem 9 which uses results from optimal stopping to establish the link ...
openaire +2 more sources

MARGINAL PRODUCTIVITY INDEX POLICIES FOR SCHEDULING A MULTICLASS DELAY-/LOSS-SENSITIVE QUEUE [PDF]

We address the problem of scheduling a multiclass M/M/1 queue with a finite dedicated buffer for each class. Some classes are delay-sensitive, modeling real-time traffic (e.g.
Jose Niño-Mora
core

Stationary Multi Choice Bandit Problems [PDF]

This note shows that the optimal choice of k simultaneous experiments in a stationary multi-armed bandit problem can be characterized in terms of the Gittins index of each arm.
Dirk Bergemann, Juuso Välimäki
core

The performance of forwards induction policies

Following major theoretical advances in the study of multi-armed bandit problems, Gittins proposed a forwards induction (FI) approach to the development of policies for Markov decision processes (MDP's).
Gittins, J. C., Glazebrook, K. D.
core

Stochastic scheduling: A short history of index policies and some key recent developments

2011
A multi-armed bandit problem classically concerns N >= 2 populations of rewards whose statistical properties are unknown (or at least only partly known).
Minty, John +3 more
core

