Evaluation of performance: multi-armed bandit vs. contextual bandit [PDF]

open access: yes
Master of Science; Department of Computer Science; William H. Hsu. This work compares two methods, the multi-armed bandit (MAB) and contextual multi-armed bandit (CMAB), for action recommendation in a sequential decision-making domain.
Chatterjee, Ranojoy
core  

Data from: Risk-aware multi-armed bandit problem with application to portfolio selection

open access: yes, 2017
Sequential portfolio selection has attracted increasing interest in the machine learning and quantitative finance communities in recent years. As a mathematical framework for reinforcement learning policies, the stochastic multi-armed bandit problem ...
Huo, Xiaoguang, Fu, Feng
core   +1 more source

RESTLESS BANDIT MARGINAL PRODUCTIVITY INDICES II: MULTIPROJECT CASE AND SCHEDULING A MULTICLASS MAKE-TO-ORDER/-STOCK M/G/1 QUEUE [PDF]

open access: yes
This paper develops a framework based on convex optimization and economic ideas to formulate and solve approximately a rich class of dynamic and stochastic resource allocation problems, fitting in a generic discrete-state multi-project restless bandit ...
José Niño-Mora
core  

Human behavior in contextual multi-armed bandit problems [PDF]

open access: yes, 2015
In real-life decision environments people learn from their direct experience with alternative courses of action. Yet they can accelerate their learning by using functional knowledge about the features characterizing the alternatives. We designed a novel
Stojic, Hrvoje   +5 more
core  

Multi-Armed Bandit Networks: Exploring Online Learning with Networks [PDF]

open access: yes, 2018
Classical Multi-Armed Bandit solutions often assume independent arms as a simplification of the problem. This has shown great results in many different fields of practice, but could, in some cases, presumably leave untapped potential.
Hansen, Viktor
core  

Regret Lower Bounds in Multi-agent Multi-armed Bandit

open access: yes, 2023
The multi-armed bandit problem motivates methods with provable upper bounds on regret, and the corresponding lower bounds have also been extensively studied in this context.
Klabjan, Diego, Xu, Mengfan
core  
