Results 241 to 250 of about 33,931 (272)
Some of the following articles may not be open access.

Adversarial Multi-armed Bandit

2016
In this chapter, we consider the adversarial MAB problem, a variant of the MAB problem in which the stochastic assumption about the reward processes is removed. We first introduce the problem and define new notions of regret. We then describe a few well-known algorithms for this problem and provide the asymptotic performance results for these ...
Rong Zheng, Cunqing Hua
openaire   +1 more source

Stochastic Multi-armed Bandit

2016
In this chapter, we present the formulation, theoretical bounds, and algorithms for the stochastic MAB problem. Several important variants of stochastic MAB and their algorithms are also discussed, including multi-play MAB, MAB with switching costs, and pure-exploration MAB.
Rong Zheng, Cunqing Hua
openaire   +1 more source

UAV-Assisted Emergency Communications: An Extended Multi-Armed Bandit Perspective

IEEE Communications Letters, 2019
In this letter, we investigate how a rotary-wing unmanned-aerial vehicle (UAV) acts as a wireless base station to provide emergency communication service for a post-disaster area with unknown user distribution. The formulated optimization task is to find
Yu Lin, Tianyu Wang, Shaowei Wang
semanticscholar   +1 more source

Markov Multi-armed Bandit

2016
In many application domains, temporal changes in the reward distribution are modeled as a Markov chain. In this chapter, we present the formulation, theoretical bounds, and algorithms for the Markov MAB problem, where the rewards are characterized by unknown irreducible Markov processes.
Rong Zheng, Cunqing Hua
openaire   +1 more source

Multi-Armed Bandit Processes

2014
This chapter studies multi-armed bandit processes, a theoretically elegant and powerful tool for stochastic scheduling that maximizes expected total discounted rewards. Multi-armed bandit models form a particular type of optimal resource allocation problem, in which a number of machines or processors are allocated to serve a set of competing ...
Xiaoqiang Cai, Xianyi Wu, Xian Zhou
openaire   +1 more source

Push-Sum Distributed Online Optimization With Bandit Feedback

IEEE Transactions on Cybernetics, 2022
Cong Wang, Shengyuan Xu, Deming Yuan
exaly  

Dex-Net 1.0: A cloud-based network of 3D objects for robust grasp planning using a Multi-Armed Bandit model with correlated rewards

IEEE International Conference on Robotics and Automation, 2016
Jeffrey Mahler   +9 more
semanticscholar   +1 more source

Multi-Armed Bandit Problems

2008
Aditya Mahajan, Demosthenis Teneketzis
openaire   +1 more source
