Results 241 to 250 of about 33,931
Some of the following articles may not be open access.
Adversarial Multi-armed Bandit
2016
In this chapter, we consider the adversarial MAB problem, a variant of MAB problems in which the stochastic assumption about the reward processes is removed. We first introduce the problem and define new notions of regret. We then describe a few well-known algorithms for this problem and provide the asymptotic performance results for these ...
Rong Zheng, Cunqing Hua
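The snippet above does not name the algorithms the chapter covers; a standard choice for the adversarial setting is EXP3 (exponential weights with importance-weighted reward estimates). A minimal sketch, assuming rewards in [0, 1] and hypothetical `reward_fns` callables standing in for the adversary:

```python
import math
import random

def exp3(reward_fns, n_rounds, gamma=0.1):
    """EXP3 for the adversarial bandit: sample arms from an exponential-weights
    distribution mixed with uniform exploration; rewards assumed in [0, 1]."""
    k = len(reward_fns)
    w = [1.0] * k          # exponential weights, one per arm
    counts = [0] * k
    for _ in range(n_rounds):
        total = sum(w)
        probs = [(1 - gamma) * wi / total + gamma / k for wi in w]
        arm = random.choices(range(k), weights=probs)[0]
        x = reward_fns[arm]()                             # only the pulled arm's reward is seen
        w[arm] *= math.exp(gamma * (x / probs[arm]) / k)  # importance-weighted update
        s = sum(w)
        w = [wi / s for wi in w]                          # renormalise to avoid overflow
        counts[arm] += 1
    return counts
```

With two Bernoulli arms of very different means, the play counts concentrate on the better arm; the mixing parameter `gamma` trades exploration against exploitation.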
2016
In this chapter, we present the formulation, theoretical bound, and algorithms for the stochastic MAB problem. Several important variants of stochastic MAB and their algorithms are also discussed, including multi-play MAB, MAB with switching costs, and pure-exploration MAB.
Rong Zheng, Cunqing Hua
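For the stochastic setting described above, a classical algorithm is UCB1 (upper confidence bounds); the abstract does not name it, so treat this as an illustrative assumption. A minimal sketch with hypothetical `reward_fns` callables as the unknown reward distributions:

```python
import math
import random

def ucb1(reward_fns, n_rounds):
    """UCB1 for the stochastic bandit: after one pull of each arm, pull the arm
    maximising empirical mean + sqrt(2 ln t / n_i)."""
    k = len(reward_fns)
    counts = [0] * k
    means = [0.0] * k
    for t in range(1, n_rounds + 1):
        if t <= k:
            arm = t - 1        # initialisation: play each arm once
        else:
            arm = max(range(k),
                      key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))
        x = reward_fns[arm]()
        counts[arm] += 1
        means[arm] += (x - means[arm]) / counts[arm]   # incremental mean update
    return counts, means
```

The confidence radius shrinks as an arm is pulled more often, so suboptimal arms are eventually pulled only logarithmically often, matching the kind of theoretical bound the chapter refers to.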
UAV-Assisted Emergency Communications: An Extended Multi-Armed Bandit Perspective
IEEE Communications Letters, 2019
In this letter, we investigate how a rotary-wing unmanned aerial vehicle (UAV) acts as a wireless base station to provide emergency communication service for a post-disaster area with unknown user distribution. The formulated optimization task is to find
Yu Lin, Tianyu Wang, Shaowei Wang
2016
In many application domains, temporal changes in the reward distribution structure are modeled as a Markov chain. In this chapter, we present the formulation, theoretical bound, and algorithms for the Markov MAB problem, where the rewards are characterized by unknown irreducible Markov processes.
Rong Zheng, Cunqing Hua
2014
This chapter studies multi-armed bandit processes, a theoretically elegant and powerful tool for stochastic scheduling that maximizes expected total discounted rewards. Multi-armed bandit models form a particular class of optimal resource allocation problems, in which a number of machines or processors are to be allocated to serve a set of competing ...
Xiaoqiang Cai, Xianyi Wu, Xian Zhou
Push-Sum Distributed Online Optimization With Bandit Feedback
IEEE Transactions on Cybernetics, 2022
Cong Wang, Shengyuan Xu, Deming Yuan
IEEE International Conference on Robotics and Automation, 2016
Jeffrey Mahler +9 more

