Several Remarks on the Role of Certain Positional and Social Games in the Creation of the Selected Statistical and Economic Applications

open access: yes, Foundations of Management, 2016
Game theory was created on the basis of social as well as gambling games, such as chess, poker, baccarat, hex, or the one-armed bandit. These games lay solid foundations for analogous mathematical models (e.g., hex), artificial intelligence ...
Drabik Ewa
doaj   +1 more source

Be Greedy in Multi-Armed Bandits [PDF]

open access: green, 2021
Matthieu Jedor   +2 more
openalex   +1 more source

Stationary Multi Choice Bandit Problems [PDF]

open access: yes
This note shows that the optimal choice of k simultaneous experiments in a stationary multi-armed bandit problem can be characterized in terms of the Gittins index of each arm.
Dirk Bergemann, Juuso Välimäki
core  

A Comparative Study of UCB and Thompson Sampling with Structured Rewards: Parameter Sensitivity and Robustness [PDF]

open access: yes, ITM Web of Conferences
The behavior of multi-armed bandit (MAB) algorithms is closely tied to how their hyperparameters are set, but their stability in structured reward environments has not been examined in depth.
Chen Yutong
doaj   +1 more source

muMAB: A Multi-Armed Bandit Model for Wireless Network Selection

open access: yes, Algorithms, 2018
Multi-armed bandit (MAB) models are a viable approach to describe the problem of best wireless network selection by a multi-Radio Access Technology (multi-RAT) device, with the goal of maximizing the quality perceived by the final user. The classical MAB
Stefano Boldrini   +5 more
doaj   +1 more source

On Kernelized Multi-armed Bandits

open access: yes, 2017
We consider the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown. We provide two new Gaussian process-based algorithms for continuous bandit optimization, Improved GP-UCB (IGP-UCB) and GP-Thompson sampling (GP-TS), and derive corresponding regret bounds. Specifically,
Chowdhury, Sayak Ray, Gopalan, Aditya
openaire   +2 more sources
