Multi-armed bandits - Open Access .click


Multi-Armed Bandits and Their Classical Algorithms: Strengths and Limitations [PDF]

ITM Web of Conferences
Multi-armed bandits (MAB) play a significant role in varied domains, including online advertising, recommendation systems, clinical trials, medical decision-making, and even 5G. The model aims to
Wu Sichen
doaj +1 more source

COM-MABs: From Users' Feedback to Recommendation

Proceedings of the International Florida Artificial Intelligence Research Society Conference, 2022
Recently, the COMbinatorial Multi-Armed Bandits (COM-MAB) problem has arisen as an active research field. In systems interacting with humans, those reinforcement learning approaches use a feedback strategy as their reward function.
Alexandre Letard +3 more
doaj +1 more source

Multi-armed Bandits with Compensation

CoRR, 2018
We propose and study the known-compensation multi-arm bandit (KCMAB) problem, where a system controller offers a set of arms to many short-term players for $T$ steps. In each step, one short-term player arrives to the system. Upon arrival, the player aims to select an arm with the current best average reward and receives a stochastic reward associated ...
Siwei Wang 0002, Longbo Huang
openaire +3 more sources

Rested and Restless Bandits With Constrained Arms and Hidden States: Applications in Social Networks and 5G Networks

IEEE Access, 2018
The problem of rested and restless multi-armed bandits with constrained availability (RMAB-CA) of arms is considered. The states of the arms evolve in a Markovian manner, and the exact states are hidden from the decision maker. First, some structural results on
Varun Mehta +4 more
doaj +1 more source

Reinforcement Learning Based Beamforming Jammer for Unknown Wireless Networks

IEEE Access, 2020
A jamming attack refers to adversarial activities to cause an interruption of communication among legitimate nodes in a wireless network by transmitting a jamming signal.
Gyungmin Kim, Hyuk Lim
doaj +1 more source

Multi-armed bandits in metric spaces [PDF]

Proceedings of the fortieth annual ACM symposium on Theory of computing, 2008
In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of trials so as to maximize the total payoff of the chosen strategies. While the performance of bandit algorithms with a small finite strategy set is quite well understood, bandit problems with large strategy sets are still a topic of very active ...
Robert Kleinberg, Aleksandrs Slivkins, Eli Upfal +2 more
openaire +2 more sources
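
For orientation, the finite-strategy setting that the abstract above calls well understood can be sketched in a few lines. The code below is a minimal illustration of the standard UCB1 index policy on Bernoulli arms. It is not the metric-space algorithm from the paper, and the function name `ucb1`, the arm means, and the horizon are assumptions chosen only for this example.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms (illustrative sketch).

    arm_means: hypothetical success probabilities, one per arm.
    horizon:   number of trials; assumed to be at least len(arm_means).
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # cumulative reward observed per arm
    total = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus
            # exploration bonus sqrt(2 ln t / n_i).
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        # Simulated Bernoulli payoff for the chosen arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward

    return counts, total


if __name__ == "__main__":
    counts, total = ucb1([0.2, 0.5, 0.8], horizon=2000)
    print("pulls per arm:", counts)
    print("total payoff:", total)
```

The exploration bonus shrinks as an arm is pulled more often, so over time the trials concentrate on the arm with the highest empirical payoff. With a large or continuous strategy set, pulling every arm once is no longer feasible, which is the regime the paper addresses.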

Regional Multi-Armed Bandits

CoRR, 2018
We consider a variant of the classic multi-armed bandit problem where the expected reward of each arm is a function of an unknown parameter. The arms are divided into different groups, each of which has a common parameter. Therefore, when the player selects an arm at each time slot, information of other arms in the same group is also revealed.
Zhiyang Wang, Ruida Zhou, Cong Shen 0001
openaire +3 more sources

Best Agent Identification for General Game Playing

IEEE Access
We present an efficient and generalised procedure to accurately identify the best (or near best) performing algorithm for each sub-task in a multi-problem domain.
Matthew Stephenson +3 more
doaj +1 more source

Context-aware Multi-stakeholder Recommender Systems

Proceedings of the International Florida Artificial Intelligence Research Society Conference, 2022
Traditional recommender systems help users find the most relevant products or services to match their needs and preferences. However, they overlook the preferences of other sides of the market (aka stakeholders) involved in the system.
Tahereh Arabghalizi, Alexandros Labrinidis +1 more
doaj +1 more source

Multi-armed Bandits with Cost Subsidy

CoRR, 2020
In this paper, we consider a novel variant of the multi-armed bandit (MAB) problem, MAB with cost subsidy, which models many real-life applications where the learning agent has to pay to select an arm and is concerned about optimizing cumulative costs and rewards. We present two applications, intelligent SMS routing problem and ad audience optimization
Deeksha Sinha +3 more
openaire +3 more sources

computer science
mathematics
fos: computer and information sciences

artificial intelligence
mathematical optimization
machine learning cs.lg

computer science - machine learning
machine learning
regret