Finding structure in multi-armed bandits
How do humans search for rewards? This question is commonly studied using multi-armed bandit tasks, which require participants to trade off exploration and exploitation. Standard multi-armed bandits assume that each option has an independent reward distribution.
Schulz, Eric +2 more
We consider multipath TCP (MPTCP) flows over the data networking dynamics of IEEE 802.11ay for drone surveillance of areas using high-definition video streaming.
Shiva Raj Pokhrel, Michel Mandjes
Flickering Multi-Armed Bandits
We introduce Flickering Multi-Armed Bandits (FMAB) to model sequential decision-making in environments with changing action availability, where accessibility of the next action is restricted to a subset dependent on the agent's current choice. We formalize these constraints through stochastically evolving graphs where actions are limited to local ...
Sourav Chakraborty +3 more
Compositional generalization in multi-armed bandits
To what extent do human reward learning and decision-making rely on the ability to represent and generate richly structured relationships between options? We provide evidence that structure learning and the principle of compositionality play crucial roles in human reinforcement learning.
Saanum, Tankred +2 more
Preamble Selection Probability Optimization in RACH: A Multi-Armed Bandits Approach
The use of cellular networks for massive machine-type communications (mMTC) is an attractive solution due to the availability of existing infrastructure. However, the sheer number of user equipments (UEs) creates congestion and overloading challenges on
Ahmed O. Elmeligy +2 more
Multi-armed Bandit Learning on a Graph
The multi-armed bandit (MAB) problem is a simple yet powerful framework that has been extensively studied in the context of decision-making under uncertainty. In many real-world applications, such as robotics, selecting an arm corresponds to a physical action that constrains the choice of the next available arms (actions). Motivated by this,
Tianpeng Zhang +2 more
Making decisions is part and parcel of being human. Among a set of actions, we want to choose the one that has the highest reward. But the uncertainty of the outcome prevents us from always making the right decision. Making decisions under uncertainty can be studied in a principled way by the exploitation-exploration framework.
Gaussian Process with Vine Copula-Based Context Modeling for Contextual Multi-Armed Bandits
We propose a novel contextual multi-armed bandit (CMAB) framework that integrates copula-based context generation with Gaussian Process (GP) regression for reward modeling, addressing complex dependency structures and uncertainty in sequential decision ...
Jong-Min Kim
Fast Two-Stage Computation of an Index Policy for Multi-Armed Bandits with Setup Delays
We consider the multi-armed bandit problem with penalties for switching that include setup delays and costs, extending the former results of the author for the special case with no switching delays.
José Niño-Mora
Multi-Armed Bandits with Interference
Experimentation with interference poses a significant challenge in contemporary online platforms. Prior research on experimentation with interference has concentrated on the final output of a policy. The cumulative performance, while equally crucial, is less well understood. To address this gap, we introduce the problem of Multi-armed Bandits with
Su Jia, Peter I. Frazier, Nathan Kallus

