On Interpolating Experts and Multi-Armed Bandits
Learning with expert advice and multi-armed bandits are two classic online decision problems which differ in how the information is observed in each round of the game. We study a family of problems interpolating the two. For a vector $\mathbf{m}=(m_1,\dots,m_K)\in \mathbb{N}^K$, an instance of $\mathbf{m}$-MAB indicates that the arms are partitioned ...
Houshuang Chen +2 more
openaire +3 more sources
Combinatorial online learning based on optimizing feedbacks
Combinatorial online learning studies how to learn the unknown parameters and gradually find the optimal combination of targets during the interactions with the environment. This problem has a wide range of applications including advertisement placement ...
Fang KONG +3 more
doaj
Enhancing Dynamic Movie Recommendations With User Expectation Ratings in Contextual Bandit Models [PDF]
Modern movie recommendation systems face challenges such as dynamic personalization and real-time adaptability. Traditional methods like collaborative filtering and content-based recommendations struggle with dynamic user preferences and cold-start ...
Sun Weiye
doaj +1 more source
Maintaining frequency stability is a crucial but challenging task for the stable operation of a power system. The distributed energy storage (DES) can charge or discharge for both upward and downward frequency regulation, exploring and effectively using ...
Jianfeng Sun +5 more
doaj +1 more source
Skyline Identification in Multi-Arm Bandits
We introduce a variant of the classical PAC multi-armed bandit problem. There is an ordered set of $n$ arms $A[1],\dots,A[n]$, each with some stochastic reward drawn from some unknown bounded distribution. The goal is to identify the skyline of the set $A$, consisting of all arms $A[i]$ such that $A[i]$ has larger expected reward than all lower ...
Albert Cheu +2 more
openaire +2 more sources
Adaptive Noise Exploration for Neural Contextual Multi-Armed Bandits
In contextual multi-armed bandits, the relationship between contextual information and rewards is typically unknown, complicating the trade-off between exploration and exploitation.
Chi Wang, Lin Shi, Junru Luo
doaj +1 more source
Active Inference-Driven Multi-Armed Bandits: Superior Performance through Dynamic Correlation Adjustments [PDF]
In recent years, Multi-Armed Bandit (MAB) algorithms have gained substantial attention due to their effectiveness in real-world applications, such as recommendation systems, autonomous systems, and dynamic resource allocation. Traditional MAB algorithms,
Lin Xiaoqi
doaj +1 more source
Discrete Choice Multi-Armed Bandits
This paper establishes a connection between a category of discrete choice models and the realms of online learning and multiarmed bandit algorithms. Our contributions can be summarized in two key aspects. Firstly, we furnish sublinear regret bounds for a comprehensive family of algorithms, encompassing the Exp3 algorithm as a particular case. Secondly,
Emerson Melo, David Müller
openaire +2 more sources
Selective Reviews of Bandit Problems in AI via a Statistical View
Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment.
Pengjie Zhou, Haoyu Wei, Huiming Zhang
doaj +1 more source
One-Bit Feedback Exponential Learning for Beam Alignment in Mobile mmWave
Efficient beam alignment in wireless networks capable of supporting device mobility is currently one of the major challenges in mmWave communications. In this context, we formulate the beam-alignment problem via the adversarial multi-armed bandit (MAB ...
Irched Chafaa +2 more
doaj +1 more source

