Multi-armed bandits - Open Access .click


On Interpolating Experts and Multi-Armed Bandits

CoRR, 2023
Learning with expert advice and multi-armed bandit are two classic online decision problems which differ on how the information is observed in each round of the game. We study a family of problems interpolating the two. For a vector $\mathbf{m}=(m_1,\dots,m_K)\in \mathbb{N}^K$, an instance of $\mathbf{m}$-MAB indicates that the arms are partitioned ...
Houshuang Chen, Yuchen He, Chihao Zhang +2 more
openaire +3 more sources

Combinatorial online learning based on optimizing feedbacks

大数据, 2021
Combinatorial online learning studies how to learn the unknown parameters and gradually find the optimal combination of targets during the interactions with the environment. This problem has a wide range of applications including advertisement placement ...
Fang KONG, Yueran YANG, Wei CHEN, Shuai LI +3 more
doaj

Enhancing Dynamic Movie Recommendations With User Expectation Ratings in Contextual Bandit Models

ITM Web of Conferences
Modern movie recommendation systems face challenges such as dynamic personalization and real-time adaptability. Traditional methods like collaborative filtering and content-based recommendations struggle with dynamic user preferences and cold-start ...
Sun Weiye
doaj +1 more source

A dynamic distributed energy storage control strategy for providing primary frequency regulation using multi‐armed bandits method

IET Generation, Transmission & Distribution, 2022
Maintaining frequency stability is a crucial but challenging task for the stable operation of a power system. The distributed energy storage (DES) can charge or discharge for both upward and downward frequency regulation, exploring and effectively using ...
Jianfeng Sun +5 more
doaj +1 more source

Skyline Identification in Multi-Arm Bandits

2018 IEEE International Symposium on Information Theory (ISIT), 2018
We introduce a variant of the classical PAC multi-armed bandit problem. There is an ordered set of $n$ arms $A[1],\dots,A[n]$, each with some stochastic reward drawn from some unknown bounded distribution. The goal is to identify the skyline of the set $A$, consisting of all arms $A[i]$ such that $A[i]$ has larger expected reward than all lower ...
Albert Cheu, Ravi Sundaram, Jonathan R. Ullman +2 more
openaire +2 more sources

Adaptive Noise Exploration for Neural Contextual Multi-Armed Bandits

Algorithms
In contextual multi-armed bandits, the relationship between contextual information and rewards is typically unknown, complicating the trade-off between exploration and exploitation.
Chi Wang, Lin Shi, Junru Luo
doaj +1 more source

Active Inference-Driven Multi-Armed Bandits: Superior Performance through Dynamic Correlation Adjustments

ITM Web of Conferences
In recent years, Multi-Armed Bandit (MAB) algorithms have gained substantial attention due to their effectiveness in real-world applications, such as recommendation systems, autonomous systems, and dynamic resource allocation. Traditional MAB algorithms,
Lin Xiaoqi
doaj +1 more source

Discrete Choice Multi-Armed Bandits

CoRR, 2023
This paper establishes a connection between a category of discrete choice models and the realms of online learning and multiarmed bandit algorithms. Our contributions can be summarized in two key aspects. Firstly, we furnish sublinear regret bounds for a comprehensive family of algorithms, encompassing the Exp3 algorithm as a particular case. Secondly,
Emerson Melo, David Müller
openaire +2 more sources

Selective Reviews of Bandit Problems in AI via a Statistical View

Mathematics
Reinforcement Learning (RL) is a widely researched area in artificial intelligence that focuses on teaching agents decision-making through interactions with their environment.
Pengjie Zhou, Haoyu Wei, Huiming Zhang
doaj +1 more source

One-Bit Feedback Exponential Learning for Beam Alignment in Mobile mmWave

IEEE Access, 2020
Efficient beam alignment in wireless networks capable of supporting device mobility is currently one of the major challenges in mmWave communications. In this context, we formulate the beam-alignment problem via the adversarial multi-armed bandit (MAB ...
Irched Chafaa, E. Veronica Belmega, Merouane Debbah +2 more
doaj +1 more source

computer science
mathematics
fos: computer and information sciences

artificial intelligence
mathematical optimization
machine learning cs.lg

computer science - machine learning
machine learning
regret