Multi‐Agent Reinforcement Learning With Deep Networks for Diverse Q‐Vectors
This paper investigates multi‐agent reinforcement learning, where agents possess individual Q‐values, forming a Q‐vector. We introduce a deep Q‐networks algorithm that learns Q‐vectors using Max, Nash, and Maximin strategies. The proposed method is validated in a dual‐arm robotic environment.
Zhenglong Luo +3 more
wiley +1 more source
On the Nash Equilibria of a Simple Discounted Duel [PDF]
We formulate and study a two-player duel game as a nonzero-sum discounted stochastic game. Players P1 and P2 stand in place and, in each turn, one or both may shoot at the other player.
Athanasios Kehagias
doaj
This article proposes a novel algorithm to address the security issues in millimeter‐wave Internet‐of‐vehicles (mmWave‐IoV). The main idea is to provide a new solution to eliminate eavesdropping in dynamic mmWave‐IoV infrastructure. For this purpose, a secure multiagent cooperative communication algorithm based on deep deterministic policy gradient ...
Juan Zhang +5 more
wiley +1 more source
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
We derive an algorithm that achieves the optimal (within constants) pseudo-regret in both adversarial and stochastic multi-armed bandits without prior knowledge of the regime and time horizon.
Yevgeny Seldin, Julian Zimmert
core
Towards a Data‐Driven Digital Twin AI‐Based Architecture for Self‐Driving Vehicles
Recent advances in digital technologies, particularly artificial intelligence (AI), have resulted in remarkable transformations in the automobile industry. AI plays a key role in the development of autonomous vehicles. In this paper, the role of AI in autonomous vehicle (AV) platform ...
Parinaz Babaei +3 more
wiley +1 more source
Interpretable Intersection Control by Reinforcement Learning Agent With Linear Function Approximator
In this work, the use of the linear function approximator (FA) for a value‐based reinforcement learning (RL) agent in traffic signal control problems is investigated along with the least‐squares Q‐learning method, abbreviated as LSTDQ. The motivation for using the linear FA is the interpretability property of the controller, which is crucial for RL ...
Somporn Sahachaiseree, Takashi Oguchi
wiley +1 more source
Equilibrium in Two-Player Non-Zero-Sum Dynkin Games in Continuous Time [PDF]
We prove that every two-player non-zero-sum Dynkin game in continuous time admits an epsilon-equilibrium in randomized stopping times. We provide a condition that ensures the existence of an epsilon-equilibrium in non-randomized stopping ...
Rida Laraki, Eilon Solan
core
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
28 pages, 1 ...
Qiwei Di +5 more
openaire +2 more sources
An Effective Multi‐Agent Reinforcement Learning Algorithm for Urban Traffic Light Scheduling
This study presents a two‐step communication method between agents in the first and second groups that makes it possible for the agents in the same group to make a decision like playing a simultaneous game while letting agents in different groups make a decision like playing a sequential game.
Chun‐Wei Tsai +3 more
wiley +1 more source
IoT‐5G and B5G/6G resource allocation and network slicing orchestration using learning algorithms
In this article, the challenges related to the evolution of 5G and B5G/6G networks are examined more closely. The authors then primarily focus on machine learning solutions for resource allocation in 5G and B5G/6G networks. The requirements for dynamic network slicing orchestration are also analysed.
Ado Adamou Abba Ari +5 more
wiley +1 more source

