Results 91 to 100 of about 30,788
Trust Region Policy Optimization for POMDPs [PDF]
We propose Generalized Trust Region Policy Optimization (GTRPO), a policy gradient Reinforcement Learning (RL) algorithm for both Markov decision processes (MDP) and Partially Observable Markov Decision Processes (POMDP).
Anandkumar, Animashree +2 more
core
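The excerpt does not state the GTRPO update itself; as background, the trust-region step it generalizes maximizes a surrogate objective subject to a KL-divergence constraint on the policy change. A sketch of that standard formulation (not the paper's POMDP generalization), where in the partially observable case the policy conditions on observations or observation histories rather than states:

\[
\max_{\theta}\ \mathbb{E}_{s,a\sim\pi_{\theta_{\text{old}}}}\!\left[\frac{\pi_\theta(a\mid s)}{\pi_{\theta_{\text{old}}}(a\mid s)}\,A^{\pi_{\theta_{\text{old}}}}(s,a)\right]
\quad\text{subject to}\quad
\mathbb{E}_s\!\left[D_{\mathrm{KL}}\!\big(\pi_{\theta_{\text{old}}}(\cdot\mid s)\,\big\|\,\pi_\theta(\cdot\mid s)\big)\right]\le\delta .
\]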
Quadrotor unmanned aerial vehicle control is critical to maintain flight safety and efficiency, especially when facing external disturbances and model uncertainties. This article presents a robust reinforcement learning control scheme to deal with these challenges.
Yu Cai +3 more
wiley +1 more source
Compositional Shield Synthesis for Safe Reinforcement Learning in Partial Observability
Agents controlled by the output of reinforcement learning (RL) algorithms often transition to unsafe states, particularly in uncertain and partially observable environments.
Steven Carr +2 more
doaj +1 more source
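The excerpt above only states the safety problem; in shielded reinforcement learning generally, a shield sits between the learner and the environment and overrides actions that would violate the safety specification. A minimal sketch under that general reading (the names shield_allows and safe_fallback are hypothetical, not taken from the paper):

def shielded_step(env, agent, obs, shield_allows, safe_fallback):
    # The learner proposes an action from its current policy.
    action = agent.act(obs)
    # The shield checks the proposal against the safety specification
    # and substitutes a known-safe action when the proposal is unsafe.
    if not shield_allows(obs, action):
        action = safe_fallback(obs)
    return env.step(action)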
This study presents a multitask strategy for plastic cleanup with autonomous surface vehicles, combining exploration and cleaning phases. A two‐headed Deep Q‐Network shared by all agents is trained via multiobjective reinforcement learning, producing a Pareto front of trade‐offs.
Dame Seck +4 more
wiley +1 more source
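As a rough illustration of the two‐headed architecture mentioned in the excerpt, a shared encoder can feed one Q-value head per objective (here labelled exploration and cleaning); the layer sizes and head names are assumptions for the sketch, not details from the paper.

import torch
import torch.nn as nn

class TwoHeadDQN(nn.Module):
    # Shared torso with one Q-head per objective (illustrative sizes).
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.q_explore = nn.Linear(hidden, n_actions)  # exploration objective
        self.q_clean = nn.Linear(hidden, n_actions)    # cleaning objective

    def forward(self, obs):
        z = self.torso(obs)
        return self.q_explore(z), self.q_clean(z)

A multiobjective learner can then scalarize or trade off the two heads' Q-values to trace out a Pareto front of policies.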
Online algorithms for POMDPs with continuous state, action, and observation spaces
Online solvers for partially observable Markov decision processes have been applied to problems with large discrete state spaces, but continuous state, action, and observation spaces remain a challenge.
Kochenderfer, Mykel, Sunberg, Zachary
core +1 more source
Improving Training Result of Partially Observable Markov Decision Process by Filtering Beliefs [PDF]
Oscar LiJen Hsu
openalex +1 more source
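This entry carries no abstract excerpt; for orientation only, POMDP beliefs are usually maintained with a Bayes filter, and "filtering beliefs" can then mean pruning or reweighting that distribution before training on it. A plain discrete Bayes belief update (the array shapes are assumptions for the sketch, not the paper's method):

import numpy as np

def belief_update(belief, action, observation, T, O):
    # belief: probability vector over states
    # T[a][s, s']: P(s' | s, a);  O[a][s', o]: P(o | s', a)
    predicted = belief @ T[action]                     # prediction step
    corrected = predicted * O[action][:, observation]  # correction step
    return corrected / corrected.sum()                 # renormalize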
The polymerase chain reaction (PCR). Perturbation Theory and Machine Learning framework integrates perturbation theory and machine learning to classify genetic sequences, distinguishing ancient DNA from modern controls and predicting tree health from soil metagenomic data.
Jose L. Rodriguez +19 more
wiley +1 more source
Entropy-Regularized Partially Observed Markov Decision Processes
We investigate partially observed Markov decision processes (POMDPs) with cost functions regularized by entropy terms describing state, observation, and control uncertainty. Standard POMDP techniques are shown to offer bounded-error solutions to these entropy-regularized POMDPs, with exact solutions possible when the regularization involves the joint ...
Timothy L. Molloy, Girish N. Nair
openaire +2 more sources
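The excerpt does not give the regularized objective; one plausible reading, offered purely as an assumption and not as the paper's exact formulation, adds weighted entropy terms for state, observation, and control uncertainty to the expected cost:

\[
J(\pi) \;=\; \mathbb{E}^{\pi}\!\left[\sum_{t=0}^{T-1} c(x_t,u_t)\right]
\;+\; \alpha\,H(x_{0:T}) \;+\; \beta\,H(y_{0:T}) \;+\; \gamma\,H(u_{0:T}),
\]

with weights \(\alpha,\beta,\gamma\) controlling how strongly each source of uncertainty is penalized.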
Collaborative Multiagent Closed‐Loop Motion Planning for Multimanipulator Systems
This work presents a hierarchical multi‐manipulator planner, emphasizing highly overlapping workspaces. The proposed method leverages an enhanced Dynamic Movement Primitive‐based planner along with an improvised Multi‐Agent Reinforcement Learning approach to provide regulatory and mediatory control while ensuring low‐level autonomy. Experiments across varied
Tian Xu, Siddharth Singh, Qing Chang
wiley +1 more source
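The excerpt names an enhanced Dynamic Movement Primitive (DMP) planner without reproducing it; for reference, one common textbook form of a discrete DMP (not the paper's enhancement) couples a goal-directed spring-damper with a learned forcing term:

\[
\tau\dot{v} = K(g - x) - D v + (g - x_0)\,f(s), \qquad
\tau\dot{x} = v, \qquad
\tau\dot{s} = -\alpha s,
\]

where x is the position, g the goal, x_0 the start, s a phase variable driven by the canonical system, and f(s) the learned forcing term that shapes the motion.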
Deciding the Value 1 Problem for ♯-acyclic Partially Observable Markov Decision Processes
The value 1 problem is a natural decision problem in algorithmic game theory. For partially observable Markov decision processes with reachability objective, this problem is defined as follows: are there strategies that achieve the reachability objective with probability arbitrarily close to 1?
Gimbert, Hugo, Oualhadj, Youssouf
core +1 more source
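Stated in symbols, for a POMDP with target set T the value 1 problem asks whether

\[
\sup_{\sigma}\ \Pr^{\sigma}\big(\text{reach } T\big) \;=\; 1,
\]

i.e., whether for every \(\varepsilon > 0\) some observation-based strategy \(\sigma\) reaches the target with probability at least \(1-\varepsilon\).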

