Partially observable markov decision processes

Results 91 to 100 of about 30,788 (321)

Trust Region Policy Optimization for POMDPs [PDF]

, 2018
We propose Generalized Trust Region Policy Optimization (GTRPO), a policy gradient Reinforcement Learning (RL) algorithm for both Markov decision processes (MDP) and Partially Observable Markov Decision Processes (POMDP).
Anandkumar, Animashree +2 more
core

Robust Reinforcement Learning Control Framework for a Quadrotor Unmanned Aerial Vehicle Using Critic Neural Network

Advanced Intelligent Systems, Volume 7, Issue 3, March 2025.
Quadrotor unmanned aerial vehicle control is critical to maintain flight safety and efficiency, especially when facing external disturbances and model uncertainties. This article presents a robust reinforcement learning control scheme to deal with these challenges.
Yu Cai, Yefeng Yang, Tao Huang, Boyang Li +3 more
wiley +1 more source

Compositional Shield Synthesis for Safe Reinforcement Learning in Partial Observability

IEEE Open Journal of Control Systems
Agents controlled by the output of reinforcement learning (RL) algorithms often transition to unsafe states, particularly in uncertain and partially observable environments.
Steven Carr, Georgios Bakirtzis, Ufuk Topcu +2 more
doaj +1 more source

Multiobjective Environmental Cleanup with Autonomous Surface Vehicle Fleets Using Multitask Multiagent Deep Reinforcement Learning

Advanced Intelligent Systems, EarlyView.
This study presents a multitask strategy for plastic cleanup with autonomous surface vehicles, combining exploration and cleaning phases. A two‐headed Deep Q‐Network shared by all agents is traineded via multiobjective reinforcement learning, producing a Pareto front of trade‐offs.
Dame Seck +4 more
wiley +1 more source

Online algorithms for POMDPs with continuous state, action, and observation spaces

, 2018
Online solvers for partially observable Markov decision processes have been applied to problems with large discrete state spaces, but continuous state, action, and observation spaces remain a challenge.
Kochenderfer, Mykel, Sunberg, Zachary
core +1 more source

Improving Training Result of Partially Observable Markov Decision Process by Filtering Beliefs [PDF]

, 2021
Oscar LiJen Hsu
openalex +1 more source

Polymerase Chain Reaction. Perturbation Theory and Machine Learning Artificial Intelligence‐Experimental Microbiome Analysis: Applications to Ancient DNA and Tree Soil Metagenomics Cases of Study

Advanced Intelligent Systems, EarlyView.
The polymerase chain reaction (PCR).Perturbation Theory and Machine Learning framework integrates perturbation theory and machine learning to classify genetic sequences, distinguishing ancient DNA from modern controls and predicting tree health from soil metagenomic data.
Jose L. Rodriguez +19 more
wiley +1 more source

Entropy-Regularized Partially Observed Markov Decision Processes

IEEE Transactions on Automatic Control
We investigate partially observed Markov decision processes (POMDPs) with cost functions regularized by entropy terms describing state, observation, and control uncertainty. Standard POMDP techniques are shown to offer bounded-error solutions to these entropy-regularized POMDPs, with exact solutions possible when the regularization involves the joint ...
Timothy L. Molloy, Girish N. Nair
openaire +2 more sources

Collaborative Multiagent Closed‐Loop Motion Planning for Multimanipulator Systems

Advanced Intelligent Systems, EarlyView.
This work presents a hierarchical multi‐manipulator planner, emphasizing highly overlapping space. The proposed method leverages an enhanced Dynamic Movement Primitive based planner along with an improvised Multi‐Agent Reinforcement Learning approach to ensure regulatory and mediatory control while ensuring low‐level autonomy. Experiments across varied
Tian Xu, Siddharth Singh, Qing Chang
wiley +1 more source

Deciding the Value 1 Problem for #-acyclic Partially Observable Markov Decision Processes

, 2012
The value 1 problem is a natural decision problem in algorithmic game theory. For partially observable Markov decision processes with reachability objective, this problem is defined as follows: are there strategies that achieve the reachability objective
Gimbert, Hugo, Oualhadj, Youssouf
core +1 more source

pomdp
markov and semi-markov decision processes
dynamic programming

fos: computer and information sciences
markov decision process
reinforcement learning

artificial intelligence cs.ai
computer science - artificial intelligence
deep reinforcement learning