Goal recognition over POMDPs: inferring the intention of a POMDP agent
Plan recognition is the problem of inferring the goals and plans of an agent from partial observations of her behavior. Recently, it has been shown that the problem can be formulated and solved using planners, reducing plan recognition to plan generation. In this work, we extend this model-based approach to plan recognition to the POMDP setting ...
Ramirez M., Geffner H.
openaire +2 more sources
Deep Reinforcement Learning Approach for Trading Automation in the Stock Market
Deep Reinforcement Learning (DRL) algorithms can scale to previously intractable problems. The automation of profit generation in the stock market is possible using DRL, by combining the financial assets price “prediction” step and the ...
Taylan Kabbani, Ekrem Duman
doaj +1 more source
Nonapproximability Results for Partially Observable Markov Decision Processes
We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies are unlikely to have, or simply do not have, guarantees of finding policies within a constant factor or a constant ...
Goldsmith, J., Lusena, C., Mundhenk, M.
core +1 more source
ABSTRACT Personal autonomous vehicles can sense their surrounding environment, plan their route, and drive with little or no involvement of human drivers. Despite the latest technological advancements and the hopeful announcements made by leading entrepreneurs, to date no personal vehicle is approved for road circulation in a “fully” or “semi ...
Xingshuai Dong +13 more
wiley +1 more source
TNCOA: Efficient Exploration via Observation‐Action Constraint on Trajectory‐Based Intrinsic Reward
ABSTRACT Efficient exploration is critical in handling sparse rewards and partial observability in deep reinforcement learning. However, most existing intrinsic reward methods based on novelty rely on single‐step observations or Euclidean distances.
Jingxiang Ma, Hongbin Ma, Youzhi Zhang
wiley +1 more source
This note presents an analytical framework for decision-making in drone swarm systems operating under uncertainty, based on the integration of Partially Observable Markov Decision Processes (POMDP) with Deep Deterministic Policy Gradient (DDPG ...
M. Z. Zgurovsky +2 more
doaj +1 more source
Mixed Integer Linear Programming For Exact Finite-Horizon Planning In Decentralized POMDPs
We consider the problem of finding an n-agent joint policy for the optimal finite-horizon control of a decentralized POMDP (Dec-POMDP). This is a problem of very high complexity (NEXP-hard for n >= 2).
Aras, Raghav +2 more
core +3 more sources
Medical Knowledge Integration Into Reinforcement Learning Algorithms for Dynamic Treatment Regimes
Summary The goal of precision medicine is to provide individualised treatment at each stage of chronic diseases, a concept formalised by dynamic treatment regimes (DTR). These regimes adapt treatment strategies based on decision rules learned from clinical data to enhance therapeutic effectiveness.
Sophia Yazzourh +3 more
wiley +1 more source
In recent decades, collaborative robots capable of operating outside their cages have been widely used in industry to assist humans in mundane and harsh manufacturing tasks. Although such robots are inherently safe by design, they are commonly accompanied
Angeliki Zacharaki +2 more
doaj +1 more source
Recent advances in autonomy of unmanned aerial vehicles (UAVs) have increased their use in remote sensing applications, such as precision agriculture, biosecurity, disaster monitoring, and surveillance.
Juan Sandino +4 more
doaj +1 more source

