Nonstationary Continuous Time Markov Decision Processes with Discounted Criterion
openaire +1 more source
Overcoming the Nyquist Limit in Molecular Hyperspectral Imaging by Reinforcement Learning
An explorative spectral acquisition guide automatically selects informative spectral bands to optimize downstream tasks, outperforming full‐spectrum acquisition. The selected hyperspectral data are used for tasks such as unmixing and segmentation. BandOptiNet encodes selection states and outputs optimal bands to guide spectral acquisition. Recent advances ...
Xiaobin Tang +4 more
wiley +1 more source
Quadrotor unmanned aerial vehicle control is critical to maintain flight safety and efficiency, especially when facing external disturbances and model uncertainties. This article presents a robust reinforcement learning control scheme to deal with these challenges.
Yu Cai +3 more
wiley +1 more source
Partially Observable Risk-Sensitive Markov Decision Processes
We consider the problem of minimizing a certainty equivalent of the total or discounted cost generated by a Partially Observable Markov Decision Process (POMDP), over both finite and infinite time horizons.
Bäuerle, Nicole, Rieder, Ulrich
core +1 more source
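For context, the certainty-equivalent criterion named in this entry is conventionally written as follows. This is a standard sketch assuming the exponential utility with risk parameter γ > 0; the paper's exact formulation may differ:

```latex
% Certainty equivalent of a random cost $C$ under the exponential
% utility $U(x) = \exp(\gamma x)$, risk-sensitivity parameter $\gamma > 0$:
\mathrm{CE}(C)
  \;=\; U^{-1}\!\bigl(\mathbb{E}\bigl[U(C)\bigr]\bigr)
  \;=\; \frac{1}{\gamma}\,\log \mathbb{E}\bigl[\exp(\gamma C)\bigr],
% where, in the discounted case, the total cost accumulates as
C \;=\; \sum_{t=0}^{\infty} \beta^{t}\, c(X_t, A_t),
  \qquad 0 < \beta < 1 .
```

As γ → 0 the criterion recovers the ordinary expected cost, which is why this is called the risk-sensitive generalization of the standard criterion.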
Large Language Model‐Based Chatbots in Higher Education
The use of large language models (LLMs) in higher education can facilitate personalized learning experiences, advance asynchronized learning, and support instructors, students, and researchers across diverse fields. The development of regulations and guidelines that address ethical and legal issues is essential to ensure safe and responsible adaptation ...
Defne Yigci +4 more
wiley +1 more source
Variational Autoencoder+Deep Deterministic Policy Gradient addresses low‐light failures of infrared depth sensing for indoor robot navigation. Stage 1 pretrains an attention‐enhanced Variational Autoencoder (Convolutional Block Attention Module+Feature Pyramid Network) to map dark depth frames to a well‐lit reconstruction, yielding a 128‐D latent code ...
Uiseok Lee +7 more
wiley +1 more source
On gradual-impulse control of continuous-time Markov decision processes with multiplicative cost
In this paper, we consider the gradual-impulse control problem of continuous-time Markov decision processes, where the system performance is measured by the expectation of the exponential utility of the total cost. We prove, under very general conditions ...
Guo, Xin +3 more
core
Extending the Bellman equation for MDPs to continuous actions and continuous time in the discounted case [PDF]
Recent work on Markov Decision Processes (MDPs) covers the use of continuous variables and resources, including time. This work is usually done in a framework of bounded resources and finite temporal horizon for which a total reward criterion is often ...
Fabiani, Patrick +2 more
core
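For reference, the discrete-time discounted Bellman optimality equation that this line of work extends to continuous actions and continuous time is standardly written as:

```latex
% Discounted Bellman optimality equation (discrete time,
% discount factor $0 < \gamma < 1$):
V^{*}(s) \;=\; \max_{a \in A}
  \Bigl[\, r(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Bigr].
% With a continuous action set $A$ the max becomes a supremum, and with
% a continuous state space the sum becomes an integral over successors:
V^{*}(s) \;=\; \sup_{a \in A}
  \Bigl[\, r(s,a) \;+\; \gamma \int_{S} V^{*}(s')\, p(s' \mid s, a)\, \mathrm{d}s' \Bigr].
```

The continuous-time extension the entry refers to additionally replaces the fixed per-step discount γ with discounting over a real-valued sojourn time.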
This study introduces a data‐driven framework that combines deep reinforcement learning with classical path planning to achieve adaptive microrobot navigation. By training a surrogate neural network to emulate microrobot dynamics, the approach improves learning efficiency, reduces training time, and enables robust real‐time obstacle avoidance in ...
Amar Salehi +3 more
wiley +1 more source
Deep Reinforcement Learning Approaches for Sensor Data Collection by a Swarm of UAVs
This article presents four decentralized reinforcement learning algorithms for autonomous data harvesting and investigates how collaboration improves collection efficiency. It also presents strategies to minimize training times by improving model flexibility, enabling the algorithms to operate with varying numbers of agents and sensors.
Thiago de Souza Lamenza +2 more
wiley +1 more source

