Q-learning - Open Access .click

Some of the following articles may not be open access.

Neural Computing & Applications, 2003
In this paper we introduce a novel neural reinforcement learning method. Unlike existing methods, our approach does not need a model of the system and can be trained directly using the measurements of the system. We achieve this by using only one function approximator and approximating the improved policy from it.
ten Hagen, S.H.G., Kröse, B.J.A.

Q-learning automaton

IEEE/WIC International Conference on Intelligent Agent Technology, 2003. IAT 2003., 2004
Reinforcement learning is the problem faced by a controller that must learn behavior through trial-and-error interactions with a dynamic environment. The controller's goal is to maximize reward over time by producing an effective mapping of states to actions, called a policy.
Fei Qian, Hironori Hirata

Quad-Q-learning

IEEE Transactions on Neural Networks, 2000
This paper develops the theory of quad-Q-learning which is a new learning algorithm that evolved from Q-learning. Quad-Q-learning is applicable to problems that can be solved by "divide and conquer" techniques. Quad-Q-learning concerns an autonomous agent that learns without supervision to act optimally to achieve specified goals.
Clifford Clausen, Harry Wechsler

Backward Q-learning: The combination of Sarsa algorithm and Q-learning

Engineering Applications of Artificial Intelligence, 2013
Reinforcement learning (RL) has been applied to many fields, but the action selection policy still faces a dilemma between exploration and exploitation. Q-learning and Sarsa are two well-known reinforcement learning algorithms, but they have different characteristics.
Yin-Hao Wang, Tzuu-Hseng S. Li, Chih-Jui Lin +2 more
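
The different characteristics of Q-learning and Sarsa mentioned in the abstract above come down to the bootstrap target. The sketch below shows the standard tabular updates only; it is not the backward Q-learning method of the paper, and the state names, action names, and parameter values are hypothetical choices made for illustration.

```python
from collections import defaultdict


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best next action."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the next action actually taken."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


# Hypothetical two-action problem with identical starting tables.
actions = ["left", "right"]
Q_ql = defaultdict(float)
Q_sarsa = defaultdict(float)
for table in (Q_ql, Q_sarsa):
    table[("s1", "left")] = 1.0
    table[("s1", "right")] = 5.0

# Same transition (s0, right, reward 0, s1); Sarsa's next action is the
# exploratory "left", while Q-learning ignores what is taken next.
q_learning_update(Q_ql, "s0", "right", 0.0, "s1", actions)
sarsa_update(Q_sarsa, "s0", "right", 0.0, "s1", "left")
```

With these numbers the Q-learning entry moves toward 0.9 * 5.0, while the Sarsa entry moves toward 0.9 * 1.0, because Sarsa's target depends on the action its behavior policy selected.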

Accurate Q-Learning

2018
To address the problem that Q-learning can suffer from large overestimations in some stochastic environments, we first propose a new form of Q-learning, prove that it is equivalent to the incremental form, and analyze why the convergence rate of Q-learning is affected by positive bias.
Zhihui Hu +3 more
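
The overestimation mentioned in the abstract above is commonly attributed to taking a maximum over noisy value estimates. The sketch below is a small simulation of that effect, not the method proposed in the paper; it assumes every action has a true value of zero and that the estimates carry independent Gaussian noise. The function name and all parameters are hypothetical.

```python
import random


def max_estimate_bias(n_actions=10, noise_std=1.0, n_trials=2000, seed=0):
    """Average of max(noisy estimates) when every true value is 0.

    The true maximum is 0, so the returned average is the bias
    introduced by the max operator alone.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        estimates = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        total += max(estimates)
    return total / n_trials


bias_many_actions = max_estimate_bias(n_actions=10)
bias_single_action = max_estimate_bias(n_actions=1)
```

With a single action the max has nothing to choose between, so the average stays near zero; with several actions the average is clearly positive even though no action is actually better than another.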

Underestimation estimators to Q-learning

Information Sciences, 2022
Patigül Abliz, Shi Ying

Learning Rates for Q-Learning

2001
In this paper we derive convergence rates for Q-learning. We show an interesting relationship between the convergence rate and the learning rate used in Q-learning. For a polynomial learning rate, one which is 1/t^ω at time t where ω ∈ (1/2, 1), we show that the convergence rate is polynomial in 1/(1-γ), where γ is the discount factor.
Eyal Even-Dar, Yishay Mansour
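
The polynomial learning rate described in the abstract above, 1/t^ω with ω in (1/2, 1), can be written as a short schedule function. This is a minimal sketch; the function name and the default ω = 0.75 are assumptions made for illustration and are not taken from the paper.

```python
def polynomial_learning_rate(t, omega=0.75):
    """Return the step size 1 / t**omega for time step t >= 1."""
    if t < 1:
        raise ValueError("t must be at least 1")
    if not 0.5 < omega < 1.0:
        raise ValueError("omega must lie strictly between 1/2 and 1")
    return 1.0 / (t ** omega)


# Step sizes at a few time steps; they decay more slowly than 1/t.
rates = [polynomial_learning_rate(t) for t in (1, 16, 256)]
```

In a tabular Q-learning loop, t would typically count the visits to the state-action pair being updated, and the returned value would replace a fixed step size.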

Boundedness of iterates in Q-Learning

Systems & Control Letters, 2006

Higher order Q-Learning

2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2011
Higher order learning is a statistical relational learning framework in which relationships between different instances of the same class are leveraged (Ganiz, Lytkin and Pottenger, 2009). Learning can be supervised or unsupervised. In contrast, reinforcement learning (Q-Learning) is a technique for learning in an unknown state space.
Ashley Edwards, William M. Pottenger

Fuzzy Q-learning and dynamical fuzzy Q-learning

Proceedings of 1994 IEEE 3rd International Fuzzy Systems Conference, 1994
This paper proposes two reinforcement-based learning algorithms: 1) fuzzy Q-learning, an adaptation of Watkins' Q-learning for fuzzy inference systems; and 2) dynamical fuzzy Q-learning, which eliminates some drawbacks of both Q-learning and fuzzy Q-learning. These algorithms are used to improve the rule base of a fuzzy controller.

reinforcement learning
deep reinforcement learning
artificial intelligence

machine learning
path planning