Results 51 to 60 of about 199,848
The lot-streaming flowshop scheduling problem with equal-size sublots (ELFSP) is a significant extension of the classic flowshop scheduling problem, with the objective of minimizing makespan.
Ping Wang, Renato De Leone, Hongyan Sang
doaj +1 more source
This work addresses bi-objective hybrid flow shop scheduling problems considering consistent sublots (Bi-HFSP_CS). The objectives are to minimize the makespan and total energy consumption.
Benxue Lu +3 more
doaj +1 more source
We introduce a new convergent variant of Q-learning, called speedy Q-learning (SQL), to address the problem of slow convergence in the standard form of the Q-learning algorithm. We prove a PAC bound on the performance of SQL, which shows that for an MDP with n state-action pairs and discount factor γ, only T = O(log(n)/(ε^2 (1 - γ)^4)) steps are required ...
Azar, Mohammad Gheshlaghi +3 more
openaire +2 more sources
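The scaling of the quoted PAC bound is easy to make concrete. A small helper sketch (the function name and the constant `c` are placeholders; the abstract gives only the asymptotic form):

```python
import math

def sql_step_bound(n, eps, gamma, c=1.0):
    """Illustrative scaling of T = O(log(n) / (eps^2 * (1 - gamma)^4)).

    `c` stands in for the hidden constant factor, which the abstract
    does not give; only the growth behaviour is meaningful here.
    """
    return c * math.log(n) / (eps ** 2 * (1 - gamma) ** 4)
```

The (1 - γ)^(-4) factor dominates: moving γ from 0.9 to 0.99 multiplies the required steps by a factor of 10^4, while doubling the number of state-action pairs only adds log 2 to the numerator.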
A Reinforcement Learning Approach for Smart Farming [PDF]
At a basic level, the aim of machine learning is to develop solutions for real-life engineering problems and to enhance the performance of different computer tasks, in order to obtain algorithms that are largely independent of human intervention.
Gabriela ENE
doaj
Reinforcement learning (RL) approaches, particularly Q-learning, have emerged as powerful tools for training autonomous agents, allowing them to acquire optimal decision-making policies through interaction with their environment.
Biplov Paneru +3 more
doaj +1 more source
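Several of the entries on this page build on the same tabular update. A minimal sketch of standard Q-learning with an ε-greedy behaviour policy (the `step` environment interface is a made-up placeholder, not taken from any of the cited papers):

```python
import random

def q_learning(n_states, n_actions, step, episodes=500, alpha=0.1,
               gamma=0.9, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy.

    `step(s, a)` is a user-supplied environment function returning
    (next_state, reward, done); it is a stand-in for whatever
    simulator the agent is trained against.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < eps:
                a = rng.randrange(n_actions)          # explore
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])  # exploit
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[s2])
            Q[s][a] += alpha * (target - Q[s][a])     # TD update
            s = s2
    return Q
```

The `max(Q[s2])` bootstrap in the target is the maximization step whose overestimation bias motivates the Double and MinMaxMin variants listed on this page.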
MinMaxMin $Q$-learning is a novel optimistic Actor-Critic algorithm that addresses the problem of overestimation bias ($Q$-estimations are overestimating the real $Q$-values) inherent in conservative RL algorithms. Its core formula relies on the disagreement among $Q$-networks in the form of the min-batch MaxMin $Q$-networks distance which is added to ...
Soffair, Nitsan, Mannor, Shie
openaire +2 more sources
We draw an analogy between static friction in classical mechanics and extrapolation error in off-policy RL, and use it to formulate a constraint that prevents the policy from drifting toward unsupported actions. In this study, we present Frictional Q-learning, a deep reinforcement learning algorithm for continuous control, which extends batch ...
Kim, Hyunwoo, Lee, Hyo Kyung
openaire +2 more sources
In reinforcement learning, the Q-learning algorithm provably converges to the optimal solution. However, as others have demonstrated, Q-learning can also overestimate action values and thereby spend too long exploring unhelpful states. Double Q-learning is a provably convergent alternative that mitigates some of the overestimation issues, though sometimes ...
openaire +2 more sources
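The decoupling this entry refers to can be sketched in a few lines, following the standard two-table Double Q-learning scheme (select the greedy action with one table, evaluate it with the other); the function shape and toy arguments are illustrative, not taken from the cited paper:

```python
import random

def double_q_update(QA, QB, s, a, r, s2, done, alpha=0.1, gamma=0.9,
                    rng=random):
    """One Double Q-learning update over two tables QA and QB.

    On each step a coin flip decides which table is updated; the
    updated table selects the greedy next action, while the other
    table evaluates it, which removes the positive bias of taking a
    max over a single noisy estimate.
    """
    if rng.random() < 0.5:
        QA, QB = QB, QA  # update the other table half the time
    if done:
        target = r
    else:
        a_star = max(range(len(QA[s2])), key=lambda x: QA[s2][x])  # select with QA
        target = r + gamma * QB[s2][a_star]                        # evaluate with QB
    QA[s][a] += alpha * (target - QA[s][a])
```

Because each table's own maximizing action is scored by the other, independently trained, table, the overestimation that a single max-of-noisy-estimates produces is largely cancelled.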

