Consumers' psychological constructs regarding hybrid meat products: A scoping review protocol. [PDF]
de Veld J, Pekdemir C, Ten Hoor G.
europepmc +1 more source
Policy evaluation frameworks for rare diseases: a scoping review. [PDF]
Çakmak Barsbay M, Aydamak MY.
europepmc +1 more source
Excess mortality in Mainland China after the end of the Zero COVID policy: A systematic review. [PDF]
Fung IC +10 more
europepmc +1 more source
Growth Management and Constitutional Rights Part II: The States Search for a Growth Policy
Fred P. Bosselman
openalex +1 more source
Some of the following articles may not be open access.
2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2011
In this paper we address the combination of batch reinforcement-learning (BRL) techniques with direct policy search (DPS) algorithms in the context of robot learning. Batch value-based algorithms (such as fitted Q-iteration) have been proved to outperform online ones in many complex applications, but they share the same difficulties in solving problems
Migliavacca, Martino +4 more
openaire +1 more source
2003
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geometric methods. This leads us to propose a natural metric on controller parameterization that results from considering the manifold of probability distributions over paths ...
J. Andrew Bagnell, Jeff Schneider
openaire +1 more source
Nonconvex Policy Search Using Variational Inequalities
Neural Computation, 2017
Policy search is a class of reinforcement learning algorithms for finding optimal policies in control problems with limited feedback. These methods have been shown to be successful in high-dimensional problems such as robotics control. Though successful, current methods can lead to unsafe policy parameters that potentially could damage hardware units.
Zhan, Yusen +2 more
openaire +3 more sources
Policy Search by Dynamic Programming
2003
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to visit each state), then we can derive a policy search algorithm that terminates in a finite number of steps, and for which we can provide non-trivial performance guarantees.
J. Andrew Bagnell +3 more
openaire +1 more source

