Policy search - Open Access .click


When users work with AI agents, they form conscious or subconscious expectations of them. Meeting user expectations is crucial for such agents to engage in successful interactions and teaming. However, users may form expectations of an agent that differ from the agent's planned behaviors.
Hanni, Akkamahadevi, Montaño, Jonathan, Zhang, Yu +2 more
openaire

Bayesian Policy Search with Policy Priors

2011
United States. Air Force Office of Scientific Research (FA9550-07-1-0075)
Wingate, David +4 more
openaire

The elusive search for the public voice in health policy: the case for ‘systems thinking’

2017
John Boswell
openalex

A Monetary Theory with Non-Degenerate Distributions

Dispersion of money balances among individuals is the basis for a range of policies but it has been abstracted from in monetary theory for tractability reasons.
Guido Menzio, Hongfei Sun, Shouyong Shi
core

Optimal unemployment insurance with monitoring and sanctions

This paper analyzes the design of optimal unemployment insurance in a search equilibrium framework where search effort among the unemployed is not perfectly observable. We examine to what extent the optimal policy involves monitoring of search effort and
Boone, Jan +3 more
core

Optimized look‐ahead tree policies: a bridge between look‐ahead tree policies and direct policy search

2013
Tobias Jung +3 more
openalex

Bayesian Nonparametric Policy Search with Application to Periodontal Recall Intervals.

J Am Stat Assoc, 2020
Guan Q, Reich BJ, Laber EB, Bandyopadhyay D. +3 more
europepmc

Task Feasibility Maximization Using Model-Free Policy Search and Model-Based Whole-Body Control.

Front Robot AI, 2020
Lober R, Sigaud O, Padois V.
europepmc

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

2011
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way.
Deisenroth, MP, Rasmussen, CE
core

Data-Efficient Policy Evaluation Through Behavior Policy Search

2017
We consider the task of evaluating a policy for a Markov decision process (MDP). The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance. We show that the data collected from deploying a different policy, commonly called the behavior policy, can be used to produce unbiased estimates with lower mean ...
Hanna, Josiah P. +3 more
openaire
