When users work with AI agents, they form conscious or subconscious expectations of them. Meeting user expectations is crucial for such agents to engage in successful interactions and teaming. However, users may form expectations of an agent that differ from the agent's planned behaviors.
Hanni, Akkamahadevi +2 more
openaire +2 more sources
Bayesian Policy Search with Policy Priors
United States. Air Force Office of Scientific Research (FA9550-07-1-0075)
Wingate, David +4 more
openaire +2 more sources
The elusive search for the public voice in health policy: the case for ‘systems thinking’ [PDF]
John Boswell
openalex +1 more source
A Monetary Theory with Non-Degenerate Distributions [PDF]
Dispersion of money balances among individuals is the basis for a range of policies but it has been abstracted from in monetary theory for tractability reasons.
Guido Menzio, Hongfei Sun, Shouyong Shi
core
Optimal unemployment insurance with monitoring and sanctions [PDF]
This paper analyzes the design of optimal unemployment insurance in a search equilibrium framework where search effort among the unemployed is not perfectly observable. We examine to what extent the optimal policy involves monitoring of search effort and
Boone, Jan +3 more
core
Optimized look‐ahead tree policies: a bridge between look‐ahead tree policies and direct policy search [PDF]
Tobias Jung +3 more
openalex +1 more source
Bayesian Nonparametric Policy Search with Application to Periodontal Recall Intervals. [PDF]
Guan Q +3 more
europepmc +1 more source
Task Feasibility Maximization Using Model-Free Policy Search and Model-Based Whole-Body Control. [PDF]
Lober R, Sigaud O, Padois V.
europepmc +1 more source
PILCO: A Model-Based and Data-Efficient Approach to Policy Search
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way.
Deisenroth, MP, Rasmussen, CE
core +3 more sources
Data-Efficient Policy Evaluation Through Behavior Policy Search
We consider the task of evaluating a policy for a Markov decision process (MDP). The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance. We show that the data collected from deploying a different policy, commonly called the behavior policy, can be used to produce unbiased estimates with lower mean ...
Hanna, Josiah P. +3 more
openaire +2 more sources

