A Q‐Learning Algorithm to Solve the Two‐Player Zero‐Sum Game Problem for Nonlinear Systems
This paper deals with the two‐player zero‐sum game problem, which is a bounded $L_2$‐gain robust control problem. Finding an analytical solution to the complex Hamilton‐Jacobi‐Isaacs (HJI) equation is a challenging task.
Afreen Islam +2 more
wiley +1 more source
Asymptotic Optimality of a Time Optimal Path Parametrization Algorithm
Time Optimal Path Parametrization is the problem of minimizing the time interval during which an actuation constrained agent can traverse a given path. Recently, an efficient linear-time algorithm for solving this problem was proposed.
Karaman, Sertac +2 more
core +1 more source
Optimistic Agents Are Asymptotically Optimal [PDF]
We use optimism to introduce generic asymptotically optimal reinforcement learning agents. They achieve, with an arbitrary finite or compact class of environments, asymptotically optimal behavior. Furthermore, in the finite deterministic case we provide finite error bounds.
Sunehag, Peter, Hutter, Marcus
openaire +2 more sources
This paper proposes two projector‐based Hopfield neural network (HNN) estimators for online, constrained parameter estimation under time‐varying data, additive disturbances, and slowly drifting physical parameters. The first is a constraint‐aware HNN that enforces linear equalities and inequalities (via slack neurons) and continuously tracks the ...
Miguel Pedro Silva
wiley +1 more source
Cross-Validation Model Averaging for Generalized Functional Linear Model
Functional data are a common and important data type in econometrics and have become increasingly easy to collect in the big data era. To improve estimation accuracy and reduce forecast risks with functional data, in this paper, we propose a novel cross ...
Haili Zhang, Guohua Zou
doaj +1 more source
Convergence rate of linear two-time-scale stochastic approximation
We study the rate of convergence of linear two-time-scale stochastic approximation methods. We consider two-time-scale linear iterations driven by i.i.d. noise, prove some results on their asymptotic covariance and establish asymptotic normality.
Konda, Vijay R., Tsitsiklis, John N.
core +1 more source
On Asymptotically Optimal Tests
Sequences of tests with error $\exp(-nA)$ of the first type are investigated. It is shown that the error of the second type of such a sequence of tests is bounded by $\exp(- nB)$ where $B$ is determined by the Kullback-Leibler information distance of the hypotheses tested.
openaire +3 more sources
Current Tracking Adaptive Control of Brushless DC Motors
In this paper, the current tracking for Brushless Direct Current motors is approached considering uncertainty in the parameters of the motor's model. An adaptive control scheme to compensate electrical parameters uncertainty is proposed without requiring any knowledge of the mechanical parameters.
Fernanda Ramos‐García +3 more
wiley +1 more source
Optimality of Thompson Sampling for Gaussian Bandits Depends on Priors [PDF]
In stochastic bandit problems, a Bayesian policy called Thompson sampling (TS) has recently attracted much attention for its excellent empirical performance.
Akimichi Takemura +3 more
core +1 more source
Asymptotically Optimal Weighted Numerical Integration
We study numerical integration of Hölder-type functions with respect to weights on the real line. Our study extends previous work by F. Curbera [2] and relies on a connection between this problem and the approximation of distribution functions by empirical ones.
openaire +3 more sources

