Results 61 to 70 of about 141,184 (170)
Stochastic Compositional Gradient Descent Under Compositional Constraints
Part of this work was submitted to the Asilomar Conference on Signals, Systems, and ...
Srujan Teja Thomdapu +2 more
openaire +2 more sources
Attentional-Biased Stochastic Gradient Descent
In this paper, we present a simple yet effective provable method (named ABSGD) for addressing the data imbalance or label noise problem in deep learning. Our method is a simple modification to momentum SGD where we assign an individual importance weight to each sample in the mini-batch.
Qi, Qi +4 more
openaire +2 more sources
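The ABSGD snippet above describes the mechanism only at a high level: momentum SGD where each sample in the mini-batch receives an individual importance weight. The sketch below is a hypothetical illustration of that idea in PyTorch; the specific weighting rule (a softmax of the detached per-sample losses scaled by a temperature `lam`), the function name, and the `velocity` buffers are assumptions, not the paper's exact method.

```python
# Hypothetical sketch of importance-weighted momentum SGD (ABSGD-style idea).
# The softmax-of-losses weighting and all names are illustrative assumptions.
import torch

def weighted_momentum_sgd_step(model, batch_x, batch_y, velocity,
                               lr=0.1, momentum=0.9, lam=1.0):
    # velocity: list of zero-initialized tensors shaped like model.parameters()
    losses = torch.nn.functional.cross_entropy(
        model(batch_x), batch_y, reduction="none")      # per-sample losses
    weights = torch.softmax(losses.detach() / lam, dim=0)  # individual weights
    loss = (weights * losses).sum()                      # weighted batch loss
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p, v in zip(model.parameters(), velocity):
            v.mul_(momentum).add_(p.grad)                # momentum buffer
            p.add_(v, alpha=-lr)                         # parameter update
```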
Why random reshuffling beats stochastic gradient descent [PDF]
We analyze the convergence rate of the random reshuffling (RR) method, which is a randomized first-order incremental algorithm for minimizing a finite sum of convex component functions. RR proceeds in cycles: at each cycle it picks a uniformly random order (permutation) and processes the component functions one at a time according to this order, so each component function is used exactly once per cycle.
M. Gürbüzbalaban +2 more
openaire +4 more sources
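To make the cycle structure described above concrete, here is a minimal, self-contained sketch of random reshuffling on a finite sum f(x) = (1/n) Σ_i f_i(x). The quadratic components and all names are illustrative, not taken from the paper.

```python
# Minimal random reshuffling (RR) sketch: one fresh permutation per cycle,
# each component gradient used exactly once per pass over the data.
import numpy as np

def random_reshuffling(grads, x0, lr=0.01, epochs=50, seed=0):
    """grads: list of n callables, each returning the gradient of f_i at x."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    n = len(grads)
    for _ in range(epochs):            # one cycle = one pass over the data
        order = rng.permutation(n)     # uniformly random permutation
        for i in order:                # each component visited exactly once
            x -= lr * grads[i](x)
    return x

# Example: f_i(x) = 0.5 * ||x - a_i||^2, so grad f_i(x) = x - a_i.
a = np.random.default_rng(1).normal(size=(5, 3))
grads = [lambda x, ai=ai: x - ai for ai in a]
print(random_reshuffling(grads, x0=np.zeros(3)))   # approaches the mean of a
```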
Uncertainty in the new power system has increased, exposing the limitations of traditional stability analysis methods. Therefore, from the perspective of the three-dimensional static security region (SSR), we propose a novel approach for system ...
Jiahui Wu +3 more
doaj +1 more source
Stochastic Adaptive Gradient Descent Without Descent
We introduce a new adaptive step-size strategy for convex optimization with stochastic gradients that exploits the local geometry of the objective function only by means of a first-order stochastic oracle and without any hyper-parameter tuning. The method comes from a theoretically grounded adaptation of the Adaptive Gradient Descent Without Descent ...
Aujol, Jean-François +2 more
openaire +2 more sources
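For context on the rule being adapted, the sketch below shows my understanding of the underlying (deterministic) Adaptive Gradient Descent Without Descent step-size: the step is capped by a growth factor and by a local curvature estimate built from successive iterates and gradients. The paper's stochastic variant is not reproduced here; the function name, initialization, and constants are assumptions for illustration.

```python
# Hedged sketch of the deterministic AdGD step-size rule (not the paper's
# stochastic adaptation). Variable names and defaults are illustrative.
import numpy as np

def adgd(grad, x0, lam0=1e-6, iters=200):
    x_prev, g_prev = x0.copy(), grad(x0)
    x = x_prev - lam0 * g_prev                 # one initial plain gradient step
    lam_prev, theta_prev = lam0, np.inf
    for _ in range(iters):
        g = grad(x)
        denom = 2.0 * np.linalg.norm(g - g_prev) + 1e-12
        lam = min(np.sqrt(1.0 + theta_prev) * lam_prev,   # growth cap
                  np.linalg.norm(x - x_prev) / denom)     # local curvature estimate
        x_prev, g_prev = x, g
        x = x - lam * g
        theta_prev, lam_prev = lam / lam_prev, lam
    return x
```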
Federated Accelerated Stochastic Gradient Descent
Accepted to NeurIPS 2020. Best paper at the International Workshop on Federated Learning for User Privacy and Data Confidentiality, held in conjunction with ICML 2020 (FL-ICML'20).
Yuan, Honglin, Ma, Tengyu
openaire +2 more sources
Beyond Convexity: Stochastic Quasi-Convex Optimization
Stochastic convex optimization is a basic and well-studied primitive in machine learning. It is well known that convex and Lipschitz functions can be minimized efficiently using Stochastic Gradient Descent (SGD).
Hazan, Elad +2 more
core
Stochastic Modified Flows for Riemannian Stochastic Gradient Descent
We give quantitative estimates for the rate of convergence of Riemannian stochastic gradient descent (RSGD) to Riemannian gradient flow and to a diffusion process, the so-called Riemannian stochastic modified flow (RSMF). Using tools from stochastic differential geometry we show that, in the small learning rate regime, RSGD can be approximated by the ...
Benjamin Gess +2 more
openaire +4 more sources
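As a reference point for the snippet above, the display below states the standard RSGD recursion (via the exponential map) and the Riemannian gradient-flow ODE it approximates in the small learning-rate regime. The notation is mine, not copied from the paper, and the refined diffusion correction (the RSMF itself) is not reproduced here.

```latex
% Hedged sketch: RSGD recursion on a manifold and its gradient-flow limit as
% the learning rate \eta \to 0; the RSMF diffusion refinement is in the paper.
\[
  X_{k+1} \;=\; \exp_{X_k}\!\bigl(-\eta\, \operatorname{grad} F(X_k,\xi_k)\bigr),
  \qquad \xi_k \overset{\text{i.i.d.}}{\sim} \mathcal{D},
\]
\[
  \dot{X}_t \;=\; -\operatorname{grad} f(X_t),
  \qquad f(x) \;=\; \mathbb{E}_{\xi}\bigl[F(x,\xi)\bigr].
\]
```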
Adam Algorithm with Step Adaptation
Adam (Adaptive Moment Estimation) is a well-known algorithm for the first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.
Vladimir Krutikov +2 more
doaj +1 more source
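Since the snippet above characterizes Adam by its adaptive estimates of lower-order moments, here is a compact reference implementation of the standard Adam update (Kingma & Ba). It does not include the step-adaptation scheme proposed in the paper; names and defaults are the usual ones.

```python
# Standard Adam for reference: exponential moving averages of the gradient
# (first moment) and squared gradient (second moment), with bias correction.
import numpy as np

def adam(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, iters=1000):
    x = x0.astype(float).copy()
    m = np.zeros_like(x)            # first-moment estimate
    v = np.zeros_like(x)            # second-moment estimate
    for t in range(1, iters + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)     # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)     # bias-corrected second moment
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x
```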
Parle: parallelizing stochastic gradient descent
We propose a new algorithm called Parle for parallel training of deep networks that converges 2-4x faster than a data-parallel implementation of SGD, while achieving significantly improved error rates that are nearly state-of-the-art on several benchmarks including CIFAR-10 and CIFAR-100, without introducing any additional hyper-parameters.
Chaudhari, Pratik +5 more
openaire +2 more sources

