Asymmetric Valleys: Beyond Sharp and Flat Local Minima | Haowei He · Gao Huang · Yang Yuan |
Beyond Alternating Updates for Matrix Factorization with Inertial Bregman Proximal Gradient Algorithms | Mahesh Chandra Mukkamala · Peter Ochs |
Efficiently escaping saddle points on manifolds | Christopher Criscitiello · Nicolas Boumal |
Global Convergence of Least Squares EM for Demixing Two Log-Concave Densities | Wei Qian · Yuqian Zhang · Yudong Chen |
Learning dynamic polynomial proofs | Alhussein Fawzi · Mateusz Malinowski · Hamza Fawzi · Omar Fawzi |
Non-asymptotic Analysis of Stochastic Methods for Non-Smooth Non-Convex Regularized Problems | Yi Xu · Rong Jin · Tianbao Yang |
Nonconvex Low-Rank Tensor Completion from Noisy Data | Changxiao Cai · Gen Li · H. Vincent Poor · Yuxin Chen |
Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates | Sharan Vaswani · Aaron Mishkin · Issam Laradji · Mark Schmidt · Gauthier Gidel · Simon Lacoste-Julien |
SpiderBoost and Momentum: Faster Variance Reduction Algorithms | Zhe Wang · Kaiyi Ji · Yi Zhou · Yingbin Liang · Vahid Tarokh |
SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points | Zhize Li |
The Landscape of Non-convex Empirical Risk with Degenerate Population Risk | Shuang Li · Gongguo Tang · Michael B Wakin |
Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models | Stefano Sarao Mannelli · Giulio Biroli · Chiara Cammarota · Florent Krzakala · Lenka Zdeborová |
A Linearly Convergent Method for Non-Smooth Non-Convex Optimization on the Grassmannian with Applications to Robust Subspace and Dictionary Learning | Zhihui Zhu · Tianyu Ding · Daniel Robinson · Manolis Tsakiris · René Vidal |
Competitive Gradient Descent | Florian Schaefer · Anima Anandkumar |
DINGO: Distributed Newton-Type Method for Gradient-Norm Optimization | Rixon Crane · Fred Roosta |
Efficient Smooth Non-Convex Stochastic Compositional Optimization via Stochastic Recursive Gradient Descent | Huizhuo Yuan · Xiangru Lian · Chris Junchi Li · Ji Liu · Wenqing Hu |
Efficiently avoiding saddle points with zero order methods: No gradients required | Emmanouil-Vasileios Vlatakis-Gkaragkounis · Lampros Flokas · Georgios Piliouras |
Escaping from saddle points on Riemannian manifolds | Yue Sun · Nicolas Flammarion · Maryam Fazel |
Exponentially convergent stochastic k-PCA without variance reduction | Cheng Tang |
First-order methods almost always avoid saddle points: The case of vanishing step-sizes | Ioannis Panageas · Georgios Piliouras · Xiao Wang |
Learning Sparse Distributions using Iterative Hard Thresholding | Jacky Y Zhang · Rajiv Khanna · Anastasios Kyrillidis · Oluwasanmi Koyejo |
Local SGD with Periodic Averaging: Tighter Analysis and Adaptive Synchronization | Farzin Haddadpour · Mohammad Mahdi Kamani · Mehrdad Mahdavi · Viveck Cadambe |
Max-value Entropy Search for Multi-Objective Bayesian Optimization | Syrine Belakaria · Aryan Deshwal · Janardhan Rao Doppa |
Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods | Maher Nouiehed · Maziar Sanjabi · Tianjian Huang · Jason Lee · Meisam Razaviyayn |
A Nonconvex Approach for Exact and Efficient Multichannel Sparse Blind Deconvolution | Qing Qu · Xiao Li · Zhihui Zhu |
An Inexact Augmented Lagrangian Framework for Nonconvex Optimization with Nonlinear Constraints | Mehmet Fatih Sahin · Armin Eftekhari · Ahmet Alacaoglu · Fabian Latorre · Volkan Cevher |
Bayesian Optimization with Unknown Search Space | Huong Ha · Santu Rana · Sunil Gupta · Thanh Nguyen · Hung Tran-The · Svetha Venkatesh |
Calculating Optimistic Likelihoods Using (Geodesically) Convex Optimization | Viet Anh Nguyen · Soroosh Shafieezadeh Abadeh · Man-Chung Yue · Daniel Kuhn · Wolfram Wiesemann |
Communication-Efficient Distributed Blockwise Momentum SGD with Error-Feedback | Shuai Zheng · Ziyue Huang · James Kwok |
Distributed Low-rank Matrix Factorization With Exact Consensus | Zhihui Zhu · Qiuwei Li · Xinshuo Yang · Gongguo Tang · Michael B Wakin |
Efficient Algorithms for Smooth Minimax Optimization | Kiran Thekumparampil · Prateek Jain · Praneeth Netrapalli · Sewoong Oh |
Momentum-Based Variance Reduction in Non-Convex SGD | Ashok Cutkosky · Francesco Orabona |
Provable Non-linear Inductive Matrix Completion | Kai Zhong · Zhao Song · Prateek Jain · Inderjit S Dhillon |
Semi-flat minima and saddle points by embedding neural networks to overparameterization | Kenji Fukumizu · Shoichiro Yamaguchi · Yoh-ichi Mototake · Mirai Tanaka |
Shadowing Properties of Optimization Algorithms | Antonio Orvieto · Aurelien Lucchi |