Faster width-dependent algorithm for mixed packing and covering LPs | Digvijay Boob · Saurabh Sawlani · Di Wang |
First order expansion of convex regularized estimators | Pierre Bellec · Arun Kuchibhotla |
On the number of variables to use in principal component regression | Ji Xu · Daniel Hsu |
Implicit Regularization for Optimal Sparse Recovery | Tomas Vaskevicius · Varun Kanade · Patrick Rebeschini |
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies | Ronen Basri · David Jacobs · Yoni Kasten · Shira Kritchman |
How degenerate is the parametrization of neural networks with the ReLU activation function? | Dennis Maximilian Elbrächter · Julius Berner · Philipp Grohs |
Implicit Regularization in Deep Matrix Factorization | Sanjeev Arora · Nadav Cohen · Wei Hu · Yuping Luo |
The Impact of Regularization on High-dimensional Logistic Regression | Fariborz Salehi · Ehsan Abbasi · Babak Hassibi |
The Implicit Bias of AdaGrad on Separable Data | Qian Qian · Xiaoyuan Qian |
Time Matters in Regularizing Deep Networks: Weight Decay and Data Augmentation Affect Early Learning Dynamics, Matter Little Near Convergence | Aditya Sharad Golatkar · Alessandro Achille · Stefano Soatto |