Can SGD Learn Recurrent Neural Networks with Provable Generalization? | Zeyuan Allen-Zhu · Yuanzhi Li |
Input-Cell Attention Reduces Vanishing Saliency of Recurrent Neural Networks | Aya Abdelsalam Ismail · Mohamed Gunady · Luiz Pessoa · Hector Corrada Bravo · Soheil Feizi |
Input-Output Equivalence of Unitary and Contractive RNNs | Melikasadat Emami · Mojtaba Sahraee Ardakan · Sundeep Rangan · Alyson Fletcher |
Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods | Kevin Liang · Guoyin Wang · Yitong Li · Ricardo Henao · Lawrence Carin |
Legendre Memory Units: Continuous-Time Representation in Recurrent Neural Networks | Aaron Voelker · Ivana Kajić · Chris Eliasmith |
Non-normal Recurrent Neural Network (nnRNN): learning long time dependencies while improving expressivity with transient dynamics | Giancarlo Kerg · Kyle Goyette · Maximilian Puelma Touzel · Gauthier Gidel · Eugene Vorontsov · Yoshua Bengio · Guillaume Lajoie |
Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics | Niru Maheswaranathan · Alex Williams · Matthew Golub · Surya Ganguli · David Sussillo |
Root Mean Square Layer Normalization | Biao Zhang · Rico Sennrich |
Universal Approximation of Input-Output Maps by Temporal Convolutional Nets | Joshua Hanson · Maxim Raginsky |