| A Fourier Perspective on Model Robustness in Computer Vision | Dong Yin · Raphael Gontijo Lopes · Jon Shlens · Ekin Dogus Cubuk · Justin Gilmer |
| A Graph Theoretic Framework of Recomputation Algorithms for Memory-Efficient Backpropagation | Mitsuru Kusumoto · Takuya Inoue · Gentaro Watanabe · Takuya Akiba · Masanori Koyama |
| A Mean Field Theory of Quantized Deep Networks: The Quantization-Depth Trade-Off | Yaniv Blumenfeld · Dar Gilboa · Daniel Soudry |
| AutoAssist: A Framework to Accelerate Training of Deep Neural Networks | Jiong Zhang · Hsiang-Fu Yu · Inderjit S Dhillon |
| Backprop with Approximate Activations for Memory-efficient Network Training | Ayan Chakrabarti · Benjamin Moseley |
| Bridging Machine Learning and Logical Reasoning by Abductive Learning | Wang-Zhou Dai · Qiuling Xu · Yang Yu · Zhi-Hua Zhou |
| E2-Train: Training State-of-the-art CNNs with Over 80% Less Energy | Ziyu Jiang · Yue Wang · Xiaohan Chen · Pengfei Xu · Yang Zhao · Yingyan Lin · Zhangyang Wang |
| Hybrid 8-bit Floating Point (HFP8) Training and Inference for Deep Neural Networks | Xiao Sun · Jungwook Choi · Chia-Yu Chen · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Xiaodong Cui · Wei Zhang · Kailash Gopalakrishnan |
| Initialization of ReLUs for Dynamical Isometry | Rebekka Burkholz · Alina Dubatovka |
| Invert to Learn to Invert | Patrick Putzky · Max Welling |
| Learning Data Manipulation for Augmentation and Weighting | Zhiting Hu · Bowen Tan · Russ Salakhutdinov · Tom Mitchell · Eric Xing |
| Robust Bi-Tempered Logistic Loss Based on Bregman Divergences | Ehsan Amid · Manfred K. Warmuth · Rohan Anil · Tomer Koren |
| When does label smoothing help? | Rafael Müller · Simon Kornblith · Geoffrey E Hinton |