SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems | Alex Wang · Yada Pruksachatkun · Nikita Nangia · Amanpreet Singh · Julian Michael · Felix Hill · Omer Levy · Samuel Bowman
A Tensorized Transformer for Language Modeling | Xindian Ma · Peng Zhang · Shuai Zhang · Nan Duan · Yuexian Hou · Ming Zhou · Dawei Song |
AttentionXML: Label Tree-based Attention-Aware Deep Model for High-Performance Extreme Multi-Label Text Classification | Ronghui You · Zihan Zhang · Ziye Wang · Suyang Dai · Hiroshi Mamitsuka · Shanfeng Zhu |
Comparing Unsupervised Word Translation Methods Step by Step | Mareike Hartmann · Yova Kementchedjhieva · Anders Søgaard |
Glyce: Glyph-vectors for Chinese Character Representations | Yuxian Meng · Wei Wu · Fei Wang · Xiaoya Li · Ping Nie · Fan Yin · Muyu Li · Qinghong Han · Jiwei Li
Hierarchical Optimal Transport for Document Representation | Mikhail Yurochkin · Sebastian Claici · Edward Chien · Farzaneh Mirzazadeh · Justin M Solomon |
Improving Textual Network Learning with Variational Homophilic Embeddings | Wenlin Wang · Chenyang Tao · Zhe Gan · Guoyin Wang · Liqun Chen · Xinyuan Zhang · Ruiyi Zhang · Qian Yang · Ricardo Henao · Lawrence Carin |
Ouroboros: On Accelerating Training of Transformer-Based Language Models | Qian Yang · Zhouyuan Huo · Wenlin Wang · Lawrence Carin |
Fast Structured Decoding for Sequence Models | Zhiqing Sun · Zhuohan Li · Haoqing Wang · Di He · Zi Lin · Zhihong Deng |
Can Unconditional Language Models Recover Arbitrary Sentences? | Nishant Subramani · Samuel Bowman · Kyunghyun Cho |
Controllable Unsupervised Text Attribute Transfer via Editing Entangled Latent Representation | Ke Wang · Hang Hua · Xiaojun Wan |
Defending Against Neural Fake News | Rowan Zellers · Ari Holtzman · Hannah Rashkin · Yonatan Bisk · Ali Farhadi · Franziska Roesner · Yejin Choi |
Interpreting and improving natural-language processing (in machines) with natural language-processing (in the brain) | Mariya Toneva · Leila Wehbe |
Invariance and identifiability issues for word embeddings | Rachel Carrington · Karthik Bharath · Simon Preston |
Kernelized Bayesian Softmax for Text Generation | Ning Miao · Hao Zhou · Chengqi Zhao · Wenxian Shi · Lei Li |
Levenshtein Transformer | Jiatao Gu · Changhan Wang · Junbo Zhao |
Neural Machine Translation with Soft Prototype | Yiren Wang · Yingce Xia · Fei Tian · Fei Gao · Tao Qin · Cheng Xiang Zhai · Tie-Yan Liu |
Paraphrase Generation with Latent Bag of Words | Yao Fu · Yansong Feng · John Cunningham |
Unified Language Model Pre-training for Natural Language Understanding and Generation | Li Dong · Nan Yang · Wenhui Wang · Furu Wei · Xiaodong Liu · Yu Wang · Jianfeng Gao · Ming Zhou · Hsiao-Wuen Hon |
XLNet: Generalized Autoregressive Pretraining for Language Understanding | Zhilin Yang · Zihang Dai · Yiming Yang · Jaime Carbonell · Russ Salakhutdinov · Quoc V Le |