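I took some time to compile the latest natural language processing papers accepted at ICLR 2020, a top AI conference. They cover popular areas such as knowledge graphs, language modeling, text generation, and machine translation, along with several papers on optimizing BERT and Transformer models. If you are interested, bookmark the list and start reading!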
Reducing Transformer Depth on Demand with Structured Dropout
Link | https://openreview.net/pdf?id=SylO2yStDr
Authors | Angela Fan, Edouard Grave, Armand Joulin
Affiliations | Facebook AI Research
DeFINE: Deep Factorized Input Token Embeddings for Neural Sequence Modeling
Link | https://openreview.net/pdf?id=rJeXS04FPH
Authors | Sachin Mehta, Rik Koncel-Kedziorski, Mohammad Rastegari, Hannaneh Hajishirzi
Affiliations | University of Washington; Allen Institute for AI
Understanding Knowledge Distillation in Non-autoregressive Machine Translation
Link | https://openreview.net/pdf?id=BygFVAEKDH
Authors | Chunting Zhou, Jiatao Gu, Graham Neubig
Affiliations | Carnegie Mellon University; Facebook AI Research
Encoding word order in complex embeddings
Link | https://openreview.net/pdf?id=Hke-WTVtwr
Authors | Benyou Wang, Donghao Zhao, Christina Lioma, Qiuchi Li, Peng Zhang, Jakob Grue Simonsen
BERTScore: Evaluating Text Generation with BERT
Link | https://openreview.net/pdf?id=SkeHuCVFDr
Authors | Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, Yoav Artzi
Affiliations | Cornell University
Are Transformers universal approximators of sequence-to-sequence functions?
Link | https://openreview.net/pdf?id=ByxRM0Ntvr
Authors | Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank Reddi, Sanjiv Kumar
Affiliations | MIT; Google Research
Language GANs Falling Short
Link | https://openreview.net/pdf?id=BJgza6VtPB
Authors | Massimo Caccia, Lucas Caccia, William Fedus, Hugo Larochelle, Joelle Pineau, Laurent Charlin
Affiliations | MILA; McGill University
Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
Link | https://openreview.net/pdf?id=H1xPR3NtPB
Authors | Taeuk Kim, Jihun Choi, Daniel Edmiston, Sang-goo Lee
Affiliations | University of Chicago; Seoul National University
Dynamically Pruned Message Passing Networks for Large-scale Knowledge Graph Reasoning
Link | https://openreview.net/pdf?id=rkeuAhVKvB
Authors | Xiaoran Xu, Wei Feng, Yunsheng Jiang, Xiaohui Xie, Zhiqing Sun, Zhi-Hong Deng
Affiliations | Carnegie Mellon University; Peking University
Reinforcement Learning Based Graph-to-Sequence Model for Natural Question Generation
Link | https://openreview.net/pdf?id=HygnDhEtvr
Authors | Yu Chen, Lingfei Wu, Mohammed J. Zaki
Affiliations | Rensselaer Polytechnic Institute; IBM Research
Compressive Transformers for Long-Range Sequence Modelling
Link | https://openreview.net/pdf?id=SylKikSYDH
Authors | Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Chloe Hillier, Timothy P. Lillicrap
Incorporating BERT into Neural Machine Translation
Link | https://openreview.net/pdf?id=Hyl7ygStwB
Authors | Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tieyan Liu
Affiliations | University of Science and Technology of China; Microsoft Research; Sun Yat-sen University; Peking University
Robustness Verification for Transformers
Link | https://openreview.net/pdf?id=BJxwPJHFwS
Authors | Zhouxing Shi, Huan Zhang, Kai-Wei Chang, Minlie Huang, Cho-Jui Hsieh
Affiliations | Tsinghua University; University of California, Los Angeles
Emergence of functional and structural properties of the head direction system by optimization of recurrent neural networks
Link | https://openreview.net/pdf?id=HklSeREtPB
Authors | Christopher J. Cueva, Peter Y. Wang, Matthew Chin, Xue-Xin Wei
Affiliations | Columbia University
A Mutual Information Maximization Perspective of Language Representation Learning
Link | https://openreview.net/pdf?id=Syx79eBKwr
Authors | Lingpeng Kong, Cyprien de Masson d’Autume, Lei Yu, Wang Ling, Zihang Dai, Dani Yogatama
Affiliations | DeepMind; Carnegie Mellon University; Google Brain
word2ket: Space-efficient Word Embeddings inspired by Quantum Entanglement
Link | https://openreview.net/pdf?id=HkxARkrFwB
Authors | Aliakbar Panahi, Seyran Saeedi, Tom Arodz
Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models
Link | https://openreview.net/pdf?id=BkxRRkSKwr
Authors | Xisen Jin, Zhongyu Wei, Junyi Du, Xiangyang Xue, Xiang Ren
Affiliations | University of Southern California; Fudan University
Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue
Link | https://openreview.net/pdf?id=Hke0K1HKwr
Authors | Byeongchang Kim, Jaewoo Ahn, Gunhee Kim
Affiliations | Seoul National University
A Latent Morphology Model for Open-Vocabulary Neural Machine Translation
Link | https://openreview.net/pdf?id=BJxSI1SKDH
Authors | Duygu Ataman, Wilker Aziz, Alexandra Birch
Affiliations | University of Zurich; University of Amsterdam; University of Edinburgh
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Link | https://openreview.net/pdf?id=BygzbyHFvB
Authors | Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu
Affiliations | University of Maryland, College Park; Microsoft
A Probabilistic Formulation of Unsupervised Text Style Transfer
Link | https://openreview.net/pdf?id=HJlA0C4tPS
Authors | Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick
Affiliations | Carnegie Mellon University; University of California San Diego
Neural Symbolic Reader: Scalable Integration of Distributed and Symbolic Representations for Reading Comprehension
Link | https://openreview.net/pdf?id=ryxjnREFwH
Authors | Xinyun Chen, Chen Liang, Adams Wei Yu, Denny Zhou, Dawn Song, Quoc V. Le
Affiliations | UC Berkeley; Google Brain
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Link | https://openreview.net/pdf?id=H1eA7AEtvS
Authors | Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut
Affiliations | Google Research; Toyota Technological Institute at Chicago
Neural Machine Translation with Universal Visual Representation
Link | https://openreview.net/pdf?id=Byl8hhNYPS
Authors | Zhuosheng Zhang, Kehai Chen, Rui Wang, Masao Utiyama, Eiichiro Sumita, Zuchao Li, Hai Zhao
Affiliations | Shanghai Jiao Tong University
Data-dependent Gaussian Prior Objective for Language Generation
Link | https://openreview.net/pdf?id=S1efxTVYDr
Authors | Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao
Affiliations | Shanghai Jiao Tong University
Mogrifier LSTM
Link | https://openreview.net/pdf?id=SJe5P6EYvS
Authors | Gábor Melis, Tomáš Kočiský, Phil Blunsom
Affiliations | DeepMind; University of Oxford
Mirror-Generative Neural Machine Translation
Link | https://openreview.net/pdf?id=HkxQRTNYPH
Authors | Zaixiang Zheng, Hao Zhou, Shujian Huang, Lei Li, Xin-Yu Dai, Jiajun Chen
Affiliations | Nanjing University
Reformer: The Efficient Transformer
Link | https://openreview.net/pdf?id=rkgNKkHtvB
Authors | Nikita Kitaev, Lukasz Kaiser, Anselm Levskaya
Affiliations | UC Berkeley; Google Research
Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
Link | https://openreview.net/pdf?id=Syx4wnEtvH
Authors | Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Kurt Keutzer, Cho-Jui Hsieh
Affiliations | Google; UC Berkeley; UCLA
On the Weaknesses of Reinforcement Learning for Neural Machine Translation
Link | https://openreview.net/pdf?id=H1eCw3EKvH
Authors | Leshem Choshen, Lior Fox, Zohar Aizenbud, Omri Abend
Depth-Adaptive Transformer
Link | https://openreview.net/pdf?id=SJg7KhVKPH
Authors | Maha Elbayad, Jiatao Gu, Edouard Grave, Michael Auli
Affiliations | Facebook AI Research
LAMOL: LAnguage MOdeling for Lifelong Language Learning
Link | https://openreview.net/pdf?id=Skgxcn4YDS
Authors | Fan-Keng Sun, Cheng-Hao Ho, Hung-Yi Lee
Affiliations | MIT; National * University
TabFact: A Large-scale Dataset for Table-based Fact Verification
Link | https://openreview.net/pdf?id=rkeJRhNYDH
Authors | Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang
On Identifiability in Transformers
Link | https://openreview.net/pdf?id=BJg1f6EFDB
Authors | Gino Brunner, Yang Liu, Damian Pascual, Oliver Richter, Massimiliano Ciaramita, Roger Wattenhofer
Affiliations | ETH Zurich; Google Research
Few-shot Text Classification with Distributional Signatures
Link | https://openreview.net/pdf?id=H1emfT4twB
Authors | Yujia Bao, Menghua Wu, Shiyu Chang, Regina Barzilay
Affiliations | MIT; IBM Research
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
Link | https://openreview.net/pdf?id=BJlzm64tDH
Authors | Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov
Affiliations | University of California, Santa Barbara; Facebook AI
Self-Adversarial Learning with Comparative Discrimination for Text Generation
Link | https://openreview.net/pdf?id=B1l8L6EtDS
Authors | Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou
Affiliations | Beihang University; Microsoft Research Asia
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Link | https://openreview.net/pdf?id=SygXPaEYvH
Authors | Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
Affiliations | University of Science and Technology of China; Microsoft Research Asia
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Link | https://openreview.net/pdf?id=HJgJtT4tvB
Authors | Weihao Yu, Zihang Jiang, Yanfei Dong, Jiashi Feng
Affiliations | National University of Singapore
Low-Resource Knowledge-Grounded Dialogue Generation
Link | https://openreview.net/pdf?id=rJeIcTNtvS
Authors | Xueliang Zhao, Wei Wu, Chongyang Tao, Can Xu, Dongyan Zhao, Rui Yan
Affiliations | Peking University; Microsoft
Scalable Neural Methods for Reasoning With a Symbolic Knowledge Base
Link | https://openreview.net/pdf?id=BJlguT4YPr
Authors | William W. Cohen, Haitian Sun, R. Alex Hofer, Matthew Siegler
Affiliations | Google
Tree-Structured Attention with Hierarchical Accumulation
Link | https://openreview.net/pdf?id=HJxK5pEYvr
Authors | Xuan-Phi Nguyen, Shafiq Joty, Steven Hoi, Richard Socher
Affiliations | Salesforce Research; Nanyang Technological University
Neural Text Generation With Unlikelihood Training
Link | https://openreview.net/pdf?id=SJeYe0NtvH
Authors | Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston
Affiliations | New York University; Facebook AI Research; CIFAR
Pre-training Tasks for Embedding-based Large-scale Retrieval
Link | https://openreview.net/pdf?id=rkg-mA4FDr
Authors | Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar
Affiliations | Carnegie Mellon University; Google
Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations
Link | https://openreview.net/pdf?id=SJgVU0EKwS
Authors | Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang
Affiliations | Cornell University
Improving Neural Language Generation with Spectrum Control
Link | https://openreview.net/pdf?id=ByxY8CNtvr
Authors | Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu
Affiliations | University of California, Los Angeles; JD.com
Improved memory in recurrent neural networks with sequential non-normal dynamics
Link | https://openreview.net/pdf?id=ryx1wRNFvB
Authors | Emin Orhan, Xaq Pitkow
Affiliations | New York University; Rice University
Neural Module Networks for Reasoning over Text
Link | https://openreview.net/pdf?id=SygWvAVFPr
Authors | Nitish Gupta, Kevin Lin, Dan Roth, Sameer Singh, Matt Gardner
Affiliations | University of Pennsylvania; UC Berkeley; UC Irvine; Allen Institute for AI
Variational Hetero-Encoder Randomized GANs for Joint Image-Text Modeling
Link | https://openreview.net/pdf?id=H1x5wRVtvS
Authors | Hao Zhang, Bo Chen, Long Tian, Zhengjue Wang, Mingyuan Zhou
Affiliations | Xidian University; University of Texas at Austin
Augmenting Non-Collaborative Dialog Systems with Explicit Semantic and Strategic Dialog History
Link | https://openreview.net/pdf?id=ryxQuANKPB
Authors | Yiheng Zhou, Yulia Tsvetkov, Alan W Black, Zhou Yu
Affiliations | Carnegie Mellon University; UC Davis
NeurQuRI: Neural Question Requirement Inspector for Answerability Prediction in Machine Reading Comprehension
Link | https://openreview.net/pdf?id=ryxgsCVYPr
Authors | Seohyun Back, Sai Chetan Chinthakindi, Akhil Kedia, Haejun Lee, Jaegul Choo
Generalization through Memorization: Nearest Neighbor Language Models
Link | https://openreview.net/pdf?id=HklBjCEKvH
Authors | Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
Affiliations | Stanford University; Facebook AI Research
Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention
Link | https://openreview.net/pdf?id=r1eIiCNYwS
Authors | Chen Zhao, Chenyan Xiong, Corby Rosset, Xia Song, Paul Bennett, Saurabh Tiwary
Affiliations | University of Maryland, College Park; Microsoft
Revisiting Self-Training for Neural Sequence Generation
Link | https://openreview.net/pdf?id=SJgdnAVKDH
Authors | Junxian He, Jiatao Gu, Jiajun Shen, Marc’Aurelio Ranzato
Affiliations | Carnegie Mellon University; Facebook AI Research
Cross-lingual Alignment vs Joint Training: A Comparative Study and A Simple Unified Framework
Link | https://openreview.net/pdf?id=S1l-C0NtwS
Authors | Zirui Wang, Jiateng Xie, Ruochen Xu, Yiming Yang, Graham Neubig, Jaime G. Carbonell
Affiliations | Carnegie Mellon University
Adaptive Correlated Monte Carlo for Contextual Categorical Sequence Generation
Link | https://openreview.net/pdf?id=r1lOgyrKDS
Authors | Xinjie Fan, Yizhe Zhang, Zhendong Wang, Mingyuan Zhou
Affiliations | University of Texas at Austin; Microsoft Research; Columbia University
Multilingual Alignment of Contextual Word Representations
Link | https://openreview.net/pdf?id=r1xCMyBtPS
Authors | Steven Cao, Nikita Kitaev, Dan Klein
Affiliations | University of California, Berkeley
The Curious Case of Neural Text Degeneration
Link | https://openreview.net/pdf?id=rygGQyrFvH
Authors | Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi
Affiliations | University of Washington
Query2box: Reasoning over Knowledge Graphs in Vector Space Using Box Embeddings
Link | https://openreview.net/pdf?id=BJgr4kSFDS
Authors | Hongyu Ren, Weihua Hu, Jure Leskovec
Affiliations | Stanford University
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
Link | https://openreview.net/pdf?id=H1edEyBKDS
Authors | Sumanth Dathathri, Andrea Madotto, Janice Lan, Jane Hung, Eric Frank, Piero Molino, Jason Yosinski, Rosanne Liu
Affiliations | Caltech; HKUST; Uber AI
Towards Verified Robustness under Text Deletion Interventions
Link | https://openreview.net/pdf?id=SyxhVkrYvr
Authors | Johannes Welbl, Po-Sen Huang, Robert Stanforth, Sven Gowal, Krishnamurthy (Dj) Dvijotham, Martin Szummer, Pushmeet Kohli
Affiliations | DeepMind; University College London
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
Link | https://openreview.net/pdf?id=r1xMH1BtvB
Authors | Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning
Affiliations | Stanford University; Google Brain
Learning to Retrieve Reasoning Paths over Wikipedia Graph for Question Answering
Link | https://openreview.net/pdf?id=SJgVHkrYDH
Authors | Akari Asai, Kazuma Hashimoto, Hannaneh Hajishirzi, Richard Socher, Caiming Xiong
Affiliations | University of Washington; Salesforce Research; Allen Institute for Artificial Intelligence
Abductive Commonsense Reasoning
Link | https://openreview.net/pdf?id=Byg1v1HKDB
Authors | Chandra Bhagavatula, Ronan Le Bras, Chaitanya Malaviya, Keisuke Sakaguchi, Ari Holtzman, Hannah Rashkin, Doug Downey, Wen-tau Yih, Yejin Choi
Affiliations | Allen Institute for AI; Facebook AI
Tensor Decompositions for Temporal Knowledge Base Completion
Link | https://openreview.net/pdf?id=rke2P1BFwS
Authors | Timothée Lacroix, Guillaume Obozinski, Nicolas Usunier
Affiliations | Facebook AI Research
Probability Calibration for Knowledge Graph Embedding Models
Link | https://openreview.net/pdf?id=S1g8K1BFwS
Authors | Pedro Tabacof, Luca Costabello
Understanding and Improving Information Transfer in Multi-Task Learning
Link | https://openreview.net/pdf?id=SylzhkBtDB
Authors | Sen Wu, Hongyang Zhang, Christopher Ré
Affiliations | Stanford University; University of Pennsylvania
Cross-Lingual Ability of Multilingual BERT: An Empirical Study
Link | https://openreview.net/pdf?id=HJeT3yrtDr
Authors | Karthikeyan K, Zihan Wang, Stephen Mayhew, Dan Roth
Affiliations | University of Illinois Urbana-Champaign; University of Pennsylvania
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Link | https://openreview.net/pdf?id=BJgQ4lSFPH
Authors | Wei Wang, Bin Bi, Ming Yan, Chen Wu, Jiangnan Xia, Zuyi Bao, Liwei Peng, Luo Si
Affiliations | Alibaba
Permutation Equivariant Models for Compositional Generalization in Language
Link | https://openreview.net/pdf?id=SylVNerFvr
Authors | Jonathan Gordon, David Lopez-Paz, Marco Baroni, Diane Bouchacourt
Affiliations | University of Cambridge; Facebook AI Research
Phase Transitions for the Information Bottleneck in Representation Learning
Link | https://openreview.net/pdf?id=HJloElBYvB
Authors | Tailin Wu, Ian Fischer
Affiliations | MIT; Google Research
Variational Template Machine for Data-to-Text Generation
Link | https://openreview.net/pdf?id=HkejNgBtPB
Authors | Rong Ye, Wenxian Shi, Hao Zhou, Zhongyu Wei, Lei Li
Affiliations | Fudan University
Residual Energy-Based Models for Text Generation
Link | https://openreview.net/pdf?id=B1l4SgHKDH
Authors | Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc’Aurelio Ranzato
Affiliations | Harvard University; Facebook AI Research
Lite Transformer with Long-Short Range Attention
Link | https://openreview.net/pdf?id=ByeMPlHKPH
Authors | Zhanghao Wu, Zhijian Liu, Ji Lin, Yujun Lin, Song Han
Affiliations | MIT; Shanghai Jiao Tong University
Compositional Language Continual Learning
Link | https://openreview.net/pdf?id=rklnDgHtDS
Authors | Yuanpeng Li, Liang Zhao, Kenneth Church, Mohamed Elhoseiny
Differentiable learning of numerical rules in knowledge graphs
Link | https://openreview.net/pdf?id=rJleKgrKwS
Authors | Po-Wei Wang, Daria Stepanova, Csaba Domokos, J. Zico Kolter
Affiliations | Carnegie Mellon University; Bosch Center for AI
Understanding Generalization in Recurrent Neural Networks
Link | https://openreview.net/pdf?id=rkgg6xBYDH
Authors | Zhuozhuo Tu, Fengxiang He, Dacheng Tao
Affiliations | University of Sydney
Massively Multilingual Sparse Word Representations
Link | https://openreview.net/pdf?id=HyeYTgrFPB
Authors | Gábor Berend
Monotonic Multihead Attention
Link | https://openreview.net/pdf?id=Hyg96gBKPS
Authors | Xutai Ma, Juan Miguel Pino, James Cross, Liezl Puzon, Jiatao Gu
Affiliations | Facebook; Johns Hopkins University
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Link | https://openreview.net/pdf?id=HkgaETNtDB
Authors | Cheolhyoung Lee, Kyunghyun Cho, Wanmo Kang
Rethinking the Hyperparameters for Fine-tuning
Link | https://openreview.net/pdf?id=B1g8VkHFPH
Authors | Hao Li, Pratik Chaudhari, Hao Yang, Michael Lam, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto
Affiliations | Amazon Web Services; University of Pennsylvania; University of California, Los Angeles
Learning representations for binary-classification without backpropagation
Link | https://openreview.net/pdf?id=Bke61krFvS
Authors | Mathias Lechner
BatchEnsemble: an Alternative Approach to Efficient Ensemble and Lifelong Learning
Link | https://openreview.net/pdf?id=Sklf1yrYDr
Authors | Yeming Wen, Dustin Tran, Jimmy Ba
Affiliations | University of Toronto; Vector Institute; Google Brain
Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization
Link | https://openreview.net/pdf?id=SkgGjRVKDS
Authors | Junjie Yan, Ruosi Wan, Xiangyu Zhang, Wei Zhang, Yichen Wei, Jian Sun
Affiliations | Fudan University; Megvii Technology
Capsules with Inverted Dot-Product Attention Routing
Link | https://openreview.net/pdf?id=HJe6uANtwH
Authors | Yao-Hung Hubert Tsai, Nitish Srivastava, Hanlin Goh, Ruslan Salakhutdinov
Affiliations | Apple Inc.; Carnegie Mellon University
Four Things Everyone Should Know to Improve Batch Normalization
Link | https://openreview.net/pdf?id=HJx8HANFDH
Authors | Cecilia Summers, Michael J. Dinneen
Affiliations | University of Auckland
An Exponential Learning Rate Schedule for Deep Learning
Link | https://openreview.net/pdf?id=rJg8TeSFDH
Authors | Zhiyuan Li, Sanjeev Arora
Affiliations | Princeton University
For more on the latest advances in natural language processing, practical technical write-ups, and tutorials, follow the WeChat official account “语言智能技术笔记簿”.