File name: Language Modeling with Gated Convolutional Networks
File size: 572KB
File format: PDF
Last updated: 2021-09-25 04:53:59
Deep learning, theory
The predominant approach to language modeling to date is based on recurrent neural networks. In this paper we present a convolutional approach to language modeling. We introduce a novel gating mechanism that eases gradient propagation and which performs better than the LSTM-style gating of Oord et al. (2016b) despite being simpler. We achieve a new state of the art on WikiText-103 as well as a new best single-GPU result on the Google Billion Word benchmark. In settings where latency is important, our model achieves an order of magnitude speed-up compared to a recurrent baseline since computation can be parallelized over time. To our knowledge, this is the first time a non-recurrent approach outperforms strong recurrent models on these tasks.
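The abstract names the gating mechanism but does not spell it out. As a rough illustration only, the sketch below assumes the gated linear unit form given in the body of the paper, h(X) = (X*W + b) ⊗ σ(X*V + c), reduced to a single channel in pure Python. The function names (`causal_conv`, `gated_conv`) and the scalar kernels are hypothetical simplifications, not the authors' implementation. Each output position depends only on current and earlier inputs, so all positions can be computed independently, which is the sense in which computation can be parallelized over time.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def causal_conv(x, kernel, bias):
    # 1-D convolution with left zero-padding: output t sees only x[t-k+1 .. t].
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k)) + bias
        for t in range(len(x))
    ]


def gated_conv(x, w, b, v, c):
    # Assumed gated linear unit: (x*w + b) multiplied elementwise by sigmoid(x*v + c).
    # w and v are hypothetical single-channel kernels of equal width.
    if len(w) != len(v):
        raise ValueError("w and v must have the same kernel width")
    linear = causal_conv(x, w, b)
    gate = causal_conv(x, v, c)
    return [a * sigmoid(g) for a, g in zip(linear, gate)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0]
    print(gated_conv(x, w=[0.5, 1.0], b=0.0, v=[1.0, -1.0], c=0.0))
```

The linear path carries no squashing nonlinearity, so where the gate is open the gradient passes through it scaled only by the gate value. That is one way to read the abstract's claim that the gating eases gradient propagation.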