Language Modeling with Gated Convolutional Networks

Time: 2021-09-25 04:53:59
[File properties]:

File name: Language Modeling with Gated Convolutional Networks

File size: 572KB

File format: PDF

Last updated: 2021-09-25 04:53:59

Deep learning, theory

The predominant approach to language modeling to date is based on recurrent neural networks. In this paper we present a convolutional approach to language modeling. We introduce a novel gating mechanism that eases gradient propagation and performs better than the LSTM-style gating of Oord et al. (2016b) despite being simpler. We achieve a new state of the art on WikiText-103 as well as a new best single-GPU result on the Google Billion Word benchmark. In settings where latency is important, our model achieves an order of magnitude speed-up compared to a recurrent baseline, since computation can be parallelized over time. To our knowledge, this is the first time a non-recurrent approach outperforms strong recurrent models on these tasks.
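The abstract does not spell out the gating mechanism, so the following is only a minimal sketch of one gated convolutional layer. It assumes the gate is the gated linear unit, h = (X*W + b) ⊗ σ(X*V + c), applied over a causal (left-padded) 1D convolution so that each position sees only current and earlier inputs. The function names, argument layout, and pure-Python loops are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function used for the gate."""
    return 1.0 / (1.0 + math.exp(-z))


def causal_conv(x, kernel, bias):
    """Causal 1D convolution over a sequence (hypothetical helper).

    x:      list of T input vectors, each of length d_in
    kernel: list of k matrices, each d_in x d_out; kernel[-1] is applied
            to the current time step, kernel[0] to the oldest one
    bias:   list of length d_out
    Returns a list of T output vectors of length d_out.
    """
    k = len(kernel)
    d_in = len(x[0])
    d_out = len(bias)
    # Left-pad with k-1 zero vectors so output t never reads inputs after t.
    padded = [[0.0] * d_in for _ in range(k - 1)] + x
    out = []
    for t in range(len(x)):
        y = list(bias)
        for j in range(k):
            row = padded[t + j]
            for i in range(d_in):
                for o in range(d_out):
                    y[o] += row[i] * kernel[j][i][o]
        out.append(y)
    return out


def gated_conv_layer(x, W, b, V, c):
    """Assumed gate form: (X*W + b) multiplied elementwise by sigmoid(X*V + c)."""
    linear = causal_conv(x, W, b)   # ungated linear path
    gate = causal_conv(x, V, c)     # gate pre-activations
    return [
        [a * sigmoid(g) for a, g in zip(a_t, g_t)]
        for a_t, g_t in zip(linear, gate)
    ]


if __name__ == "__main__":
    x = [[1.0], [2.0], [3.0]]        # T=3, d_in=1
    W = [[[1.0]], [[1.0]]]           # k=2, d_in=1, d_out=1
    V = [[[0.0]], [[0.0]]]           # zero gate weights: gate is sigmoid(0)
    print(gated_conv_layer(x, W, [0.0], V, [0.0]))
```

Because every output position depends only on a fixed window of earlier inputs, all time steps in the loop over `t` are independent of one another. That independence is what allows the computation to be parallelized over time, in contrast to a recurrent model where step t must wait for step t-1.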

