Learning Recurrent Neural Networks with Hessian-Free Optimization
James Martens (JMARTENS@CS.TORONTO.EDU) and Ilya Sutskever (ILYA@CS.UTORONTO.CA), University of Toronto, Canada

Abstract: In this work we resolve the long-outstanding problem of how to effectively train recurrent neural networks (RNNs) on complex and difficult sequence modeling problems which may contain long-term data dependencies. Utilizing recent advances in the Hessian-free optimization approach (Martens, 2010), together with a novel damping scheme, we successfully train RNNs on two sets of challenging problems: first, a collection of pathological synthetic datasets which are known to be impossible for standard optimization approaches (due to their extremely long-term dependencies), and second, three natural and highly complex real-world sequence datasets, where we find that our method significantly outperforms the previous state-of-the-art method for training neural sequence models: the Long Short-Term Memory approach of Hochreiter and Schmidhuber (1997). Additionally, we offer a new interpretation of the generalized Gauss-Newton matrix of Schraudolph (2002), which is used within the HF approach of Martens.
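For readers unfamiliar with the curvature matrix the abstract refers to, the generalized Gauss-Newton (GGN) matrix of Schraudolph (2002) can be sketched as follows; the notation ($\theta$, $f$, $L$, $z$) is ours for illustration, not taken from the paper itself.

```latex
% For a model with parameters \theta, outputs z = f(x, \theta),
% and a loss L(z, y) that is convex in z, the GGN matrix is the
% Hessian with the second-derivative terms of f dropped:
G(\theta) = J_f^{\top} H_L \, J_f,
\qquad
J_f = \frac{\partial f}{\partial \theta},
\qquad
H_L = \frac{\partial^2 L}{\partial z^2}.
```

Because $H_L$ is positive semidefinite whenever $L$ is convex in $z$, the product $G$ is also positive semidefinite, which makes it a well-behaved substitute for the (generally indefinite) Hessian inside the quadratic models that Hessian-free optimization minimizes with conjugate gradients.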