[File attributes]:
File name: End-to-End Speech and Language Processing
File size: 446 KB
File format: PDF
Last updated: 2021-02-26 20:37:56
Speech and Language Processing
End-to-end speech processing system.
Recently, encoder-decoder neural networks
have shown impressive performance
on many sequence-related tasks.
The architecture commonly uses an attentional
mechanism which allows the model
to learn alignments between the source
and the target sequence. Most attentional
mechanisms used today are based on a
global attention property, which requires
computing a weighted sum over all of the
encoder states generated from the input
sequence. However, this is
computationally expensive and often produces
misalignments on longer input
sequences. Furthermore, it does not fit
the monotonic, left-to-right nature of
several tasks, such as automatic speech
recognition (ASR) and grapheme-to-phoneme
(G2P) conversion. In this paper, we propose a
novel attention mechanism that has local
and monotonic properties. Various ways
to control those properties are also explored.
Experimental results on ASR, G2P,
and machine translation between two languages
with similar sentence structures
demonstrate that the proposed encoder-decoder
model with local monotonic attention
can achieve significant performance
improvements and reduce computational
complexity compared with
a model that uses the standard global attention
architecture.
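
To make the contrast in the abstract concrete, the following is a minimal sketch of global attention next to one possible local, monotonic variant. It is an illustration under stated assumptions, not the mechanism proposed in the paper: the abstract does not specify how the window position or width is controlled, so the function names, the dot-product scoring, the fixed `half_width`, and the externally supplied `step` are all hypothetical choices made here for clarity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, states):
    """Sum of state vectors, each scaled by its weight."""
    dim = len(states[0])
    return [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]


def global_attention(query, enc_states):
    """Scores every encoder state, so each decoder step costs O(T)."""
    weights = softmax([dot(query, h) for h in enc_states])
    return weighted_sum(weights, enc_states), weights


def local_monotonic_attention(query, enc_states, prev_center, step, half_width):
    """Scores only a window around a center that never moves left.

    Assumption: `step` is passed in by the caller; a trained model would
    have to predict it. Each decoder step costs O(half_width), not O(T).
    """
    if step < 0:
        raise ValueError("step must be non-negative to stay monotonic")
    T = len(enc_states)
    center = min(prev_center + step, T - 1)
    lo = max(0, center - half_width)
    hi = min(T, center + half_width + 1)
    window = enc_states[lo:hi]
    local = softmax([dot(query, h) for h in window])
    weights = [0.0] * T          # zero outside the window
    weights[lo:hi] = local
    return weighted_sum(local, window), weights, center


# Toy example: five 2-dimensional encoder states and one decoder query.
enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
q = [1.0, 0.0]
ctx_g, w_g = global_attention(q, enc)
ctx_l, w_l, center = local_monotonic_attention(
    q, enc, prev_center=1, step=1, half_width=1
)
```

In this toy run the global weights cover all five encoder states, while the local variant places nonzero weight only on the three states around the new center, which is the source of the reduced per-step cost that the abstract refers to.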