Can someone explain how I can initialize the hidden state of an LSTM in TensorFlow? I am trying to build an LSTM recurrent auto-encoder, so after I have trained that model I want to transfer the learned hidden state of the unsupervised model to the hidden state of the supervised model. Is that even possible with the current API? This is the paper I am trying to recreate:
http://papers.nips.cc/paper/5949-semi-supervised-sequence-learning.pdf
1 Solution
#1
Yes - this is possible but truly cumbersome. Let's go through an example.
- Defining a model:
from keras.layers import LSTM, Input
from keras.models import Model

# batch_shape=(batch_size, timesteps, features)
input = Input(batch_shape=(32, 10, 1))
lstm_layer = LSTM(10, stateful=True)(input)
model = Model(input, lstm_layer)
model.compile(optimizer="adam", loss="mse")
It's important to build and compile the model first, as the initial states are reset during compilation. Moreover, you need to specify a batch_shape in which the batch_size is fixed, because in this scenario our network should be stateful (which is done by setting stateful=True).
- Now we can set the values of the initial states:
import numpy
import keras.backend as K

# State shapes are (batch_size, units) = (32, 10).
hidden_states = K.variable(value=numpy.random.normal(size=(32, 10)))
cell_states = K.variable(value=numpy.random.normal(size=(32, 10)))

model.layers[1].states[0] = hidden_states
model.layers[1].states[1] = cell_states
Note that you need to provide the states as Keras variables. states[0] holds the hidden states and states[1] holds the cell states.
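Since the question is specifically about transferring the learned state from a trained (auto-encoder) model into a supervised model, one way is to read the state values back with K.get_value and write them into the other model's layer with K.set_value. The sketch below is only an illustration under the assumption that both models share the same batch size and number of units; the transfer_states helper is a hypothetical name, not part of the Keras API, and the imports use the tensorflow.keras namespace, which exposes the same interface as the snippets above:

```python
import numpy as np
import tensorflow.keras.backend as K
from tensorflow.keras.layers import LSTM, Input
from tensorflow.keras.models import Model

def build_model():
    # Stateful LSTM: batch_size=32, timesteps=10, features=1, 10 units.
    inp = Input(batch_shape=(32, 10, 1))
    out = LSTM(10, stateful=True)(inp)
    model = Model(inp, out)
    model.compile(optimizer="adam", loss="mse")
    return model

def transfer_states(src_model, dst_model, layer_index=1):
    # Copy hidden (states[0]) and cell (states[1]) states between two
    # stateful LSTM layers that have identical batch size and unit count.
    src_states = src_model.layers[layer_index].states
    dst_states = dst_model.layers[layer_index].states
    for src, dst in zip(src_states, dst_states):
        K.set_value(dst, K.get_value(src))

pretrained = build_model()   # stands in for the trained auto-encoder
supervised = build_model()   # the model that should inherit the states

# Pretend the auto-encoder ended training with some non-zero state.
K.set_value(pretrained.layers[1].states[0],
            np.random.normal(size=(32, 10)).astype("float32"))

transfer_states(pretrained, supervised)
```

After the call, the supervised model's LSTM starts from the states the first model ended with, rather than from zeros.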
Hope that helps.