I am a beginner with Theano, and I am working from someone else's example code that presumably worked at some point. I have modified it, but I'm fairly sure my modifications have nothing to do with what is going wrong at the moment.
Anyhow, I am trying to debug a Theano scan, and I think what I am observing is a fundamental error in the scan call.
U, V, W = self.U, self.V, self.W
x = T.ivector('x')
y = T.ivector('y')

def forward_prop_step(x_t, s_t_prev, U, V, W):
    s_t = T.tanh(U.dot(x_t) + V.dot(s_t_prev))
    o_t = T.tanh(W.dot(s_t))
    return [o_t, s_t]

[o, s], updates = theano.scan(
    forward_prop_step,
    sequences=x,
    outputs_info=[None, dict(initial=T.zeros(self.hidden_dim))],
    non_sequences=[U, V, W],
    truncate_gradient=self.bptt_truncate,
    strict=True)
U is an m x n matrix, V is an n x n matrix, W is an n x o matrix, and self.bptt_truncate is a scalar (4), but I don't think the internals of my function are what is failing at the moment.
The error I get is:
ValueError: When compiling the inner function of scan the following error has been encountered: The initial state (outputs_info in scan nomenclature) of variable IncSubtensor{Set;:int64:}.0 (argument number 1) has 2 dimension(s), while the result of the inner function (fn) has 2 dimension(s) (should be one less than the initial state).
I have tried altering the dimensions of outputs_info and the return dimensions of forward_prop_step, but nothing has worked so far.
I am currently looking into the documentation, but from the documentation it seems that what I am doing is correct (below is the example from the documentation):
def oneStep(u_tm4, u_t, x_tm3, x_tm1, y_tm1, W, W_in_1, W_in_2, W_feedback, W_out):
    x_t = T.tanh(theano.dot(x_tm1, W) + \
                 theano.dot(u_t, W_in_1) + \
                 theano.dot(u_tm4, W_in_2) + \
                 theano.dot(y_tm1, W_feedback))
    y_t = theano.dot(x_tm3, W_out)
    return [x_t, y_t]
And here is the scan call from the documentation:
W = T.matrix()
W_in_1 = T.matrix()
W_in_2 = T.matrix()
W_feedback = T.matrix()
W_out = T.matrix()
u = T.matrix() # it is a sequence of vectors
x0 = T.matrix() # initial state of x has to be a matrix, since
                # it has to cover x[-3]
y0 = T.vector() # y0 is just a vector since scan has only to provide
                # y[-1]
([x_vals, y_vals], updates) = theano.scan(fn=oneStep,
                                          sequences=dict(input=u, taps=[-4,-0]),
                                          outputs_info=[dict(initial=x0, taps=[-3,-1]), y0],
                                          non_sequences=[W, W_in_1, W_in_2, W_feedback, W_out],
                                          strict=True)
# for second input y, scan adds -1 in output_taps by default
The return of the function is [x_t, y_t] and the outputs_info is [dict(initial=x0, taps=[-3,-1]), y0].
In my implementation, the return of the function is [o_t, s_t] and the outputs_info is [None, dict(initial=T.zeros(self.hidden_dim))], which makes sense, since I have no reason to pass my output back into the function.
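To make the role of outputs_info concrete, here is a minimal plain-Python sketch of the recurrence pattern that outputs_info=[None, initial] describes: the first return value is only collected, while the second is fed back into the next step. The names scan_like and toy_step and the toy arithmetic are hypothetical and are not part of Theano; this is an illustration of the idea, not a description of how scan is implemented.

```python
import math

def scan_like(step, sequence, initial_state):
    """Plain-Python stand-in for a scan with outputs_info=[None, initial_state]."""
    outputs, states = [], []
    state = initial_state
    for x_t in sequence:
        # Only the state (second return value) is carried to the next step.
        o_t, state = step(x_t, state)
        outputs.append(o_t)
        states.append(state)
    return outputs, states

def toy_step(x_t, s_prev):
    # Hypothetical recurrence, chosen only to keep the example small.
    s_t = [math.tanh(x_t + h) for h in s_prev]  # same length as s_prev
    o_t = sum(s_t)
    return o_t, s_t

# A state vector of length 2, iterated over a sequence of 3 inputs.
outputs, states = scan_like(toy_step, [1.0, 0.0, -1.0], [0.0, 0.0])
```

Each step returns a state with the same shape as the initial state (a list of length 2), and the stacked result has one extra leading time dimension (3 lists of length 2). That appears to be the relationship the error message is checking when it says the result of the inner function should have one dimension less than the state.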
2 Answers
#1
1
I had exactly the same problem when applying an RNN to an NLP task. This error occurs because of the type of the x_t argument of the forward_prop_step function: it is a scalar, because scan iterates over the ivector x.
The solution is to use a vector instead. Here, for example, x_tv is a vector that is all zeros except for a 1 at index x_t.
def forward_prop_step(x_t, s_t_prev, U, V, W):
    x_tv = T.eye(1, m=input_size, k=x_t)[0]
    s_t = T.tanh(U.dot(x_tv) + V.dot(s_t_prev))
    o_t = T.tanh(W.dot(s_t))
    return [o_t, s_t]
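The difference between the scalar and the one-hot version can be sketched in plain Python, without Theano. Multiplying a matrix by a scalar leaves it two-dimensional (which, as far as I understand Theano's dot semantics, is what U.dot(x_t) does with a scalar x_t), whereas multiplying it by a one-hot vector yields a one-dimensional result, namely column x_t of the matrix. The helpers one_hot, mat_vec and mat_scalar and the small matrix U are hypothetical, made up for this illustration only.

```python
def one_hot(index, size):
    """Return a list of zeros with a single 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def mat_vec(matrix, vector):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def mat_scalar(matrix, scalar):
    """Matrix-scalar product: every entry is scaled, the shape is unchanged."""
    return [[a * scalar for a in row] for row in matrix]

# Hypothetical weights: 2 hidden units, 3 possible input indices.
U = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x_t = 2  # an integer index, like one element of the ivector x

scaled = mat_scalar(U, x_t)               # still a 2 x 3 matrix
projected = mat_vec(U, one_hot(x_t, 3))   # a vector of length 2

print(scaled)     # [[2.0, 4.0, 6.0], [8.0, 10.0, 12.0]]
print(projected)  # [3.0, 6.0]
```

Since the one-hot product just picks out a column, indexing that column directly would give the same values; the one-hot form is shown here because it mirrors the T.eye construction in the answer.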
#2
0
Try the following? Note the difference between (self.hidden_dim, ) and (self.hidden_dim):
outputs_info=[None, dict(initial=T.zeros((self.hidden_dim, )))],
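For reference, the distinction being pointed at is a plain Python one: parentheses alone do not make a tuple, the trailing comma does. The sketch below uses a hypothetical hidden_dim of 4 and shows only the Python side. Whether T.zeros actually treats a bare integer differently from a one-element tuple is not confirmed anywhere in this thread.

```python
hidden_dim = 4  # hypothetical value, standing in for self.hidden_dim

without_comma = (hidden_dim)   # just the integer 4 in parentheses
with_comma = (hidden_dim,)     # a one-element tuple, i.e. a 1-D shape

print(type(without_comma).__name__)  # int
print(type(with_comma).__name__)     # tuple
```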