Wrong number of dimensions: expected 0, got 1 with shape (1,)

Time: 2022-11-15 22:00:05

I am doing word-level language modelling with a vanilla RNN. I am able to train the model, but for some reason I cannot get any samples/predictions from it. Here is the relevant part of the code:

train_set_x, train_set_y, voc = load_data(dataset, vocab, vocab_enc)  # just load all data as shared variables
index = T.lscalar('index')
x = T.fmatrix('x')
y = T.ivector('y')
n_x = len(vocab)
n_h = 100
n_y = len(vocab)

rnn = Rnn(input=x, input_dim=n_x, hidden_dim=n_h, output_dim=n_y)

cost = rnn.negative_log_likelihood(y)

updates = get_optimizer(optimizer, cost, rnn.params, learning_rate)

train_model = theano.function(
    inputs=[index],
    outputs=cost,
    givens={
        x: train_set_x[index],
        y: train_set_y[index]
    },
    updates=updates
)

predict_model = theano.function(
    inputs=[index],
    outputs=rnn.y,
    givens={
        x: voc[index]
    }
)

sampling_freq = 2
sample_length = 10
n_train_examples = train_set_x.get_value(borrow=True).shape[0]
train_cost = 0.
for i in xrange(n_train_examples):
    train_cost += train_model(i)
    train_cost /= n_train_examples

    if i % sampling_freq == 0:
        # sample from the model
        seed = randint(0, len(vocab)-1)
        idxes = []
        for j in xrange(sample_length):
            p = predict_model(seed)
            seed = p
            idxes.append(p)
            # sample = ''.join(ix_to_words[ix] for ix in idxes)
            # print(sample)

I get the error: "TypeError: ('Bad input argument to theano function with name "train.py:94" at index 0(0-based)', 'Wrong number of dimensions: expected 0, got 1 with shape (1,).')"

Now this corresponds to the following line (in the predict_model):

givens={x: voc[index]}

Even after spending hours on this, I cannot understand how there can be a dimension mismatch when:

train_set_x has shape: (42, 4, 109)
voc has shape: (109, 1, 109)

When I do train_set_x[index], I get (4, 109), which the 'x' tensor of type fmatrix can hold (this is what happens in train_model). But when I do voc[index], I get (1, 109), which is also a matrix, yet 'x' cannot hold it. Why?

Any help will be much appreciated.

Thanks!

1 Answer

#1


2  

The error message refers to the definition of the whole Theano function named predict_model, not the specific line where the substitution with givens occurs.

The issue seems to be that predict_model gets called with an argument that is a vector of length 1 instead of a scalar. The initial seed sampled from randint is actually a scalar, but I would guess that the output p of predict_model(seed) is a vector and not a scalar.
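
One way to check this guess is to compare the number of dimensions of the first seed with that of the value fed back in. The sketch below uses only NumPy, not Theano; fake_predict is a hypothetical stand-in for predict_model and is assumed (not confirmed by the question) to return the predicted index wrapped in a length-1 vector.

```python
import numpy as np

def fake_predict(index):
    # Hypothetical stand-in for predict_model: assumed to return the
    # predicted word index wrapped in a length-1 integer vector.
    return np.array([(index + 1) % 109], dtype="int64")

seed = 5                      # like the value from randint: a plain Python int
p = fake_predict(seed)

seed_ndim = np.ndim(seed)     # dimensions of the first argument
p_ndim = np.ndim(p)           # dimensions of the value that becomes the next seed
print(seed_ndim, p_ndim, p.shape)
```

If the real p also reports one dimension and shape (1,), that would match the "expected 0, got 1 with shape (1,)" part of the error message.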

In that case, you could either return rnn.y[0] in predict_model, or replace seed = p with seed = p[0] in the loop over j.
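
A minimal sketch of the second option, again with a hypothetical NumPy stand-in instead of the real Theano function. fake_predict is an assumption: it rejects non-scalar input the way an lscalar input would, and it returns a length-1 vector. The int() around p[0] is an extra step added here so that idxes holds plain Python integers.

```python
from random import randint
import numpy as np

vocab_size = 109      # matches the shape of voc in the question
sample_length = 10

def fake_predict(index):
    # Hypothetical stand-in for predict_model: like a function with an
    # lscalar input, it rejects anything that is not 0-dimensional.
    if np.ndim(index) != 0:
        raise TypeError("Wrong number of dimensions: expected 0, got %d"
                        % np.ndim(index))
    return np.array([(index + 1) % vocab_size], dtype="int64")

seed = randint(0, vocab_size - 1)
idxes = []
for j in range(sample_length):
    p = fake_predict(seed)
    seed = int(p[0])      # unwrap the length-1 vector into a scalar
    idxes.append(seed)
```

With seed = p instead of seed = int(p[0]), the second iteration of this loop raises the TypeError from the stand-in, which mirrors the behaviour described in the question.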
