Adding a matrix and a vector in TensorFlow: the broadcasting mechanism

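import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

# 300 evenly spaced samples in [-1, 1], reshaped into a column: shape (300, 1)
x_data = np.linspace(-1, 1, 300, dtype=np.float32)[:, np.newaxis]
print(x_data.shape)

# Input placeholder: any number of rows, one feature per row
xs = tf.placeholder(tf.float32, [None, 1])

# Weight matrix of shape [1, 10]
Weight = tf.Variable(tf.random_normal([1, 10]))
Weight_shape = tf.shape(Weight)

# Bias row vector of shape [1, 10], every entry initialized to 0.1
biases = tf.Variable(tf.zeros([1, 10]) + 0.1)
biases_shape = tf.shape(biases)

# matmul yields [batch, 10]; adding the [1, 10] biases relies on broadcasting
Wx_plus_b = tf.matmul(xs, Weight) + biases
Wx_plus_b_shape = tf.shape(Wx_plus_b)

init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    print(sess.run(Weight_shape))   # shape of Weight: 1 x 10
    print(sess.run(biases_shape))   # shape of biases: 1 x 10
    print(sess.run(Wx_plus_b_shape, feed_dict={xs: x_data}))  # 300 x 10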

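Here a 300*10 matrix (the result of tf.matmul(xs, Weight)) is added to a 1*10 vector (biases), which seems counterintuitive because the two shapes differ.
It works because TensorFlow allows a matrix and a vector to be added: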

C = A + b

that is,

C_{i j} = A_{i j} + b_{j}

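In other words, the vector b is added to every row of the matrix A.
This requires, at a minimum, that the number of columns of the matrix match the number of elements in the vector.
This implicit copying of the vector b to many positions is called broadcasting.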
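The row-wise rule above can be checked on a small example. The sketch below uses NumPy rather than TensorFlow, on the assumption that TensorFlow's `+` follows the same NumPy-style broadcasting rules; the arrays `A`, `b`, and `bad` are hypothetical values chosen for illustration, not taken from the program above.

```python
import numpy as np

# Hypothetical small inputs: a 4x3 matrix and a 1x3 row vector.
A = np.arange(12, dtype=np.float32).reshape(4, 3)
b = np.array([[10.0, 20.0, 30.0]], dtype=np.float32)

# Broadcasting: b is implicitly reused for every row of A.
C_broadcast = A + b

# The same sum with the copy made explicit: stack b once per row of A.
b_tiled = np.tile(b, (A.shape[0], 1))  # shape (4, 3)
C_explicit = A + b_tiled

print(C_broadcast.shape)
print(np.array_equal(C_broadcast, C_explicit))

# A vector whose length differs from the number of columns cannot be broadcast.
bad = np.ones((1, 4), dtype=np.float32)
try:
    A + bad
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

Broadcasting gives the same values as the explicit `np.tile` version but never materializes the repeated copies of b, which is why the bias in the program above can stay a single 1*10 row regardless of the batch size.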
