I just started using TensorFlow and followed the tutorial example on the MNIST dataset. It went well; I got around 90% accuracy.
But after I replaced next_batch with my own version, the result was much worse than it used to be, usually around 50%.
Instead of using the data TensorFlow downloaded and parsed, I downloaded the dataset from this website and used numpy to get what I want.
import numpy as np
import pandas as pd

df = pd.read_csv('mnist_train.csv', header=None)
X = df.drop(columns=0)   # pixel columns
Y = df[0]                # label column
# one-hot encode the labels
temp = np.zeros((Y.size, Y.max() + 1))
temp[np.arange(Y.size), Y] = 1
np.save('X', X)
np.save('Y', temp)
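(A purely illustrative sanity check after saving, assuming the standard 60000-row mnist_train.csv:

print(X.shape)               # expected (60000, 784)
print(temp.shape)            # expected (60000, 10), one-hot labels
print(temp.sum(axis=1)[:5])  # each row should sum to 1.0
)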
I do the same thing to the test data, then follow the tutorial; nothing is changed:
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])

X = np.load('X.npy')
Y = np.load('Y.npy')
X_test = np.load('X_test.npy')
Y_test = np.load('Y_test.npy')
BATCHES = 1000

W = tf.Variable(tf.truncated_normal([784, 10], stddev=0.1))
# W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

# note: softmax_cross_entropy_with_logits applies softmax internally,
# so it expects the raw logits tf.matmul(x, W) + b rather than the
# already-softmaxed y
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=y))
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
Right here is my own get_mini_batch: I shuffle the original data's indices, then each time take one batch out of it, which seems to be exactly what the example code does. The only difference is that I throw away some of the data at the tail.
pos = 0
idx = np.arange(X.shape[0])
np.random.shuffle(idx)

for _ in range(1000):
    # take the next BATCHES shuffled examples
    batch_idx = idx[pos:pos + BATCHES]
    batch_xs, batch_ys = X[batch_idx, :], Y[batch_idx]
    pos += BATCHES
    # once no full batch remains, discard the tail, reshuffle, and restart
    if pos + BATCHES > X.shape[0]:
        pos = 0
        idx = np.arange(X.shape[0])
        np.random.shuffle(idx)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

print(sess.run(accuracy, feed_dict={x: X_test, y_: Y_test}))
It confuses me why my version is so much worse than the tutorial one.
1 Answer

#1

Like lejilot said, we should normalize the data before we push it into the neural network. See this post.
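For reference, a minimal sketch of that normalization, assuming the raw CSV pixels are integers in [0, 255] (the tutorial's built-in loader already scales them to [0, 1], which is why the tutorial version trains fine):

# scale pixel values from [0, 255] down to [0, 1] before saving,
# mirroring what the tutorial's loader does
X = X.astype(np.float32) / 255.0
np.save('X', X)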