I have a large numpy array (X) that fits in CPU memory but is too big for the GPU/TensorFlow. I would like to perform array operations on X using TensorFlow, so I split the array into batches (using numpy), feed each batch to TensorFlow, and then concatenate the output arrays to get the numpy array Y. I am new to TensorFlow, so I suspect there is a better/faster way to feed in the numpy array.
import numpy as np
import tensorflow as tf

# X is a large numpy array
# batches is an integer which defines the number of batches
X_list = np.array_split(X, batches)
X_tf = tf.placeholder(tf.float32)
Y_tf = some_function(X_tf)
Y_list = []  # collects the output of each batch
sess = tf.Session()  # create the session once, not once per batch
for batch in range(batches):
    Y_list.append(sess.run(Y_tf, feed_dict={X_tf: X_list[batch]}))
Y = np.hstack(Y_list)
1 Answer
You should look at the TensorFlow Dataset class (tf.data.Dataset), which can handle large numpy arrays. As long as the array fits in memory, it can be loaded and batched however you want.
A basic implementation would look like this (more detail here):
import tensorflow as tf

# load np array X
# make a placeholder for the dataset
X_placeholder = tf.placeholder(dtype=tf.float32, shape=X.shape)
# make the dataset from the placeholder
dataset = tf.data.Dataset.from_tensor_slices(X_placeholder)
dataset = dataset.batch(batch_size)
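To actually pull batches out of the dataset you also need an iterator. Here is a minimal end-to-end sketch, assuming TF 1.x-style graph mode (reached through tensorflow.compat.v1 so it also runs on a TF 2.x install). The toy X, the batch_size of 4, and the some_function that just doubles its input are all made up for illustration; swap in your own.

```python
import numpy as np
import tensorflow.compat.v1 as tf


def some_function(x):
    # Hypothetical stand-in for the real computation.
    return x * 2.0


# Toy data standing in for the large array (assumption for illustration).
X = np.arange(20, dtype=np.float32).reshape(10, 2)
batch_size = 4

graph = tf.Graph()
with graph.as_default():
    X_placeholder = tf.placeholder(dtype=tf.float32, shape=X.shape)
    dataset = tf.data.Dataset.from_tensor_slices(X_placeholder)
    dataset = dataset.batch(batch_size)
    # An initializable iterator lets us feed X once, at initialization.
    iterator = tf.data.make_initializable_iterator(dataset)
    next_batch = iterator.get_next()
    Y_tf = some_function(next_batch)

Y_list = []
with tf.Session(graph=graph) as sess:
    sess.run(iterator.initializer, feed_dict={X_placeholder: X})
    while True:
        try:
            Y_list.append(sess.run(Y_tf))
        except tf.errors.OutOfRangeError:
            # Raised once the dataset has been fully consumed.
            break

# Batches were sliced along axis 0, so stitch them back along axis 0.
Y = np.concatenate(Y_list, axis=0)
```

Note that the last batch may be smaller than batch_size (here the rows come out in batches of 4, 4 and 2), which is why the results are collected in a list and concatenated at the end rather than written into a preallocated array of fixed-size chunks.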