I want to get the count of an element in a tensor. For example, with t = [1, 2, 0, 0, 0, 0] (t is a tensor), in plain Python I can get the count of '0', which is 4, by calling t.count(0), but in TensorFlow I can't find any function to do this. How can I get the count of '0'? Could someone please help me?
4 Answers
#1
6
There isn't a built-in count method in TensorFlow right now, but you can do it with the existing ops in a small helper like so:
def tf_count(t, val):
    elements_equal_to_value = tf.equal(t, val)  # boolean mask: True where t == val
    as_ints = tf.cast(elements_equal_to_value, tf.int32)  # True -> 1, False -> 0
    count = tf.reduce_sum(as_ints)  # number of matching elements
    return count
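For example, applied to the tensor from the question (a minimal usage sketch, assuming the TF 1.x session API used elsewhere on this page):

import tensorflow as tf

t = tf.constant([1, 2, 0, 0, 0, 0])
zero_count = tf_count(t, 0)  # graph op that counts occurrences of 0
with tf.Session() as sess:
    print(sess.run(zero_count))  # 4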
#2
5
To count just a specific element you can create a boolean mask, convert it to int, and sum it up:
import tensorflow as tf

X = tf.constant([6, 3, 3, 3, 0, 1, 3, 6, 7])
res = tf.reduce_sum(tf.cast(tf.equal(X, 3), tf.int32))  # count occurrences of 3
with tf.Session() as sess:
    print(sess.run(res))  # 4
You can also count every element in the list/tensor using tf.unique_with_counts:
import tensorflow as tf

X = tf.constant([6, 3, 3, 3, 0, 1, 3, 6, 7])
y, idx, cnts = tf.unique_with_counts(X)  # unique values, their indices, their counts
with tf.Session() as sess:
    a, _, b = sess.run([y, idx, cnts])
    print(a)  # [6 3 0 1 7]
    print(b)  # [2 4 1 1 1]
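If you then need the count of one specific value from those outputs, one illustrative follow-up (continuing the snippet above) is to mask y against the target and sum the matching entries of cnts:

target = tf.constant(3)  # the value whose count we want
count_of_target = tf.reduce_sum(tf.boolean_mask(cnts, tf.equal(y, target)))
with tf.Session() as sess:
    print(sess.run(count_of_target))  # 4 (or 0 if target is absent)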
#3
1
An addition to Slater's answer above. If you want to get the count of all the elements, you can use one_hot and reduce_sum to avoid any looping within Python. For example, the code snippet below returns a vocab, ordered by occurrences within a word_tensor.
def build_vocab(word_tensor, vocab_size):
    unique, idx = tf.unique(word_tensor)
    # One one-hot row per occurrence, one column per unique word.
    counts_one_hot = tf.one_hot(
        idx,
        tf.shape(unique)[0],
        dtype=tf.int32
    )
    counts = tf.reduce_sum(counts_one_hot, 0)  # column sums = per-word counts
    _, indices = tf.nn.top_k(counts, k=vocab_size)  # most frequent first
    return tf.gather(unique, indices)
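A quick illustrative usage sketch (the word IDs here are made up):

word_tensor = tf.constant([5, 3, 3, 5, 7, 3])
vocab = build_vocab(word_tensor, vocab_size=2)
with tf.Session() as sess:
    print(sess.run(vocab))  # [3 5]: 3 occurs three times, 5 twice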
EDIT: After a little experimentation, I discovered it's pretty easy for the one_hot tensor to blow up beyond TF's maximum tensor size. It's likely more efficient (if a little less elegant) to replace the counts call with something like this:
counts = tf.foldl(
    lambda counts, item: counts + tf.one_hot(
        item, tf.shape(unique)[0], dtype=tf.int32),
    idx,
    initializer=tf.zeros_like(unique, dtype=tf.int32),
    back_prop=False
)
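To see the trade-off concretely, here is a small self-contained sketch of the same foldl pattern (the input values are made up); it keeps only a running count vector instead of materialising the full one-hot matrix, at the cost of a sequential loop in the graph:

import tensorflow as tf

word_tensor = tf.constant([5, 3, 3, 5, 7, 3])
unique, idx = tf.unique(word_tensor)  # unique = [5 3 7], idx = [0 1 1 0 2 1]
# Fold over the indices, adding one one-hot row at a time to a running total.
counts = tf.foldl(
    lambda counts, item: counts + tf.one_hot(
        item, tf.shape(unique)[0], dtype=tf.int32),
    idx,
    initializer=tf.zeros_like(unique, dtype=tf.int32),
    back_prop=False
)
with tf.Session() as sess:
    print(sess.run(counts))  # [2 3 1]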
#4
1
If you want to get the counts of all integers up to n in tensor t, you can use tf.unsorted_segment_sum like this:
count_all = tf.unsorted_segment_sum(tf.ones_like(t), t, n)
count_all will now be similar to a histogram.
For example, count_all[0] will tell you the number of times the number 0 appears in tensor t:
sess = tf.Session()
t = tf.placeholder(tf.int32)
count_all = tf.unsorted_segment_sum(tf.ones_like(t), t, 3)  # n = 3 possible values
sess.run(count_all[0], {t: [1, 2, 0, 0, 0, 0]})
# returns 4
sess.run(count_all, {t: [1, 2, 0, 0, 0, 0]})
# returns array([4, 1, 1], dtype=int32)
Unfortunately, this only works assuming there is no batch dimension. And, as Eli Bixby pointed out, the faster approach of assigning one_hot vectors can take up too much memory. My personal preference for getting around this is to use tf.map_fn, like this:
def count_all_fnc(e):
    return tf.unsorted_segment_sum(tf.ones_like(e), e, n)

count_all = tf.map_fn(count_all_fnc, t)
For example:
n = 3
t = tf.placeholder(tf.int32)

def count_all_fnc(e):
    return tf.unsorted_segment_sum(tf.ones_like(e), e, n)

count_all = tf.map_fn(count_all_fnc, t)
sess.run(count_all, {t: [[1, 0, 0, 2], [1, 2, 0, 0], [0, 0, 0, 0], [1, 1, 1, 2]]})
This returns:

array([[2, 1, 1],
       [2, 1, 1],
       [4, 0, 0],
       [0, 3, 1]], dtype=int32)
If you have the memory available, it is much faster (about 10x) to look up a one-hot vector for each index and sum them together, because this is highly parallelisable. But the space requirement grows as n * |t|, so this quickly becomes infeasible:
one_hot_t = tf.one_hot(t, n)
count_all = tf.reduce_sum(one_hot_t, axis=1)
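As a quick sketch, here is this run on the first two rows of the batched example above (same n = 3; note that tf.one_hot defaults to float, so dtype=tf.int32 is passed here to get integer counts):

n = 3
t = tf.placeholder(tf.int32)
one_hot_t = tf.one_hot(t, n, dtype=tf.int32)  # shape: (batch, length, n)
count_all = tf.reduce_sum(one_hot_t, axis=1)  # sum over the length axis
with tf.Session() as sess:
    print(sess.run(count_all, {t: [[1, 0, 0, 2], [1, 2, 0, 0]]}))
    # [[2 1 1]
    #  [2 1 1]]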