I have some issues making this custom loss function work (it checks whether the y_pred data is ordered coherently with the real ordering indices provided by y_true):
import tensorflow as tf

def custom_objective(y_true, y_pred):
    y_true = tf.cast(y_true, tf.float32)
    # Indices that would sort y_pred in ascending order (top_k of the negated values)
    ordered_output = tf.cast(tf.nn.top_k(-y_pred, k=5)[1], tf.float32)
    # RMSE between the predicted ordering indices and the true ordering indices
    return tf.sqrt(tf.reduce_mean(tf.square(ordered_output - y_true), axis=-1))
I can properly run it with sample data:
with tf.Session() as sess:
    print(custom_objective(tf.constant([0, 1, 2, 3, 4, 5]),
                           tf.constant([0.0, 0.9, 0.2, 0.3, 0.5, 0.8])).eval())  # 1.82574
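For clarity, the intermediate ordered_output here is just the indices that would sort y_pred ascending; a quick check in the same TF1 session style shows what top_k actually returns for this input:

import tensorflow as tf

with tf.Session() as sess:
    # top_k on the negated predictions returns the indices of the
    # smallest y_pred values first, i.e. an ascending ordering
    print(tf.nn.top_k(-tf.constant([0.0, 0.9, 0.2, 0.3, 0.5, 0.8]), k=5)[1].eval())
    # -> [0 2 3 4 5]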
But somehow it doesn't work if I use it in model.compile, as it raises:
/Users/luca/.virtualenvs/python3/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py in make_tensor_proto(values, dtype, shape, verify_shape)
358 else:
359 if values is None:
--> 360 raise ValueError("None values not supported.")
361 # if dtype is provided, forces numpy array to be the type
362 # provided if possible.
ValueError: None values not supported.
Note that there are no None values in my training/test set. If I change

    ordered_output = tf.cast(tf.nn.top_k(-y_pred, k=5)[1], tf.float32)

to

    ordered_output = -y_pred

the model compiles fine and starts training properly (but it's clearly not the loss function I want).
I have the subtle feeling that there might be something wrong with using top_k in a loss function, as I don't see how it could be differentiable, but I don't have better ideas for evaluating differences in predicted ordering. Hints/ideas/papers/references? :)
1 Answer
#1
This might be voted down as I won't really fix your code but here goes nothing :
Indeed, I don't believe you can use top_k as an objective function, just like you can't use accuracy as an objective function.
The reason is mathematical. Even though Keras, TensorFlow, Theano and co. are awesome tools for AI that let everybody fiddle with neural nets, the latter remain very complex mathematical tools. That math is well hidden under the hood, but you should be aware of it when trying to go further than the prefabricated tools.
What happens when you train a network is that you compute how wrong the network is on an example and backpropagate that error to learn from it. The algorithms behind that backpropagation are optimizers, more precisely gradient-based optimizers. Computing a gradient requires differentiating the function we are optimizing: the loss/objective function. That means the objective must be differentiable. Accuracy isn't a differentiable function: it takes a real number between 0 and 1 as input and outputs a step-like function, 0 if x < 0.5 and 1 if x > 0.5. That function isn't differentiable because we can't get its gradient at 0.5. The top_k function is the same kind of function as accuracy. So, in my opinion, you indeed cannot use it in an objective, because under the hood the smart TensorFlow has to compute the gradients of your function.
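To make that concrete, here is a minimal sketch (TF1 style, like the code above) showing that no gradient flows from the top_k indices back to y_pred; that None gradient is most likely what Keras ends up handing to the optimizer, producing the "ValueError: None values not supported" above:

import tensorflow as tf

y_pred = tf.constant([0.0, 0.9, 0.2, 0.3, 0.5, 0.8])
ordered_output = tf.cast(tf.nn.top_k(-y_pred, k=5)[1], tf.float32)

# top_k's indices output is integer-valued; TensorFlow defines no
# gradient through it, so the gradient w.r.t. y_pred is simply None.
print(tf.gradients(ordered_output, y_pred))  # [None]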
I hope this helps :)