Implementing logistic regression (LR) in TensorFlow: the results look weird when using a bias, but not without it.

Time: 2021-11-21 13:51:00

I tried LR on a sparse dataset using the tensorflow package (0.7.0).
The following is part of my procedure:


weight_values = generateWeight([trainset.feature_num, 1], name='weight')
bias = init_bias([1, 1], name='bias')
# placeholders for one batch of sparse features and labels
sp_shape = tf.placeholder(tf.int64)
sp_indices = tf.placeholder(tf.int64)
sp_ids_value = tf.placeholder(tf.int64)
sp_features_value = tf.placeholder(tf.float32)
Y = tf.placeholder('float', name='Y')

sp_ids = tf.SparseTensor(sp_indices, sp_ids_value, sp_shape)
sp_values = tf.SparseTensor(sp_indices, sp_features_value, sp_shape)
#Z = tf.nn.embedding_lookup_sparse(weight_values, sp_ids, sp_values)  # without bias
Z_b = tf.nn.embedding_lookup_sparse(weight_values, sp_ids, sp_values) + bias  # with bias
predict_op = tf.sigmoid(Z_b, name='result')
#cost = tf.nn.sigmoid_cross_entropy_with_logits(Z, Y)  # without bias
cost = tf.nn.sigmoid_cross_entropy_with_logits(Z_b, Y)
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

...

But I find the results are weird when using the bias: with the bias, abs(cost) becomes larger and larger, and even abs(bias) keeps growing.


In the early iterations the results look like this:
1 labels[-1], j-sample 1662, i-bias 0.100000 cost 0.761811, z_b 0.045602
2 labels[1], j-sample 823, i-bias 0.084886 cost 0.653277, z_b 0.081396
3 labels[-1], j-sample 20802, i-bias 0.089683 cost 0.826316, z_b 0.088132
4 labels[-1], j-sample 25965, i-bias 0.074462 cost 0.806052, z_b 0.074804
5 labels[1], j-sample 10322, i-bias 0.059276 cost 0.664358, z_b 0.058433
6 labels[-1], j-sample 23946, i-bias 0.064129 cost 0.795182, z_b 0.067642
......


But after more iterations the results look like this:
270504 labels[-1], j-sample 446, i-bias -248.318787 cost -250.818787, z_b -250.818787
270505 labels[1], j-sample 10314, i-bias -248.328781 cost 248.306259, z_b -248.306259
270506 labels[1], j-sample 3820, i-bias -248.318787 cost 247.367340, z_b -247.367340
270507 labels[1], j-sample 2922, i-bias -248.308792 cost 248.276184, z_b -248.276184
270508 labels[-1], j-sample 20797, i-bias -248.298798 cost -255.061432, z_b -255.061432
270509 labels[-1], j-sample 19755, i-bias -248.308792 cost -251.686646, z_b -251.686646
270510 labels[1], j-sample 9528, i-bias -248.318787 cost 248.405624, z_b 248.405624


However, if I do not use the bias, the results look good. What might the problem be? I would really like to know the reason.


Can anyone help me? Thank you!


1 solution

#1


0  

I think the use of "embedding_lookup_sparse" is wrong. Look at the source code of embedding_lookup_sparse.


The default value of combiner is "mean", but if you want to compute w * x, where x is sparse data, you should set the combiner to "sum", not "mean". So I think this is your main problem.

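A minimal sketch of the suggested change, reusing the names from the question's snippet (weight_values, sp_ids, sp_values, bias, Y and learning_rate are assumed to be defined as above); the only difference is passing combiner="sum" so the lookup returns the weighted sum of the looked-up weights, i.e. w * x, instead of their mean:

# Same model as in the question, but with combiner="sum":
# embedding_lookup_sparse now returns sum_i(weight_values[id_i] * x_i) = w * x
# per row instead of dividing that sum by the number of non-zero features.
Z_b = tf.nn.embedding_lookup_sparse(weight_values, sp_ids, sp_values,
                                    combiner="sum") + bias
predict_op = tf.sigmoid(Z_b, name='result')
cost = tf.nn.sigmoid_cross_entropy_with_logits(Z_b, Y)
train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

With the default "mean", each row's logit is the average of w[id] * x over that row's non-zero features rather than their sum, so the graph no longer computes the intended w * x + b.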
