I'm going crazy with this project. It's multi-label text classification with an LSTM in Keras. My model is this:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, Dense, Activation
from keras.optimizers import Adam

model = Sequential()
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len, mask_zero=True, weights=[embedding_weights]))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid', recurrent_activation='hard_sigmoid', return_sequences=True))
model.add(Dropout(0.25))
model.add(LSTM(units=embeddings_dim, activation='sigmoid', recurrent_activation='hard_sigmoid', return_sequences=False))
model.add(Dropout(0.25))
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
adam = Adam(lr=0.04)
model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
The problem is that my accuracy is too low. With binary_crossentropy I get good accuracy, but the predictions are wrong; switching to categorical_crossentropy, the accuracy becomes very low. Do you have any suggestions?
Here is my code: GitHubProject - Multi-Label-Text-Classification
2 Answers
#1 (2 votes)
In the last layer, the activation function you are using is sigmoid, so binary_crossentropy should be used. If you want to use categorical_crossentropy, then use softmax as the activation function in the last layer.
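To make the pairing concrete, here is a minimal sketch of the two consistent output configurations; num_classes and feature_dim are assumed values for illustration:

from keras.models import Sequential
from keras.layers import Dense

num_classes = 5    # assumed for illustration
feature_dim = 128  # assumed for illustration

# Multi-label: a sample may carry several labels, so each output is an
# independent probability -> sigmoid paired with binary_crossentropy.
multi_label = Sequential()
multi_label.add(Dense(num_classes, activation='sigmoid', input_shape=(feature_dim,)))
multi_label.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Single-label multi-class: exactly one label per sample, so the outputs form
# one distribution -> softmax paired with categorical_crossentropy.
multi_class = Sequential()
multi_class.add(Dense(num_classes, activation='softmax', input_shape=(feature_dim,)))
multi_class.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])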
Now, for the other part of your model: since you are working with text, I would suggest tanh as the activation function in the LSTM layers.
You can also try the LSTM's built-in regularization via the dropout and recurrent_dropout arguments:
LSTM(units, dropout=0.2, recurrent_dropout=0.2, activation='tanh')
You can set units to 64 or 128. Start from a small number and, after testing, scale up toward 1024, as in the sketch below.
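A minimal sketch of that search, reusing max_features, embeddings_dim, max_sent_len, and num_classes from the question and assuming training arrays x_train / y_train (on older Keras versions the history key is 'val_acc' rather than 'val_accuracy'):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

def build_model(units):
    # One LSTM of the candidate size; sigmoid head kept for multi-label outputs.
    m = Sequential()
    m.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len))
    m.add(LSTM(units, activation='tanh', dropout=0.2, recurrent_dropout=0.2))
    m.add(Dense(num_classes, activation='sigmoid'))
    m.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    return m

# Start small and keep the size with the best validation accuracy.
for units in (64, 128, 256, 512, 1024):
    model = build_model(units)
    history = model.fit(x_train, y_train, validation_split=0.1, epochs=5, verbose=0)
    print(units, max(history.history['val_accuracy']))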
You can also try adding a convolution layer to extract local features, or a Bidirectional LSTM, though bidirectional models take longer to train; a sketch follows.
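A minimal sketch of that combination, reusing max_features, embeddings_dim, max_sent_len, and num_classes from the question; the filter and unit counts are assumed values (mask_zero is dropped because Conv1D does not support masking):

from keras.models import Sequential
from keras.layers import Embedding, Conv1D, MaxPooling1D, Bidirectional, LSTM, Dense

model = Sequential()
model.add(Embedding(max_features, embeddings_dim, input_length=max_sent_len))
# Conv1D extracts local n-gram features before the recurrent layer.
model.add(Conv1D(filters=64, kernel_size=5, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
# The Bidirectional wrapper reads the sequence in both directions (slower to train).
model.add(Bidirectional(LSTM(64, dropout=0.2, recurrent_dropout=0.2)))
model.add(Dense(num_classes, activation='sigmoid'))  # sigmoid head for multi-label
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])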
Moreover, since you are working with text, preprocessing and the size of the training data always play a much bigger role than expected.
Edit:
Add class weights via the class_weight parameter of fit:
from sklearn.utils import class_weight
from sklearn.preprocessing import LabelEncoder
import numpy as np

le = LabelEncoder().fit(labels)  # labels: the raw class labels
# 'balanced' weights each class inversely to its frequency.
class_weights = class_weight.compute_class_weight('balanced',
                                                  classes=np.unique(labels),
                                                  y=labels)
class_weights_dict = dict(zip(le.transform(le.classes_), class_weights))
model.fit(x_train, y_train, validation_split=0.1,  # split value assumed
          class_weight=class_weights_dict)
#2 (1 vote)
Change:
model.add(Activation('sigmoid'))
to:
model.add(Activation('softmax'))