[Keras Study Notes] 5: Softmax Multi-class Classification on the Iris Dataset (Integer Label Encoding)

Date: 2024-03-23 15:19:27

Loading and preprocessing the data

import keras
from keras import layers
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline
Using TensorFlow backend.
df = pd.read_csv('./data/Iris.csv')
df.head()
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
0 1 5.1 3.5 1.4 0.2 Iris-setosa
1 2 4.9 3.0 1.4 0.2 Iris-setosa
2 3 4.7 3.2 1.3 0.2 Iris-setosa
3 4 4.6 3.1 1.5 0.2 Iris-setosa
4 5 5.0 3.6 1.4 0.2 Iris-setosa
df.Species.unique()
array(['Iris-setosa', 'Iris-versicolor', 'Iris-virginica'], dtype=object)
# Map the species names to the integers 0, 1, 2
spec_dict = {'Iris-setosa':0, 'Iris-versicolor':1, 'Iris-virginica':2}
df['Species'] = df.Species.map(spec_dict)
df.head()
Id SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
0 1 5.1 3.5 1.4 0.2 0
1 2 4.9 3.0 1.4 0.2 0
2 3 4.7 3.2 1.3 0.2 0
3 4 4.6 3.1 1.5 0.2 0
4 5 5.0 3.6 1.4 0.2 0
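
One thing worth checking after a dictionary-based `map` is that no label was left unmapped, since pandas fills keys missing from the dict with NaN. The following is a minimal sketch using a made-up three-row column with a deliberately misspelled label (an assumption for illustration; the real Iris.csv has only the three species shown above):

```python
import pandas as pd

spec_dict = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}

# Hypothetical column with one misspelled label to show the failure mode.
species = pd.Series(['Iris-setosa', 'Iris-virginica', 'Iris-virginca'])
mapped = species.map(spec_dict)

# Labels missing from the dict become NaN, so count them before training.
n_unmapped = int(mapped.isna().sum())
print(n_unmapped)  # 1
```

On the real data the count should be 0, and the mapped column keeps an integer dtype.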

Shuffle the rows:

# Generate a random permutation of the row positions 0 .. len(df)-1
index = np.random.permutation(len(df))
# Indexing with the permuted positions shuffles the rows
df = df.iloc[index, :]
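
If a repeatable shuffle is wanted, the same permutation idea can be driven by a seeded generator. The sketch below uses a small made-up DataFrame and an arbitrary seed of 0; both are assumptions for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the Iris DataFrame.
toy = pd.DataFrame({"feature": [0.1, 0.2, 0.3, 0.4, 0.5],
                    "label": [0, 0, 1, 1, 2]})

# A seeded generator makes the permutation repeatable (seed 0 is arbitrary).
rng = np.random.default_rng(0)
index = rng.permutation(len(toy))

# Reorder the rows by position; each row keeps its original index label.
shuffled = toy.iloc[index, :]

# Every row appears exactly once; only the order changes.
assert sorted(shuffled.index) == list(toy.index)
```

Because the original index labels travel with the rows, the shuffled frame shows labels such as 89, 10, 149 in its leftmost column, as in the output further down.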

Split into features x and labels y:

x = df.iloc[:, 1:-1]
y = df.Species
x.head(), y.head()
(     SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
 89             5.5           2.5            4.0           1.3
 10             5.4           3.7            1.5           0.2
 149            5.9           3.0            5.1           1.8
 88             5.6           3.0            4.1           1.3
 17             5.1           3.5            1.4           0.3, 89     1
 10     0
 149    2
 88     1
 17     0
 Name: Species, dtype: int64)

Building the model

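Note that whether the labels are one-hot encoded or integer-encoded as they are here, the softmax output layer always has as many units as there are classes. The label is a single column in this case, but the model output is still set to 3 dimensions, because softmax is computed over the logits of all classes and yields one probability per class.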
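
To make the role of the 3 output units concrete, here is a minimal NumPy sketch of the softmax step on its own. The logits are made-up numbers, and the helper `softmax` is written here for illustration rather than taken from Keras.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical logits for two samples and three classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 3.0]])
probs = softmax(logits)

# One probability per class, with each row summing to 1.
print(probs.shape)  # (2, 3)

# The predicted class is the index of the largest probability,
# which is directly comparable to an integer label 0, 1 or 2.
pred = probs.argmax(axis=1)
```

The integer label never has to be 3-dimensional itself: it only selects which of the 3 output probabilities the loss looks at.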

model = keras.Sequential()
model.add(layers.Dense(3, input_dim=4, activation='softmax'))
WARNING:tensorflow:From E:\MyProgram\Anaconda\envs\krs\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
model.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 3)                 15        
=================================================================
Total params: 15
Trainable params: 15
Non-trainable params: 0
_________________________________________________________________
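
The 15 parameters in the summary follow from the layer shape: a 4x3 weight matrix plus 3 biases. The sketch below reproduces that count and the layer's forward pass in NumPy. The zero-valued weights are placeholders chosen for illustration, not the values Keras initializes or learns.

```python
import numpy as np

n_features, n_classes = 4, 3

# Dense layer parameters: one weight per (input, class) pair plus one bias per class.
W = np.zeros((n_features, n_classes))  # placeholder weights
b = np.zeros(n_classes)                # placeholder biases
n_params = W.size + b.size
print(n_params)  # 15

# Forward pass for one sample (the first row of the Iris table above).
x = np.array([[5.1, 3.5, 1.4, 0.2]])
logits = x @ W + b
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)
```

With all-zero parameters every class gets the same probability, which is roughly the uninformed starting point that training then moves away from.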

Compiling the model

# Note: with integer-encoded labels, use sparse_categorical_crossentropy as the loss
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['acc']
)
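
The sparse loss and the one-hot loss measure the same quantity; they differ only in how the label is supplied. The following NumPy sketch of the cross-entropy definition (not Keras's internal implementation) shows the two forms agreeing on made-up probabilities:

```python
import numpy as np

# Hypothetical predicted probabilities for three samples (rows sum to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])

# Integer labels, as used in this note.
y_int = np.array([0, 1, 2])

# Sparse form: pick the probability of the true class directly.
sparse_loss = -np.log(probs[np.arange(len(y_int)), y_int]).mean()

# One-hot form: the same labels expanded to indicator vectors.
y_onehot = np.eye(3)[y_int]
onehot_loss = -(y_onehot * np.log(probs)).sum(axis=1).mean()

# Both forms give the same mean cross-entropy.
assert np.isclose(sparse_loss, onehot_loss)
```

The sparse form simply skips building the indicator vectors, which is why the single-column label can be passed to `fit` unchanged.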

Training the model

history = model.fit(x, y, epochs=300, verbose=0)
WARNING:tensorflow:From E:\MyProgram\Anaconda\envs\krs\lib\site-packages\tensorflow\python\ops\math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.

Plotting the loss and accuracy curves

plt.plot(range(300),history.history.get('loss'))
[<matplotlib.lines.Line2D at 0x140f1860>]


plt.plot(range(300),history.history.get('acc'))
[<matplotlib.lines.Line2D at 0x14180c18>]
