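Preface
Many people have asked how to recognize XXX and how to build the corresponding neural network. When it comes to building the network, I mostly follow the textbook myself: I can read the code, but I have not studied this area in depth. Fortunately, tools such as tensorflow and paddle now ship mature neural networks that can be used directly.
This article reworks some of my earlier articles and uses one of these built-in networks to implement a simple classifier for several kinds of animals.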
Environment
tensorflow:2.9.1
keras:2.9.0
os:windows10
gpu:RTX3070
cuda:cuda_11.4.r11.4
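I won't go over how to install tensorflow here. One point is worth stressing: different tensorflow and keras versions expose some utilities under different module paths, so the imports below may need adjusting on other versions.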
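Since the version pairing matters, it can help to print what is actually installed before running anything else. The snippet below is a minimal sketch using only the standard library; the helper name `installed_versions` is my own, and it assumes the packages are installed under the distribution names `tensorflow` and `keras`.

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "keras")):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

On the setup described above, this would be expected to report 2.9.1 for tensorflow and 2.9.0 for keras.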
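Data preparation
Link: https://pan.baidu.com/s/1J7yRsTS2o0LcVkbKKJD-Bw Extraction code: 6666
After extracting the archive, the code below expects the folders Training Data, Validation Data, Testing Data, and Interesting Data under ./animal/Animals_Classification/Animal-Data-V2/Data-V2/, each containing one subfolder per animal class.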
Code
1. Model training code (animalv2_model_train.py)
Imports
import os
import plotly.express as px
import matplotlib.pyplot as plt
from IPython.display import clear_output as cls
import numpy as np
from glob import glob
import pandas as pd
# Model
import keras
from keras.models import Sequential, load_model
from keras.layers import GlobalAvgPool2D as GAP, Dense, Dropout
from keras.preprocessing.image import ImageDataGenerator
# Callbacks
from keras.callbacks import EarlyStopping, ModelCheckpoint
# Model and preprocessing utilities
import tensorflow as tf
from tensorflow.keras.applications import ResNet50V2
from tensorflow.keras.utils import load_img, img_to_array
Dataset processing
root_path = './animal/Animals_Classification/Animal-Data-V2/Data-V2/Training Data/'
valid_path = './animal/Animals_Classification/Animal-Data-V2/Data-V2/Validation Data/'
test_path = './animal/Animals_Classification/Animal-Data-V2/Data-V2/Testing Data/'
# Animal classes (one subfolder per class)
class_names = sorted(os.listdir(root_path))
n_classes = len(class_names)
print(f"Total Number of Classes : {n_classes} \nClass Names : {class_names}")
class_dis = [len(os.listdir(root_path+name)) for name in class_names]
fig = px.pie(names=class_names, values=class_dis, title="Training Class Distribution", hole=0.4)
fig.update_layout({'title':{'x':0.48}})
fig.show()
fig = px.bar(x=class_names, y=class_dis, title="Training Class Distribution", color=class_names)
fig.update_layout({'title':{'x':0.48}})
fig.show()
# Normalization: rescale pixel values to [0, 1]; light augmentation is applied to the training set only
train_gen = ImageDataGenerator(rescale=1/255., rotation_range=10, horizontal_flip=True)
valid_gen = ImageDataGenerator(rescale=1/255.)
test_gen = ImageDataGenerator(rescale=1/255.)
# Load Data
# class_mode='sparse' yields integer class indices, matching the sparse_categorical_crossentropy loss used later
train_ds = train_gen.flow_from_directory(root_path, class_mode='sparse', target_size=(256,256), shuffle=True, batch_size=32)
valid_ds = valid_gen.flow_from_directory(valid_path, class_mode='sparse', target_size=(256,256), shuffle=True, batch_size=32)
test_ds = test_gen.flow_from_directory(test_path, class_mode='sparse', target_size=(256,256), shuffle=True, batch_size=32)
The output is as follows:
Total Number of Classes : 10
Class Names : ['Cat', 'Cow', 'Dog', 'Elephant', 'Gorilla', 'Hippo', 'Monkey', 'Panda', 'Tiger', 'Zebra']
Found 20000 images belonging to 10 classes.
Found 1000 images belonging to 10 classes.
Found 1907 images belonging to 10 classes.
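The image counts above, together with batch_size=32, determine how many batches each generator yields per pass. As a quick sanity check, here is a small sketch of that arithmetic; the helper name `steps_per_epoch` is my own, and it assumes the last partial batch is kept, which is why the result is rounded up.

```python
import math

BATCH_SIZE = 32  # same batch_size passed to flow_from_directory


def steps_per_epoch(n_images, batch_size=BATCH_SIZE):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(n_images / batch_size)


# Image counts taken from the "Found ... images" lines
counts = {"train": 20000, "valid": 1000, "test": 1907}
for split, n_images in counts.items():
    print(split, steps_per_epoch(n_images))
# train 625
# valid 32
# test 60
```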
Displaying images
def show_images(GRID=(5, 5), model=None, size=(20, 20), data=train_ds):
    # Plot a grid of images, one picked at random from each batch yielded by `data`
    n_rows = GRID[0]
    n_cols = GRID[1]
    n_images = n_cols * n_rows
    i = 1
    plt.figure(figsize=size)
    for images, labels in data:
        idx = np.random.randint(len(images))
        image, label = images[idx], class_names[int(labels[idx])]
        plt.subplot(n_rows, n_cols, i)
        plt.imshow(image)
        if model is None:
            title = f"Class : {label}"
        else:
            # Compare the true label with the model's prediction
            pred = class_names[int(np.argmax(model.predict(image[np.newaxis, ...])))]
            title = f"Org : {label}, Pred : {pred}"
            cls()
        plt.title(title)
        plt.axis('off')
        i += 1
        # The generator loops forever, so stop once the grid is full
        if i >= (n_images + 1):
            break
    plt.tight_layout()
    plt.show()
def load_image(path):
    # Load an image, scale it to [0, 1], and resize it to the 256x256 model input
    image = tf.cast(tf.image.resize(img_to_array(load_img(path))/255., (256,256)), tf.float32)
    return image
def show_image(image, title=None):
    plt.imshow(image)
    plt.axis('off')
    plt.title(title)
show_images(data=train_ds)
show_images(data=valid_ds)
show_images(data=test_ds)
path = './animal/Animals_Classification/Animal-Data-V2/Data-V2/Interesting Data/'
interesting_images = [glob(path + name + "/*") for name in class_names]
# Interesting images for every class
for name in class_names:
    plt.figure(figsize=(25, 8))
    class_interesting = interesting_images[class_names.index(name)]
    for i, i_path in enumerate(class_interesting):
        # Use the file name without its extension as the title; os.path also handles Windows separators
        image_name = os.path.splitext(os.path.basename(i_path))[0]
        image = load_image(i_path)
        plt.subplot(1, len(class_interesting), i+1)
        show_image(image, title=image_name.title())
    plt.show()
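One detail matters on the Windows setup listed in the environment: glob can return paths that mix "/" and "\", so deriving an image title by splitting on "/" may leave the class folder in the result. The sketch below illustrates the difference with the standard-library ntpath module, which applies Windows path rules on any OS; the sample path and both helper names are hypothetical.

```python
import ntpath  # Windows path rules, importable on any operating system


def title_by_split(path):
    """Split on '/' and keep the text before the first dot."""
    return path.split("/")[-1].split(".")[0]


def title_by_ntpath(path):
    """Separator-aware version: Windows rules treat both slash styles as separators."""
    return ntpath.splitext(ntpath.basename(path))[0]


# Hypothetical path in the mixed form that glob can produce on Windows
sample = "./Interesting Data/Cat\\cat-1.jpg"
print(title_by_split(sample))   # Cat\cat-1
print(title_by_ntpath(sample))  # cat-1
```

In a script that runs on the target machine, os.path.basename and os.path.splitext give the same separator-aware behavior for the current platform.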
Model training
with tf.device("/GPU:0"):
    # Define the network: ResNet50V2 backbone (ImageNet weights by default) without its top layer, frozen
    base_model = ResNet50V2(input_shape=(256,256,3), include_top=False)
    base_model.trainable = False
    cls()
    # Model name and classification head
    name = "ResNet50V2"
    model = Sequential([
        base_model,
        GAP(),
        Dense(256, activation='relu', kernel_initializer='he_normal'),
        Dropout(0.2),
        Dense(n_classes, activation='softmax')
    ], name=name)
    # Callbacks
    # patience=3: stop once val_loss has gone 3 epochs in a row without improving, then restore the best weights
    cbs = [EarlyStopping(patience=3, restore_best_weights=True), ModelCheckpoint(name + "_V2.h5", save_best_only=True)]
    # Model
    opt = tf.keras.optimizers.Adam(learning_rate=2e-3)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
    # Model Training
    history = model.fit(train_ds, validation_data=valid_ds, callbacks=cbs, epochs=50)
    data = pd.DataFrame(history.history)
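The model is compiled with sparse_categorical_crossentropy, which takes an integer class index as the label and penalizes a low predicted probability for that class. Below is a minimal pure-Python sketch of the per-sample formula, loss = -log(p[label]); the function name is my own, the probabilities are made up for illustration, and the clipping constant is an assumption rather than a statement about Keras internals.

```python
import math


def sparse_cce(probs, label, eps=1e-7):
    """Per-sample loss: negative log of the probability assigned to the true class."""
    # Clip to avoid log(0); eps is an assumed small constant
    p = min(max(probs[label], eps), 1.0 - eps)
    return -math.log(p)


# Illustrative softmax output over 3 classes (not taken from the model)
probs = [0.7, 0.2, 0.1]
print(round(sparse_cce(probs, 0), 4))  # 0.3567 (true class got p=0.7)
print(round(sparse_cce(probs, 2), 4))  # 2.3026 (true class got p=0.1)
```

The loss value reported during training is an average of such per-sample values over the batches seen so far.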
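Running the training
Running the code above takes roughly 1,700+ s on my machine (PS: a GPU with more memory, such as an RTX 40XX, may help).
The output is as follows: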
2022-11-29 17:43:01.082836: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-11-29 17:43:01.449655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5472 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3070, pci bus id: 0000:01:00.0, compute capability: 8.6
Epoch 1/50
2022-11-29 17:43:18.284528: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8204
2022-11-29 17:43:21.378441: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
625/625 [==============================] - 292s 457ms/step - loss: 0.2227 - accuracy: 0.9361 - val_loss: 0.1201 - val_accuracy: 0.9630
Epoch 2/50
625/625 [==============================] - 217s 348ms/step - loss: 0.1348 - accuracy: 0.9596 - val_loss: 0.1394 - val_accuracy: 0.9610
Epoch 3/50
625/625 [==============================] - 218s 349ms/step - loss: 0.1193 - accuracy: 0.9641 - val_loss: 0.1452 - val_accuracy: 0.9620
Epoch 4/50
625/625 [==============================] - 219s 350ms/step - loss: 0.1035 - accuracy: 0.9690 - val_loss: 0.1147 - val_accuracy: 0.9690
Epoch 5/50
625/625 [==============================] - 221s 354ms/step - loss: 0.0897 - accuracy: 0.9736 - val_loss: 0.1117 - val_accuracy: 0.9730
Epoch 6/50
625/625 [==============================] - 219s 351ms/step - loss: 0.0817 - accuracy: 0.9747 - val_loss: 0.1347 - val_accuracy: 0.9640
Epoch 7/50
625/625 [==============================] - 219s 351ms/step - loss: 0.0818 - accuracy: 0.9740 - val_loss: 0.1126 - val_accuracy: 0.9700
Epoch 8/50
625/625 [==============================] - 219s 350ms/step - loss: 0.0731 - accuracy: 0.9785 - val_loss: 0.1366 - val_accuracy: 0.9680
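Training ended at epoch 8 even though epochs=50 was requested, which is consistent with patience-based early stopping on val_loss. The sketch below replays the val_loss values from the log through a simplified version of that rule; it is my own re-implementation for illustration, not the Keras callback itself.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 1-based, for patience-based stopping."""
    best = float("inf")
    best_epoch = 0
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# val_loss per epoch, copied from the training log
val_losses = [0.1201, 0.1394, 0.1452, 0.1147, 0.1117, 0.1347, 0.1126, 0.1366]
print(early_stopping_epoch(val_losses))  # (8, 5)
```

The best val_loss (0.1117) occurs at epoch 5, and epochs 6 to 8 do not beat it, so the run stops at epoch 8. With restore_best_weights=True and save_best_only=True, the weights kept should be those from epoch 5.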
Model validation
Model validation code (animalv2_model_evaluate.py)
from keras.models import load_model
import tensorflow as tf
from tensorflow.keras.utils import load_img, img_to_array
import numpy as np
import os
import matplotlib.pyplot as plt
root_path = './animal/Animals_Classification/Animal-Data-V2/Data-V2/Training Data/'
class_names = sorted(os.listdir(root_path))
model = load_model('./ResNet50V2_V2.h5')
model.summary()
def load_image(path):
    # Same preprocessing as in training: scale to [0, 1] and resize to 256x256
    image = tf.cast(tf.image.resize(img_to_array(load_img(path))/255., (256,256)), tf.float32)
    return image
i_path = './animal/Animals_Classification/Animal-Data-V2/Data-V2/Validation Data/Gorilla/Gorilla (3).jpeg'
image = load_image(i_path)
preds = model.predict(image[np.newaxis, ...])[0]
print(preds)
pred_class = class_names[np.argmax(preds)]
confidence_score = np.round(preds[np.argmax(preds)], 2)
# Configure Title
title = f"Pred : {pred_class}\nConfidence : {confidence_score:.2}"
print(title)
plt.figure(figsize=(25, 8))
plt.title(title)
plt.imshow(image)
plt.show()
# Interactive loop: enter a local image path to classify it, or q! to quit
while True:
    path = input("input:")
    if (path == "q!"):
        exit()
    image = load_image(path)
    preds = model.predict(image[np.newaxis, ...])[0]
    print(preds)
    pred_class = class_names[np.argmax(preds)]
    confidence_score = np.round(preds[np.argmax(preds)], 2)
    # Configure Title
    title = f"Pred : {pred_class}\nConfidence : {confidence_score:.2}"
    print(title)
    plt.figure(figsize=(25, 8))
    plt.title(title)
    plt.imshow(image)
    plt.show()
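The validation script rebuilds class_names with sorted(os.listdir(...)) rather than loading it from the training run, so the predicted index only maps to the right name if both sides use the same ordering. The sketch below shows the mapping this implies for the ten classes, under the assumption that flow_from_directory assigns indices in sorted order of the subfolder names.

```python
# Class folder names from the training output, shuffled on purpose:
# os.listdir gives no ordering guarantee, which is why the scripts sort
folders = ['Zebra', 'Cat', 'Tiger', 'Cow', 'Panda', 'Dog',
           'Monkey', 'Elephant', 'Hippo', 'Gorilla']

class_names = sorted(folders)
# Assumed to match the generator's index assignment (sorted subfolder names)
class_indices = {name: index for index, name in enumerate(class_names)}

print(class_indices['Gorilla'])  # 4
print(class_indices['Zebra'])    # 9
```

To confirm this on a real run, the training script could print train_ds.class_indices and compare it with this dictionary.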
Validation results
Model: "ResNet50V2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 resnet50v2 (Functional)     (None, 8, 8, 2048)        23564800
 global_average_pooling2d (G  (None, 2048)             0
 lobalAveragePooling2D)
 dense (Dense)               (None, 256)               524544
 dropout (Dropout)           (None, 256)               0
 dense_1 (Dense)             (None, 10)                2570
=================================================================
Total params: 24,091,914
Trainable params: 527,114
Non-trainable params: 23,564,800
_________________________________________________________________
2022-11-29 20:33:15.981925: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8204
2022-11-29 20:33:18.070138: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
1/1 [==============================] - 3s 3s/step
[1.2199847e-09 1.0668253e-12 6.8980124e-13 1.0352933e-08 9.9999988e-01
4.1255888e-09 7.1100374e-08 3.0439090e-10 3.1216061e-11 2.8051938e-12]
Pred : Gorilla
Confidence : 1.0
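The parameter counts in the model summary can be checked by hand: a Dense layer has one weight per input-output pair plus one bias per unit. A short sketch of that arithmetic follows; the helper name `dense_params` is my own, and the backbone count is taken from the summary rather than recomputed.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus biases (n_units) of a fully connected layer."""
    return n_inputs * n_units + n_units


backbone = 23_564_800            # frozen ResNet50V2, value copied from the summary
dense_1 = dense_params(2048, 256)  # GAP output (2048) -> Dense(256)
dense_2 = dense_params(256, 10)    # Dense(256) -> 10 classes
trainable = dense_1 + dense_2

print(dense_1, dense_2)      # 524544 2570
print(trainable)             # 527114
print(backbone + trainable)  # 24091914
```

Only the two Dense layers are trained, which is why the trainable count is about 2% of the total.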
The script also provides an input prompt, so the model can be checked against any local image by entering its path:
input:./animal/Animals_Classification/Animal-Data-V2/Data-V2/Validation Data/Zebra/Zebra-Valid (276).jpeg
1/1 [==============================] - 0s 21ms/step
[1.5658158e-12 1.6018555e-10 9.6812911e-13 6.2212702e-10 5.4042397e-09
5.8055113e-05 4.7865592e-12 3.4024495e-12 3.0037000e-08 9.9994195e-01]
Pred : Zebra
Confidence : 1.0
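Both predictions print a confidence of 1.0 because the score is rounded to two decimals. To see how the label and score come out of the probability vector, here is a pure-Python sketch that replays the Zebra output above; the runner-up lookup is an extra I added for illustration.

```python
class_names = ['Cat', 'Cow', 'Dog', 'Elephant', 'Gorilla',
               'Hippo', 'Monkey', 'Panda', 'Tiger', 'Zebra']

# Softmax output copied from the Zebra prediction
preds = [1.5658158e-12, 1.6018555e-10, 9.6812911e-13, 6.2212702e-10, 5.4042397e-09,
         5.8055113e-05, 4.7865592e-12, 3.4024495e-12, 3.0037000e-08, 9.9994195e-01]

# argmax: index of the largest probability
best = max(range(len(preds)), key=preds.__getitem__)
# Runner-up: largest probability among the remaining classes
second = max((i for i in range(len(preds)) if i != best), key=preds.__getitem__)

print(class_names[best], round(preds[best], 2))  # Zebra 1.0
print(f"{preds[best]:.4%}")                      # 99.9942%
print(class_names[second])                       # Hippo
```

Printing more digits, or the runner-up class, gives a slightly better feel for how certain the model is than the rounded 1.0.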