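A short study note on implementing five-class flower classification with VGG-16; it covers only the details of the reproduction.
For more on the network itself, see the paper "Very Deep Convolutional Networks for Large-Scale Image Recognition".
Dataset: download link. The data is the same as in the AlexNet post, so it is not described again here.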
I. Environment Setup
See my earlier blog post, which covers this in detail and also recommends 炮哥's environment setup guide.
- Anaconda3 (recommended)
- python=3.7/3.8 (pytorch 1.11.0 requires Python 3.7 or later)
- pycharm (IDE)
- pytorch=1.11.0 (pip package)
- torchvision=0.12.0 (pip package)
- cudatoolkit=11.3
II. Model Construction and Training
1. Overall architecture diagram
The model takes 224×224 input. Preprocessing: subtract the mean RGB value, computed on the training set, from each pixel.
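The preprocessing step can be sketched as follows. This is only a minimal illustration: `train_images` is a hypothetical random tensor standing in for the real training set, and `subtract_rgb_mean` is a helper name made up for this example, not part of the original project.

```python
import torch

# Hypothetical stand-in for the training set: N x 3 x H x W, values in [0, 1].
train_images = torch.rand(8, 3, 224, 224)

# One mean per RGB channel, computed over all images and all pixel positions.
rgb_mean = train_images.mean(dim=(0, 2, 3))  # shape: (3,)


def subtract_rgb_mean(image, mean):
    """Subtract the per-channel mean from a single 3 x H x W image."""
    return image - mean.view(3, 1, 1)


centered = subtract_rgb_mean(train_images[0], rgb_mean)
print(centered.shape)
```

In a real pipeline the mean would be computed once over the training set and then reused unchanged for validation and test images.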
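Structure of the VGG configurations, from 11 to 19 layers
VGG-16 is the most commonly used; this post also uses the 16-layer network D, in which every convolution is 3×3 (stride 1, padding 1).
Counting layers: only layers with parameters are counted, and pooling layers have none, so 16 = 13 (convolutional layers) + 3 (fully connected layers).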
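The layer count can be checked with a few lines of plain Python. The list below is the 'vgg16' entry of the `cfgs` dict in net.py; the variable names are chosen only for this sketch.

```python
# VGG-16 (configuration D): integers are conv output channels,
# "M" marks a 2x2 max-pooling layer.
vgg16_cfg = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

num_conv = sum(1 for v in vgg16_cfg if v != "M")  # layers with weights
num_pool = vgg16_cfg.count("M")                   # no weights, so not counted
num_fc = 3                                        # the three Linear layers of the classifier

depth = num_conv + num_fc

# Each 2x2, stride-2 pooling halves the spatial size;
# the padded 3x3 convolutions leave it unchanged.
side = 224
for _ in range(num_pool):
    side //= 2

print(f"conv={num_conv}, fc={num_fc}, depth={depth}, final feature map={side}x{side}")
# conv=13, fc=3, depth=16, final feature map=7x7
```

The 7×7 result is where the `512*7*7` input size of the first fully connected layer comes from.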
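Summary:
1. Local response normalization (LRN) does not improve the model (compare A with A-LRN).
2. The extra non-linearity introduced by 1×1 convolutions helps (C outperforms B), but 3×3 filters with non-trivial receptive fields work better still (D outperforms C), since they also capture spatial context.
3. Deep networks with small filters outperform shallower networks with larger filters.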
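4. The deeper the network, the better the result: in the paper, classification error decreases as depth grows from 11 layers (A) to 19 layers (E).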
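The case for small filters can be illustrated with simple arithmetic: a stack of 3×3 convolutions covers the same receptive field as one larger filter while using fewer weights. The helper names and the channel width `C` below are assumptions made for this sketch.

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1  # each stride-1 layer widens the field by k - 1
    return rf


def conv_weights(kernel_size, channels, num_layers=1):
    """Weight count (biases ignored) when every layer has `channels` inputs and outputs."""
    return num_layers * kernel_size * kernel_size * channels * channels


C = 512  # illustrative channel width

print(stacked_receptive_field(2))        # 5, same as one 5x5 filter
print(stacked_receptive_field(3))        # 7, same as one 7x7 filter
print(conv_weights(3, C, num_layers=3))  # 27 * C^2 = 7077888
print(conv_weights(7, C))                # 49 * C^2 = 12845056
```

Three stacked 3×3 layers therefore need about 45% fewer weights than a single 7×7 layer, and they add two extra ReLU non-linearities along the way.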
2. net.py
Code for the overall network structure
# Transfer learning: uses the pretrained VGG weights vgg16.pth
import torch.nn as nn
import torch

# official pretrained weights
model_urls = {
    'vgg11': 'https://download.pytorch.org/models/vgg11-bbd30ac9.pth',
    'vgg13': 'https://download.pytorch.org/models/vgg13-c768596a.pth',
    'vgg16': 'https://download.pytorch.org/models/vgg16-397923af.pth',
    'vgg19': 'https://download.pytorch.org/models/vgg19-dcbb9e9d.pth'
}


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        super(VGG, self).__init__()
        self.features = features
        self.classifier = nn.Sequential(
            nn.Linear(512*7*7, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, num_classes)
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.features(x)
        # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)
        # N x 512*7*7
        x = self.classifier(x)
        return x

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)
                if m.bias is not None:
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


def make_features(cfg: list):
    # Build the convolutional part: an int adds a 3x3 conv + ReLU, "M" adds a 2x2 max pool.
    layers = []
    in_channels = 3
    for v in cfg:
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)


cfgs = {
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


def vgg(model_name="vgg16", **kwargs):
    assert model_name in cfgs, "Warning: model number {} not in cfgs dict!".format(model_name)
    cfg = cfgs[model_name]

    model = VGG(make_features(cfg), **kwargs)
    return model


if __name__ == "__main__":
    x = torch.rand([1, 3, 224, 224])
    model = vgg(num_classes=5)
    y = model(x)
    # print(y)

    # Count the model parameters
    total = 0
    for name, param in model.named_parameters():
        total += param.numel()
        # print("{:30s} : {}".format(name, param.shape))
    print("total param num {}".format(total))  # total param num 134,281,029
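As a cross-check on the printed total, the parameter count can also be derived by hand from the configuration, without building the model. This is a sketch in plain Python; `count_vgg_params` and its arguments are hypothetical names introduced only for this example.

```python
def count_vgg_params(cfg, num_classes, in_channels=3, fc_width=4096, pooled_side=7):
    """Count weights and biases of a VGG network described by `cfg`."""
    total = 0
    channels = in_channels
    for v in cfg:
        if v == "M":
            continue  # max pooling has no parameters
        total += 3 * 3 * channels * v + v  # 3x3 conv weights + biases
        channels = v

    # Classifier: three fully connected layers, each with weights + biases.
    fc_in = channels * pooled_side * pooled_side
    for n_in, n_out in [(fc_in, fc_width), (fc_width, fc_width), (fc_width, num_classes)]:
        total += n_in * n_out + n_out
    return total


vgg16_cfg = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]

print(count_vgg_params(vgg16_cfg, num_classes=5))     # 134281029
print(count_vgg_params(vgg16_cfg, num_classes=1000))  # 138357544
```

Only the last fully connected layer depends on the number of classes, so moving from 1000 classes to 5 removes about 4 million parameters; the bulk of the model (over 100 million parameters) sits in the first fully connected layer.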