System environment
Ubuntu 14.04
Python 2.7
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
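Note: for the detailed setup of py-faster-rcnn in a CPU-only environment, see my other article.
Below I walk through the whole process of training a model on your own data with py-faster-rcnn, starting from building the dataset.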
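Building the dataset
Before building our own dataset, we first download the VOC2007 dataset.
Baidu Cloud link: http://pan.baidu.com/s/1gfdSFRX
Unzip it and place the dataset under py-faster-rcnn-master/data. Later you will replace the VOC2007 data with your own training data, that is, replace the Annotations, ImageSets and JPEGImages folders in py-faster-rcnn-master/data/VOCdevkit2007/VOC2007 with your own.
The file structure is as follows:
Annotations holds all the XML files.
JPEGImages holds all the training images.
Main (under ImageSets) holds 4 txt files: test.txt is the test set, train.txt is the training set, val.txt is the validation set, and trainval.txt is the training and validation sets combined.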
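Before going further it can help to confirm that the three folders are where the training code will look for them. The following is only a minimal sketch: the helper name missing_voc_dirs and the example root path are my own assumptions, not part of py-faster-rcnn.

```python
import os

# Sub-folders the VOC2007 layout described above is expected to contain.
EXPECTED_DIRS = ("Annotations", "JPEGImages", os.path.join("ImageSets", "Main"))


def missing_voc_dirs(voc_root):
    """Return the expected sub-folders that do not exist under voc_root."""
    return [d for d in EXPECTED_DIRS
            if not os.path.isdir(os.path.join(voc_root, d))]


if __name__ == "__main__":
    # Hypothetical location; adjust it to where your copy lives.
    root = "py-faster-rcnn-master/data/VOCdevkit2007/VOC2007"
    print(missing_voc_dirs(root))
```

An empty list means all three folders were found; anything else names the folders that still need to be created or copied.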
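(I) Naming the images
To train on our own data we need to put it into VOC2007 format, so the first step is to rename the images to the form "000001.jpg", which is the standard VOC2007 naming. Put all training images into one folder first; for my initial test I used /home/wlw/VS_code_projects/cat_dog_picture. The Python script below renames the images in bulk.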
    # -*- coding: utf-8 -*-
    import os

    pic_path = "/home/wlw/VS_code_projects/cat_dog_picture"

    def rename():
        pic_dir = os.path.abspath(pic_path)  # os.path.abspath returns the absolute path
        # Only .jpg files are renamed; sorting makes the numbering reproducible.
        # Assumes no file in the folder already carries one of the target names.
        piclist = sorted(p for p in os.listdir(pic_dir) if p.endswith(".jpg"))
        total_num = len(piclist)
        for i, pic in enumerate(piclist, 1):
            old_path = os.path.join(pic_dir, pic)
            # Zero-pad to six digits: 1 -> 000001.jpg
            new_path = os.path.join(pic_dir, '%06d.jpg' % i)
            os.rename(old_path, new_path)
            print("Renamed %s to %s" % (old_path, new_path))
        print("%d images renamed to 000001.jpg ~ %06d.jpg" % (total_num, total_num))

    rename()
After the script runs, the images in the folder are named 000001.jpg, 000002.jpg, and so on.
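To preview the new names without touching any files, the numbering can be computed on its own. This is a small sketch under my own assumptions: plan_renames is a hypothetical helper, and the file names in the example are made up.

```python
def plan_renames(filenames, ext=".jpg", width=6):
    """Return (old_name, new_name) pairs in sorted order, numbering from 1."""
    pics = sorted(f for f in filenames if f.endswith(ext))
    # zfill pads the counter with leading zeros: 1 -> "000001"
    return [(old, str(i).zfill(width) + ext) for i, old in enumerate(pics, 1)]


print(plan_renames(["dog.jpg", "cat.jpg", "notes.txt"]))
# [('cat.jpg', '000001.jpg'), ('dog.jpg', '000002.jpg')]
```

Non-image files such as notes.txt are skipped, and six-digit padding keeps the names the same width up to 999999 images.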
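(II) Drawing bounding boxes and generating the XML files automatically
Here I used the labelImg tool, which lets you draw the target bounding boxes by hand and automatically generates the corresponding XML files.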
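To see what such an XML file holds, it can be read back with the standard library. The sketch below assumes the usual Pascal VOC layout (object, name, bndbox with xmin/ymin/xmax/ymax); the sample annotation and its values are made up for illustration, and read_objects is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative annotation in the Pascal VOC layout; the values are made up.
SAMPLE_XML = """
<annotation>
  <filename>000001.jpg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""


def read_objects(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.find(tag).text)
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.find("name").text.strip(), coords))
    return objects


print(read_objects(SAMPLE_XML))
# [('cat', (48, 240, 195, 371))]
```

Running a check like this over a few generated files is a quick way to spot misspelled labels before training.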
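(III) Using Python to generate the four txt files in ImageSets/Main from the XML files
Each txt file lists one image name per line, without the file extension. test.txt is the test set, train.txt is the training set, val.txt is the validation set, and trainval.txt is the training and validation sets combined. Here I set trainval to roughly 80% of the whole dataset and test to roughly 20%; train is roughly 80% of trainval and val roughly 20% of trainval. The Python code is as follows: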
    # -*- coding: utf-8 -*-
    import os
    import random

    xmlfilepath = "/home/wlw/VS_code_projects/pic_xml"
    txtsavepath = "/home/wlw/VS_code_projects/pic_txt"
    trainval_percent = 0.8  # share of the whole dataset used for trainval; the rest is test
    train_percent = 0.8     # share of trainval used for train; the rest is val

    def xml_to_txt():
        # List of XML annotation files
        xmllist = sorted(f for f in os.listdir(xmlfilepath) if f.endswith('.xml'))
        xml_num = len(xmllist)     # number of XML files
        num_list = range(xml_num)  # refer to each XML file by its index

        # Randomly pick the trainval subset, then the train subset inside trainval
        trainval_num = int(xml_num * trainval_percent)
        trainval = set(random.sample(num_list, trainval_num))
        train_num = int(trainval_num * train_percent)
        train = set(random.sample(sorted(trainval), train_num))

        ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
        ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
        ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
        fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')

        for i in num_list:
            # Image name without the .xml extension, one per line
            name = os.path.splitext(xmllist[i])[0] + '\n'
            if i in trainval:
                ftrainval.write(name)
                if i in train:
                    ftrain.write(name)
                else:
                    fval.write(name)
            else:
                ftest.write(name)

        ftrainval.close()
        ftrain.close()
        fval.close()
        ftest.close()

    xml_to_txt()
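As a worked example of the percentages, here is the arithmetic on its own. It is a sketch only: split_counts is a hypothetical helper that applies the same int() truncation as the script, with both percentages at 0.8.

```python
def split_counts(total, trainval_percent=0.8, train_percent=0.8):
    """Return (train, val, test) sizes using the same truncation as int()."""
    trainval = int(total * trainval_percent)
    train = int(trainval * train_percent)
    return train, trainval - train, total - trainval


print(split_counts(100))  # (64, 16, 20)
```

So with 100 annotated images, 80 go to trainval (64 train and 16 val) and 20 to test.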
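The dataset is now basically ready. Replace the corresponding folders in py-faster-rcnn-master/data/VOCdevkit2007/VOC2007 with your own: Annotations holds all the XML files, JPEGImages holds all the training images, and ImageSets/Main holds the four txt files (test.txt, train.txt, val.txt and trainval.txt).
That completes the dataset work. Next come the modifications needed before training.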
Modification steps
(1) Because training runs on the CPU, first open py-faster-rcnn-master/tools/train_faster_rcnn_alt_opt.py.
Comment out lines 34-36, which define the --gpu argument:
    def parse_args():
        """
        Parse input arguments
        """
        parser = argparse.ArgumentParser(description='Train a Faster R-CNN network')
        #parser.add_argument('--gpu', dest='gpu_id',
        #                    help='GPU device id to use [0]',
        #                    default=0, type=int)
Also comment out line 213, cfg.GPU_ID = args.gpu_id:
    if __name__ == '__main__':
        args = parse_args()

        print('Called with args:')
        print(args)

        if args.cfg_file is not None:
            cfg_from_file(args.cfg_file)
        if args.set_cfgs is not None:
            cfg_from_list(args.set_cfgs)
        #cfg.GPU_ID = args.gpu_id
On line 102, change caffe.set_mode_gpu() to caffe.set_mode_cpu() and comment out the caffe.set_device call:
    def _init_caffe(cfg):
        """Initialize pycaffe in a training process.
        """

        import caffe
        # fix the random seeds (numpy and caffe) for reproducibility
        np.random.seed(cfg.RNG_SEED)
        caffe.set_random_seed(cfg.RNG_SEED)
        # set up caffe
        caffe.set_mode_cpu()
        #caffe.set_device(cfg.GPU_ID)
(2) Open py-faster-rcnn-master/experiments/scripts/faster_rcnn_alt_opt.sh and remove the GPU-related parts. From line 46 to the end, change it as follows:
    #time ./tools/train_faster_rcnn_alt_opt.py --gpu ${GPU_ID}
    # Run from the repository root so that the relative paths below resolve
    cd /home/wlw/Downloads/py-faster-rcnn-master/
    time python tools/train_faster_rcnn_alt_opt.py \
      --net_name ${NET} \
      --weights data/imagenet_models/${NET}.v2.caffemodel \
      --imdb ${TRAIN_IMDB} \
      --cfg experiments/cfgs/faster_rcnn_alt_opt.yml \
      ${EXTRA_ARGS}

    set +x
    NET_FINAL=`grep "Final model:" ${LOG} | awk '{print $3}'`
    set -x

    #time ./tools/test_net.py --gpu ${GPU_ID}
    time python tools/test_net.py \
      --def models/${PT_DIR}/${NET}/faster_rcnn_alt_opt/faster_rcnn_test.pt \
      --net ${NET_FINAL} \
      --imdb ${TEST_IMDB} \
      --cfg experiments/cfgs/faster_rcnn_alt_opt.yml \
      ${EXTRA_ARGS}
The GPU-to-CPU changes are now complete. Next come the changes to the network definitions and dataset code that are needed before training.
(3) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_fast_rcnn_train.pt, change:
    layer {
      name: 'data'
      type: 'Python'
      top: 'data'
      top: 'rois'
      top: 'labels'
      top: 'bbox_targets'
      top: 'bbox_inside_weights'
      top: 'bbox_outside_weights'
      python_param {
        module: 'roi_data_layer.layer'
        layer: 'RoIDataLayer'
        param_str: "'num_classes': 3" # set from your training set: number of classes + 1
      }
    }
    layer {
      name: "cls_score"
      type: "InnerProduct"
      bottom: "fc7"
      top: "cls_score"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      inner_product_param {
        num_output: 3 # set from your training set: number of classes + 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "bbox_pred"
      type: "InnerProduct"
      bottom: "fc7"
      top: "bbox_pred"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      inner_product_param {
        num_output: 12 # set from your training set: (number of classes + 1) * 4
        weight_filler {
          type: "gaussian"
          std: 0.001
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
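The two numbers follow directly from the label list, and the same pair is reused in every prototxt file in the steps that follow. A minimal sketch of the arithmetic, where head_sizes is a hypothetical helper of my own:

```python
def head_sizes(object_classes):
    """Return (num_classes, bbox_outputs) for a list of foreground class names."""
    num_classes = len(object_classes) + 1  # +1 for __background__
    return num_classes, 4 * num_classes    # 4 box coordinates per class


print(head_sizes(["cat", "dog"]))  # (3, 12)
```

For the cat/dog example this gives 3 for num_classes and cls_score, and 12 for bbox_pred; with 5 labels the values would be 6 and 24.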
(4) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage1_rpn_train.pt, change:
    layer {
      name: 'input-data'
      type: 'Python'
      top: 'data'
      top: 'im_info'
      top: 'gt_boxes'
      python_param {
        module: 'roi_data_layer.layer'
        layer: 'RoIDataLayer'
        param_str: "'num_classes': 3" # set from your training set: number of classes + 1
      }
    }
(5) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_fast_rcnn_train.pt, change:
    layer {
      name: 'data'
      type: 'Python'
      top: 'data'
      top: 'rois'
      top: 'labels'
      top: 'bbox_targets'
      top: 'bbox_inside_weights'
      top: 'bbox_outside_weights'
      python_param {
        module: 'roi_data_layer.layer'
        layer: 'RoIDataLayer'
        param_str: "'num_classes': 3" # set from your training set: number of classes + 1
      }
    }
    layer {
      name: "cls_score"
      type: "InnerProduct"
      bottom: "fc7"
      top: "cls_score"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      inner_product_param {
        num_output: 3 # set from your training set: number of classes + 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "bbox_pred"
      type: "InnerProduct"
      bottom: "fc7"
      top: "bbox_pred"
      param { lr_mult: 1.0 }
      param { lr_mult: 2.0 }
      inner_product_param {
        num_output: 12 # set from your training set: (number of classes + 1) * 4
        weight_filler {
          type: "gaussian"
          std: 0.001
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
(6) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/stage2_rpn_train.pt, change:
    layer {
      name: 'input-data'
      type: 'Python'
      top: 'data'
      top: 'im_info'
      top: 'gt_boxes'
      python_param {
        module: 'roi_data_layer.layer'
        layer: 'RoIDataLayer'
        param_str: "'num_classes': 3" # set from your training set: number of classes + 1
      }
    }
(7) In py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt/faster_rcnn_test.pt, change:
    layer {
      name: "cls_score"
      type: "InnerProduct"
      bottom: "fc7"
      top: "cls_score"
      inner_product_param {
        num_output: 3 # set from your training set: number of classes + 1
      }
    }
    layer {
      name: "bbox_pred"
      type: "InnerProduct"
      bottom: "fc7"
      top: "bbox_pred"
      inner_product_param {
        num_output: 12 # set from your training set: (number of classes + 1) * 4
      }
    }
(8) In py-faster-rcnn-master/lib/datasets/pascal_voc.py, change:
    class pascal_voc(imdb):
        def __init__(self, image_set, year, devkit_path=None):
            imdb.__init__(self, 'voc_' + year + '_' + image_set)
            self._year = year
            self._image_set = image_set
            self._devkit_path = self._get_default_path() if devkit_path is None \
                                else devkit_path
            self._data_path = os.path.join(self._devkit_path, 'VOC' + self._year)
            self._classes = ('__background__', # always index 0
                             #'aeroplane', 'bicycle', 'bird', 'boat',
                             #'bottle', 'bus', 'car', 'cat', 'chair',
                             #'cow', 'diningtable', 'dog', 'horse',
                             #'motorbike', 'person', 'pottedplant',
                             #'sheep', 'sofa', 'train', 'tvmonitor'
                             'cat', 'dog') # replace with your own labels
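The order of this tuple matters because each label is turned into an integer index. As far as I recall, pascal_voc.py builds a name-to-index dictionary from the tuple and lower-cases the label text it reads from each XML file, so labels typed with capitals in labelImg would still need lowercase entries here; treat that as an assumption to verify against your copy. A minimal sketch of the mapping, with label_index as a hypothetical helper:

```python
CLASSES = ('__background__', 'cat', 'dog')  # same tuple as in the listing above

# Map each class name to its integer index; background is always 0.
class_to_ind = dict(zip(CLASSES, range(len(CLASSES))))


def label_index(name):
    """Look up a label the way a lower-casing loader would."""
    return class_to_ind[name.lower().strip()]


print(label_index(" Cat "))  # 1
```

A label that is missing from the tuple raises KeyError, which is a common symptom of a mismatch between the XML files and the class list.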
(9) In py-faster-rcnn-master/lib/datasets/imdb.py, change:
    def append_flipped_images(self):
        num_images = self.num_images
        widths = [PIL.Image.open(self.image_path_at(i)).size[0]
                  for i in xrange(num_images)]
        for i in xrange(num_images):
            boxes = self.roidb[i]['boxes'].copy()
            oldx1 = boxes[:, 0].copy()
            oldx2 = boxes[:, 2].copy()
            boxes[:, 0] = widths[i] - oldx2 - 1
            print boxes[:, 0]  # debug output: flipped x1
            boxes[:, 2] = widths[i] - oldx1 - 1
            print boxes[:, 2]  # debug output: flipped x2
            assert (boxes[:, 2] >= boxes[:, 0]).all()
            entry = {'boxes' : boxes,
                     'gt_overlaps' : self.roidb[i]['gt_overlaps'],
                     'gt_classes' : self.roidb[i]['gt_classes'],
                     'flipped' : True}
            self.roidb.append(entry)
        self._image_index = self._image_index * 2
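The flip itself is simple arithmetic on the x coordinates, which is worth checking by hand when the assertion fails. Here is a minimal sketch for a single box; flip_box_x is a hypothetical helper, and the box and image width are made-up values.

```python
def flip_box_x(x1, x2, width):
    """Mirror a box's x-range inside an image `width` pixels wide (0-based)."""
    new_x1 = width - x2 - 1
    new_x2 = width - x1 - 1
    assert new_x2 >= new_x1, "flipped box must keep x1 <= x2"
    return new_x1, new_x2


print(flip_box_x(100, 200, 500))  # (299, 399)
```

As long as the original box satisfies x1 <= x2, the flipped one does too, so a failing assertion points at an annotation whose coordinates are already inconsistent.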
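Note: to avoid mixing things up with earlier models, delete the output folder (or rename it) before training, and also delete the files in py-faster-rcnn-master/data/cache and py-faster-rcnn-master/data/VOCdevkit2007/annotations_cache, if there are any.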
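This cleanup is easy to forget between runs, so it can be scripted. The sketch below is my own and not part of py-faster-rcnn: empty_stale_dirs is a hypothetical helper that empties the three folders named above when they exist, keeping the folders themselves in place.

```python
import os
import shutil


def empty_stale_dirs(repo_root):
    """Empty old output and cache folders under repo_root; return those emptied."""
    targets = [
        os.path.join(repo_root, "output"),
        os.path.join(repo_root, "data", "cache"),
        os.path.join(repo_root, "data", "VOCdevkit2007", "annotations_cache"),
    ]
    emptied = []
    for path in targets:
        if os.path.isdir(path):
            shutil.rmtree(path)  # remove the folder with everything inside it
            os.makedirs(path)    # recreate it empty
            emptied.append(path)
    return emptied
```

Calling empty_stale_dirs("/home/wlw/Downloads/py-faster-rcnn-master") would clear all three locations; copy any model you want to keep out of output first.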
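Settings such as the learning rate can be changed in the 4 solver files in py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt. The number of iterations can be changed in train_faster_rcnn_alt_opt.py under py-faster-rcnn-master/tools:
max_iters = [80000, 40000, 80000, 40000] gives the iteration counts for the 4 stages (RPN stage 1, Fast R-CNN stage 1, RPN stage 2, Fast R-CNN stage 2). Change them to whatever you need.
If you change these values, also update the 4 corresponding solver files in py-faster-rcnn-master/models/pascal_voc/ZF/faster_rcnn_alt_opt so that each stepsize is smaller than the iteration count of its stage.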
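The stepsize rule can be checked mechanically before starting a long run. This is only a sketch: stepsize_problems is a hypothetical helper, and the shortened schedule and stepsize values in the example are made up rather than taken from the solver files.

```python
STAGES = ("rpn stage 1", "fast rcnn stage 1", "rpn stage 2", "fast rcnn stage 2")


def stepsize_problems(max_iters, stepsizes):
    """Return the stages whose stepsize is not smaller than their max_iters."""
    return [stage for stage, iters, step in zip(STAGES, max_iters, stepsizes)
            if step >= iters]


# Hypothetical shortened schedule; the stepsizes are made-up values.
print(stepsize_problems([8000, 4000, 8000, 4000], [6000, 3000, 6000, 5000]))
# ['fast rcnn stage 2']
```

A stage reported here would finish before its learning-rate step is ever applied.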
In principle, all the modifications are now complete and training should run.
Training
Run the following from py-faster-rcnn-master:
./experiments/scripts/faster_rcnn_alt_opt.sh 0 ZF pascal_voc
At this point a series of errors may still occur, for example: AssertionError: num_images (2) must divide BATCH_SIZE (1). This one traces back to py-faster-rcnn-master/lib/roi_data_layer/minibatch.py:
    def get_minibatch(roidb, num_classes):
        """Given a roidb, construct a minibatch sampled from it."""
        num_images = len(roidb)
        # Sample random scales to use for each image in this batch
        random_scale_inds = npr.randint(0, high=len(cfg.TRAIN.SCALES),
                                        size=num_images)
        # BATCH_SIZE must be a multiple of num_images
        assert(cfg.TRAIN.BATCH_SIZE % num_images == 0), \
            'num_images ({}) must divide BATCH_SIZE ({})'. \
            format(num_images, cfg.TRAIN.BATCH_SIZE)
So we need to change __C.TRAIN.BATCH_SIZE in py-faster-rcnn-master/lib/fast_rcnn/config.py. In my copy it was 1; I changed it to 8, which is a multiple of the 2 images per batch.
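The condition behind the error is a plain divisibility test, so candidate values can be tried before editing config.py. A minimal sketch, with batch_size_ok as a hypothetical helper that mirrors the assertion:

```python
def batch_size_ok(batch_size, num_images):
    """Mirror the assertion: BATCH_SIZE must be a multiple of num_images."""
    return batch_size % num_images == 0


print(batch_size_ok(1, 2))  # False: 1 % 2 == 1, so the assertion fires
print(batch_size_ok(8, 2))  # True: 8 % 2 == 0
```

Any multiple of the number of images per batch passes the check; 8 is simply the value I picked.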
After training finishes, ZF_faster_rcnn_final.caffemodel appears under py-faster-rcnn-master/output/faster_rcnn_alt_opt/voc_2007_trainval/. This is the final model trained on our own dataset.
Testing
Copy the ZF_faster_rcnn_final.caffemodel above to py-faster-rcnn-master/data/faster_rcnn_models, then modify py-faster-rcnn-master/tools/demo.py:
    CLASSES = ('__background__',
               #'aeroplane', 'bicycle', 'bird', 'boat',
               #'bottle', 'bus', 'car', 'cat', 'chair',
               #'cow', 'diningtable', 'dog', 'horse',
               #'motorbike', 'person', 'pottedplant',
               #'sheep', 'sofa', 'train', 'tvmonitor'
               'cat', 'dog') # your own labels

    def parse_args():
        """Parse input arguments."""
        parser = argparse.ArgumentParser(description='Faster R-CNN demo')
        parser.add_argument('--gpu', dest='gpu_id', help='GPU device id to use [0]',
                            default=0, type=int)
        parser.add_argument('--cpu', dest='cpu_mode',
                            help='Use CPU mode (overrides --gpu)',
                            action='store_true')
        parser.add_argument('--net', dest='demo_net', help='Network to use [zf]',
                            choices=NETS.keys(), default='zf') # default model changed to zf

        args = parser.parse_args()

        return args

        # In the __main__ block, after the network is loaded:
        # Warmup on a dummy image
        im = 128 * np.ones((300, 500, 3), dtype=np.uint8)
        for i in xrange(2):
            _, _= im_detect(net, im)

        path = '/home/wlw/Downloads/py-faster-rcnn-master/data/demo' # folder with the test images
        for filename in os.listdir(path):
            im_name = filename
            print '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'
            print 'Demo for data/demo/{}'.format(im_name)
            demo(net, im_name)
            #plt.savefig("/home/wlw/Downloads/py-faster-rcnn-master/data/testfig/"+im_name)

        plt.show()
Run it in a terminal:
wlw@wlw:~/Downloads/py-faster-rcnn-master/tools$ python demo.py --cpu
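Because this was only a small practice run, my whole dataset had just 100 images, and the number of training iterations was too small, so the final test images all came out blank. The training pipeline itself ran without problems, though. Next I will enlarge the dataset and retrain.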
Over!