SSD-Tensorflow Training Summary

Reader submission 1032 2022-10-01

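Impressions

Today I tested the model I trained myself and compared it with YOLOv2. Everything SSD detected was correct, whereas the YOLOv2 version was less accurate. However, SSD missed many objects, so its recall is fairly low.

Note that SSD-Tensorflow expects Python 3; running it under Python 2 causes problems. For installing tensorflow-gpu and opencv, see my blog post "SSD Environment Setup".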
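To make the two terms concrete: precision is the share of detections that are correct, and recall is the share of ground-truth objects that were found. The sketch below is only an illustration; the counts are made-up numbers, not measurements from my models.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) computed from detection counts."""
    detected = true_pos + false_pos   # everything the detector reported
    actual = true_pos + false_neg     # everything that was really there
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: every detection is correct, but many objects are missed.
p, r = precision_recall(true_pos=60, false_pos=0, false_neg=40)
print("precision=%.2f recall=%.2f" % (p, r))  # precision=1.00 recall=0.60
```

A detector that behaves like this never raises a false alarm but still overlooks a large share of the objects, which matches what I saw with SSD.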

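1 Building the dataset

The most tedious part is building the VOC dataset. I used my company's dataset generator to produce a large number of images, about 25,000 in total. Following the VOC layout, put the images in the JPEGImages directory and the XML annotation files in the Annotations directory, then use a script to generate the four files train.txt, test.txt, trainval.txt, and val.txt. The code that generates these txt files is as follows: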

import os
import random

xmlfilepath = r'/home/whsyxt/Downloads/SSD-Tensorflow/VOC2007/Annotations'
saveBasePath = r"/home/whsyxt/Downloads/SSD-Tensorflow"

trainval_percent = 0.8  # fraction of all samples used for train + val
train_percent = 0.7     # fraction of trainval used for train

total_xml = os.listdir(xmlfilepath)
num = len(total_xml)
indices = range(num)    # renamed from "list" to avoid shadowing the builtin
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))
trainval = set(trainval)  # sets make the membership tests below fast

print("train and val size", tv)
print("train size", tr)

ftrainval = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/trainval.txt'), 'w')
ftest = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/test.txt'), 'w')
ftrain = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/train.txt'), 'w')
fval = open(os.path.join(saveBasePath, 'VOC2007/ImageSets/Main/val.txt'), 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the ".xml" extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

Readers can adapt this to their own needs.
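With the ratios used above, 0.8 × 0.7 = 56% of the samples end up in train, 24% in val, and 20% in test. The script draws a different split on every run, so here is a minimal sketch of a reproducible variant. The function name split_ids, the seed parameter, and the example ids are my own illustrative choices, not part of SSD-Tensorflow.

```python
import random


def split_ids(ids, trainval_percent=0.8, train_percent=0.7, seed=0):
    """Split sample ids into trainval/train/val/test with a fixed seed."""
    rng = random.Random(seed)   # fixed seed -> same split on every run
    ids = sorted(ids)           # os.listdir order is not guaranteed
    tv = int(len(ids) * trainval_percent)
    tr = int(tv * train_percent)
    trainval = rng.sample(ids, tv)
    train = set(rng.sample(trainval, tr))
    val = [i for i in trainval if i not in train]
    test = sorted(set(ids) - set(trainval))
    return {"trainval": trainval, "train": sorted(train), "val": val, "test": test}


# Hypothetical ids standing in for the XML file names without extension.
splits = split_ids(["img%05d" % n for n in range(1000)])
print({name: len(members) for name, members in splits.items()})
```

Each list can then be written to the matching file under VOC2007/ImageSets/Main, one id per line.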

2 Converting VOC to tfrecords

Once the VOC-format dataset is ready, it has to be converted to tfrecords before the program can use it. First, edit the source file datasets/pascalvoc_common.py. The change is simple: fill in your own classes and leave everything else alone. In my example below, I replaced the original 21 entries (20 classes plus background) with 3 entries (2 classes plus background):

"""
VOC_LABELS = {
    'none': (0, 'Background'),
    'aeroplane': (1, 'Vehicle'),
    'bicycle': (2, 'Vehicle'),
    'bird': (3, 'Animal'),
    'boat': (4, 'Vehicle'),
    'bottle': (5, 'Indoor'),
    'bus': (6, 'Vehicle'),
    'car': (7, 'Vehicle'),
    'cat': (8, 'Animal'),
    'chair': (9, 'Indoor'),
    'cow': (10, 'Animal'),
    'diningtable': (11, 'Indoor'),
    'dog': (12, 'Animal'),
    'horse': (13, 'Animal'),
    'motorbike': (14, 'Vehicle'),
    'person': (15, 'Person'),
    'pottedplant': (16, 'Indoor'),
    'sheep': (17, 'Animal'),
    'sofa': (18, 'Indoor'),
    'train': (19, 'Vehicle'),
    'tvmonitor': (20, 'Indoor'),
}
"""
VOC_LABELS = {
    'none': (0, 'Background'),
    'person': (1, 'Person'),
    'car': (2, 'Car'),
}

That is all that needs to change.

Next, switch to the SSD-Tensorflow directory and run the tfrecords conversion. The command I used is:
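Before converting, it can help to confirm that every class name in the annotations has an entry in VOC_LABELS, since a name missing from the dictionary would have no label to map to. The sketch below is a minimal check that assumes the standard VOC XML layout, where each object element contains a name element. The helper names count_object_names and unknown_names are my own and are not part of SSD-Tensorflow.

```python
import collections
import os
import xml.etree.ElementTree as ET


def count_object_names(annotation_dir):
    """Count <object><name> values across all VOC XML files in a directory."""
    counts = collections.Counter()
    for fname in sorted(os.listdir(annotation_dir)):
        if not fname.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(annotation_dir, fname)).getroot()
        for obj in root.iter("object"):
            counts[obj.findtext("name", default="").strip()] += 1
    return counts


def unknown_names(counts, labels):
    """Return class names found in the annotations but missing from labels."""
    return sorted(name for name in counts if name not in labels)


# Usage, assuming the VOC2007/Annotations layout from section 1:
# labels = {"none": (0, "Background"), "person": (1, "Person"), "car": (2, "Car")}
# print(unknown_names(count_object_names("VOC2007/Annotations"), labels))
```

An empty list means every annotated class is covered; the counts also give a quick view of how balanced the classes are.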

DATASET_DIR=VOC2007/
OUTPUT_DIR=tfrecords/
python3 tf_convert_data.py \
    --dataset_name=pascalvoc \
    --dataset_dir=${DATASET_DIR} \
    --output_name=voc_2007_train \
    --output_dir=${OUTPUT_DIR}

3 Training

DATASET_DIR=tfrecords
TRAIN_DIR=logs/
CHECKPOINT_PATH=./checkpoints/ssd_300_vgg.ckpt
python3 train_ssd_network.py \
    --train_dir=${TRAIN_DIR} \
    --dataset_dir=${DATASET_DIR} \
    --dataset_name=pascalvoc_2007 \
    --dataset_split_name=train \
    --model_name=ssd_300_vgg \
    --checkpoint_path=${CHECKPOINT_PATH} \
    --save_summaries_secs=60 \
    --save_interval_secs=600 \
    --weight_decay=0.0005 \
    --optimizer=adam \
    --learning_rate=0.001 \
    --batch_size=16

4 Prediction

Command:

python3 video_demo.py

Code:

# coding=utf-8
import os
import math
import random
import sys

import numpy as np
import tensorflow as tf
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

slim = tf.contrib.slim

sys.path.append('../')
from nets import ssd_vgg_300, ssd_common, np_methods
from preprocessing import ssd_vgg_preprocessing
from notebooks import visualization

# TensorFlow session: grow GPU memory as needed instead of taking all of it.
gpu_options = tf.GPUOptions(allow_growth=True)
config = tf.ConfigProto(log_device_placement=False, gpu_options=gpu_options)
isess = tf.InteractiveSession(config=config)

# Input placeholder.
net_shape = (300, 300)
data_format = 'NHWC'
img_input = tf.placeholder(tf.uint8, shape=(None, None, 3))

# Evaluation pre-processing: resize to SSD net shape.
image_pre, labels_pre, bboxes_pre, bbox_img = ssd_vgg_preprocessing.preprocess_for_eval(
    img_input, None, None, net_shape, data_format,
    resize=ssd_vgg_preprocessing.Resize.WARP_RESIZE)
image_4d = tf.expand_dims(image_pre, 0)

# Define the SSD model.
reuse = True if 'ssd_net' in locals() else None
ssd_net = ssd_vgg_300.SSDNet()
with slim.arg_scope(ssd_net.arg_scope(data_format=data_format)):
    predictions, localisations, _, _ = ssd_net.net(image_4d, is_training=False, reuse=reuse)

# Restore SSD model.
ckpt_filename = 'finetune_log/model.ckpt-41278'  # change to your model path
# ckpt_filename = 'checkpoints/ssd_300_vgg.ckpt'
isess.run(tf.global_variables_initializer())
saver = tf.train.Saver()
saver.restore(isess, ckpt_filename)

# SSD default anchor boxes.
ssd_anchors = ssd_net.anchors(net_shape)


# Main image processing routine.
def process_image(img, select_threshold=0.5, nms_threshold=.45, net_shape=(300, 300)):
    # Run SSD network.
    rimg, rpredictions, rlocalisations, rbbox_img = isess.run(
        [image_4d, predictions, localisations, bbox_img],
        feed_dict={img_input: img})

    # Get classes and bboxes from the net outputs.
    rclasses, rscores, rbboxes = np_methods.ssd_bboxes_select(
        rpredictions, rlocalisations, ssd_anchors,
        select_threshold=select_threshold, img_shape=net_shape,
        num_classes=21, decode=True)

    rbboxes = np_methods.bboxes_clip(rbbox_img, rbboxes)
    rclasses, rscores, rbboxes = np_methods.bboxes_sort(rclasses, rscores, rbboxes, top_k=400)
    rclasses, rscores, rbboxes = np_methods.bboxes_nms(rclasses, rscores, rbboxes,
                                                       nms_threshold=nms_threshold)
    # Resize bboxes to original image shape. Note: useless for Resize.WARP!
    rbboxes = np_methods.bboxes_resize(rbbox_img, rbboxes)
    return rclasses, rscores, rbboxes


def bboxes_draw_on_img(img, classes, scores, bboxes, color=[255, 0, 0], thickness=2):
    shape = img.shape
    for i in range(bboxes.shape[0]):
        bbox = bboxes[i]
        # color = colors[classes[i]]
        # Draw bounding box. Points are (y, x) here and reversed to (x, y) for OpenCV.
        p1 = (int(bbox[0] * shape[0]), int(bbox[1] * shape[1]))
        p2 = (int(bbox[2] * shape[0]), int(bbox[3] * shape[1]))
        cv2.rectangle(img, p1[::-1], p2[::-1], color, thickness)
        # Draw text.
        s = '%s/%.3f' % (classes[i], scores[i])
        p1 = (p1[0] - 5, p1[1])
        cv2.putText(img, s, p1[::-1], cv2.FONT_HERSHEY_DUPLEX, 0.4, color, 1)


cap = cv2.VideoCapture("DJI_0008.MOV")  # change to your video path
# cap = cv2.VideoCapture(0)

# Define the codec and create the VideoWriter object.
# fourcc = cv2.cv.FOURCC(*'XVID')
fourcc = cv2.VideoWriter_fourcc(*'XVID')
# The frame size given here must match the size of the frames being written.
out = cv2.VideoWriter('output1.avi', fourcc, 20, (1280, 720))

num = 0
while cap.isOpened():
    # Get a frame.
    rval, frame = cap.read()
    if rval:
        # frame = cv2.flip(frame, 0)
        rclasses, rscores, rbboxes = process_image(frame)
        bboxes_draw_on_img(frame, rclasses, rscores, rbboxes)
        print(rclasses)
        # Save the frame.
        out.write(frame)
        num = num + 1
        print(num)
    else:
        break
    # Show the frame.
    cv2.imshow("capture", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()
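The drawing function above multiplies bbox[0] and bbox[2] by the image height and bbox[1] and bbox[3] by the width, then reverses each point, because the boxes are normalized (ymin, xmin, ymax, xmax) while OpenCV expects (x, y) pixel coordinates. The conversion can be sketched on its own as follows; bbox_to_pixels is a hypothetical helper name and the example box is made up.

```python
def bbox_to_pixels(bbox, img_height, img_width):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to OpenCV corners.

    Returns ((x1, y1), (x2, y2)) in integer pixel coordinates.
    """
    ymin, xmin, ymax, xmax = bbox
    top_left = (int(xmin * img_width), int(ymin * img_height))
    bottom_right = (int(xmax * img_width), int(ymax * img_height))
    return top_left, bottom_right


# Hypothetical box covering the central quarter of a 1280x720 frame.
print(bbox_to_pixels((0.25, 0.25, 0.75, 0.75), img_height=720, img_width=1280))
# ((320, 180), (960, 540))
```

The two returned points can be passed directly to cv2.rectangle as its corner arguments.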

References

[1] SSD-Tensorflow. https://github.com/balancap/SSD-Tensorflow
