yolov3/tiny-yolov3 training and testing (python2/3)
yolov3
Dataset preparation
Label the data with the labelimg tool (VOC format).
Convert the labeled xml files to txt with the scripts below (python2.7).
1, Collect all image names

```
import os

# Write one image file name per line into ImageID.txt
dirlist = os.listdir("/home/room/mxj_workspace/data/voc_clock/train_img/")
fp = open("ImageID.txt", "w")
for name in dirlist:
    fp.write(name)
    fp.write("\n")
fp.close()
```

2, Extract the coordinates and labels from each xml and write them to txt. Adjust the label names and paths, create the ImageID folder beforehand, and finally copy the generated txt files into train_img.

```
import os
import xml.etree.ElementTree as ET

# Label names; the index in this list becomes the class id in the txt files.
# classes = ["black_watch", "box_watch", "white_watch", "light_watch", "square_watch", "sign", "IO_State"]
classes = ["s_box"]


def convert(size, box):
    # size = (width, height); box = (xmin, xmax, ymin, ymax) in pixels.
    # Returns (x_center, y_center, w, h) normalized by the image size.
    dw = 1. / (size[0])
    dh = 1. / (size[1])
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)


def convert_annotation(image_id):
    in_file = open('/home/mahxn0/darknet/box/train_xml/%s.xml' % (image_id))
    out_file = open('/home/mahxn0/darknet/box/ImageID/%s.txt' % (image_id), 'w')
    tree = ET.parse(in_file)
    root = tree.getroot()
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    for obj in root.iter('object'):
        cls = obj.find('name').text
        if cls not in classes:
            continue  # skip labels that are not in the class list
        cls_id = classes.index(cls)
        xmlbox = obj.find('bndbox')
        b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
             float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
        bb = convert((w, h), b)
        out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')
    # Close both files so every label file is flushed to disk
    in_file.close()
    out_file.close()


if not os.path.exists('/home/mahxn0/darknet/box/img_file'):
    os.makedirs('/home/mahxn0/darknet/box/img_file/')

image_ids = open('/home/mahxn0/darknet/box/ImageID.txt').read().strip().split()
listtr_file = open('/home/mahxn0/darknet/box/train.list', 'w')
listte_file = open('/home/mahxn0/darknet/box/val.list', 'w')
i = 0
for image_id in image_ids:
    i = i + 1
    # Every 10th image goes to the validation list, the rest to the training list
    if i % 10 == 0:
        listte_file.write('/home/mahxn0/darknet/box/train_img/%s.jpg\n' % (image_id.split('.')[0]))
    else:
        listtr_file.write('/home/mahxn0/darknet/box/train_img/%s.jpg\n' % (image_id.split('.')[0]))
    convert_annotation(image_id.split('.')[0])
listte_file.close()
listtr_file.close()
```

- Modify the cfg file. Key point: 3*(classes+5)
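- Find the three places in the cfg file where classes is set, change classes to your number of detection classes, and change filters in the layer just above each one to 3*(classes+5)
- In cfg/coco.data, change the class count to your own number of detection classes, point train.list and val.list at the files generated by label.py above, put your own label names in coco.names, and set backup to the location where models are saved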
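As a quick check of the 3*(classes+5) rule: each detection scale in yolov3 uses 3 anchors, and every anchor predicts 4 box coordinates, 1 objectness score, and one score per class. The sketch below only does that arithmetic; the function name `yolo_filters` is made up for illustration and is not part of darknet.

```python
def yolo_filters(num_classes, anchors_per_scale=3):
    """Filters for the convolutional layer just above each [yolo] layer."""
    # 4 box coordinates + 1 objectness score + one score per class, per anchor
    return anchors_per_scale * (num_classes + 5)


# One class (e.g. the single "s_box" label used above) and the 80 COCO classes
print(yolo_filters(1))
print(yolo_filters(80))
```

With the single-class setup from the conversion script, this gives the value to write into the three filters entries.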
Start training:
```
./darknet detector train cfg/coco.data cfg/yolov3.cfg darknet53.conv.74 -gpus 0,1,2,3

Region 23 Avg IOU: 0.331109, Class: 0.552714, Obj: 0.017880, No Obj: 0.021078, .5R: 0.129032, .75R: 0.000000, count: 62
219: 5.798628, 26.150927 avg loss, 0.000007 rate, 1.180564 seconds, 42048 images
Loaded: 12.885740 seconds
Region 16 Avg IOU: 0.210043, Class: 0.500716, Obj: 0.037469, No Obj: 0.031145, .5R: 0.000000, .75R: 0.000000, count: 3
Region 16 Avg IOU: 0.302149, Class: 0.318319, Obj: 0.086097, No Obj: 0.030979, .5R: 0.000000, .75R: 0.000000, count: 1
Region 16 Avg IOU: 0.203812, Class: 0.335673, Obj: 0.063994, No Obj: 0.031331, .5R: 0.000000, .75R: 0.000000, count: 1
Region 23 Avg IOU: 0.312156, Class: 0.556277, Obj: 0.012325, No Obj: 0.019171, .5R: 0.120000, .75R: 0.000000, count: 50
Region 23 Avg IOU: 0.373455, Class: 0.508114, Obj: 0.015595, No Obj: 0.019038, .5R: 0.203390, .75R: 0.000000, count: 59
Region 23 Avg IOU: 0.344760, Class: 0.490172, Obj: 0.013907, No Obj: 0.019223, .5R: 0.187500, .75R: 0.000000, count: 48
Region 16 Avg IOU: 0.454259, Class: 0.426787, Obj: 0.027839, No Obj: 0.031548, .5R: 0.000000, .75R: 0.000000, count: 1
Region 16 Avg IOU: 0.366378, Class: 0.445379, Obj: 0.043471, No Obj: 0.030944, .5R: 0.000000, .75R: 0.000000, count: 2
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.030927, .5R: -nan, .75R: -nan, count: 0
Region 23 Avg IOU: 0.362018, Class: 0.513913, Obj: 0.014860, No Obj: 0.019196, .5R: 0.224138, .75R: 0.000000, count: 58
Region 23 Avg IOU: 0.278272, Class: 0.531918, Obj: 0.013913, No Obj: 0.019277, .5R: 0.065217, .75R: 0.000000, count: 46
Region 23 Avg IOU: 0.322512, Class: 0.549836, Obj: 0.016681, No Obj: 0.019718, .5R: 0.102564, .75R: 0.000000, count: 39
```

tiny-yolov3
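Get the pretrained model
- Download yolov3-tiny.weights from the official site
- In theory there is no fixed answer for how many layers of features to extract; here we extract the first 15 layers to use as the pretrained model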
Training:
```
./darknet detector train cfg/coco.data cfg/yolov3-tiny.cfg yolov3-tiny.conv.15 15
```

Troubleshooting:
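1> If running several models fails with out of memory, set batch and subdivisions in the cfg to 1
2> Decoding for a Hikvision camera: "rtspsrc location=rtsp://admin:123qweasd@192.168.0.222:554/h264/ch1/main/av_stream latency=200 ! rtph264depay ! h264parse ! omxh264dec ! videoconvert ! appsink sync=false"
3> nan values at the start of training are normal. If everything is nan, the txt generated from the xml is wrong or a label name is wrong; check coco.names and confirm that the generated txt files are all correct
4> With 2000 samples, 2000-5000 iterations works best; with 10000 samples, about 20000 iterations (this mainly depends on how the learning rate decays and how complex the data is)

Testing: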
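-c index: open the camera with the given index
-out_filename *.avi: save the result to a video file
-thresh: set the detection confidence threshold
-ext_output < /media/mahxn0/DATA/tool/state3.list > result1.txt: batch-test the accuracy on a list of images and print the results

python v2 call (already packaged as a python library)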
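- Note: the last c_int argument of get_network_boxes adjusts the accuracy of the boxes
- Set the model's batch and subdivisions to 1 when testing, otherwise the detection results will be inaccurate; the cause is still being investigated
- free_image must be enabled so that memory is released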
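Switching batch and subdivisions to 1 for testing can be scripted instead of edited by hand. The following is a minimal sketch that assumes the cfg uses plain `batch=N` and `subdivisions=N` lines; the helper name `set_test_batch` is hypothetical and not part of darknet.

```python
import re


def set_test_batch(cfg_text):
    """Return cfg_text with batch and subdivisions forced to 1."""
    # Only lines that start with the key are touched, so commented-out lines
    # ("# batch=1") and keys such as batch_normalize are left alone.
    cfg_text = re.sub(r"(?m)^([ \t]*batch[ \t]*=[ \t]*)\d+", r"\g<1>1", cfg_text)
    cfg_text = re.sub(r"(?m)^([ \t]*subdivisions[ \t]*=[ \t]*)\d+", r"\g<1>1", cfg_text)
    return cfg_text


sample_cfg = "[net]\nbatch=64\nsubdivisions=16\nwidth=416\n"
print(set_test_batch(sample_cfg))
```

The function works on the text of the file, so the caller decides whether to overwrite the training cfg or write a separate test copy.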
python v3 call:
```
from ctypes import *
import random

import cv2
import numpy as np


def sample(probs):
    s = sum(probs)
    probs = [a / s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs) - 1


def c_array(ctype, values):
    arr = (ctype * len(values))()
    arr[:] = values
    return arr


class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]


class DETECTION(Structure):
    _fields_ = [("bbox", BOX),
                ("classes", c_int),
                ("prob", POINTER(c_float)),
                ("mask", POINTER(c_float)),
                ("objectness", c_float),
                ("sort_class", c_int)]


class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]


class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]


# lib = CDLL("/home/pjreddie/documents/darknet/libdarknet.so", RTLD_GLOBAL)
lib = CDLL("/home/mahxn0/darknet/darknet.so", RTLD_GLOBAL)
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

set_gpu = lib.cuda_set_device
set_gpu.argtypes = [c_int]

make_image = lib.make_image
make_image.argtypes = [c_int, c_int, c_int]
make_image.restype = IMAGE

get_network_boxes = lib.get_network_boxes
get_network_boxes.argtypes = [c_void_p, c_int, c_int, c_float, c_float, POINTER(c_int), c_int, POINTER(c_int)]
get_network_boxes.restype = POINTER(DETECTION)

make_network_boxes = lib.make_network_boxes
make_network_boxes.argtypes = [c_void_p]
make_network_boxes.restype = POINTER(DETECTION)

free_detections = lib.free_detections
free_detections.argtypes = [POINTER(DETECTION), c_int]

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

network_predict = lib.network_predict
network_predict.argtypes = [c_void_p, POINTER(c_float)]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

do_nms_obj = lib.do_nms_obj
do_nms_obj.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

do_nms_sort = lib.do_nms_sort
do_nms_sort.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

rgbgr_image = lib.rgbgr_image
rgbgr_image.argtypes = [IMAGE]

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)

net = load_net(b"model/yolo_box/box.cfg", b"model/yolo_box/box.weights", 0)
meta = load_meta(b"model/yolo_box/box.data")


class yolo(object):
    def __init__(self):
        pass

    def convertBack(self, x, y, w, h):
        # Center-format box -> integer corner coordinates
        xmin = int(round(x - (w / 2)))
        xmax = int(round(x + (w / 2)))
        ymin = int(round(y - (h / 2)))
        ymax = int(round(y + (h / 2)))
        return xmin, ymin, xmax, ymax

    def array_to_image(self, arr):
        # need to return old values to avoid python freeing memory
        arr = arr.transpose(2, 0, 1)
        c, h, w = arr.shape[0:3]
        arr = np.ascontiguousarray(arr.flat, dtype=np.float32) / 255.0
        data = arr.ctypes.data_as(POINTER(c_float))
        im = IMAGE(w, h, c, data)
        return im, arr

    def detect(self, image, thresh=.5, hier_thresh=.5, nms=.45):
        im, image = self.array_to_image(image)
        rgbgr_image(im)
        num = c_int(0)
        pnum = pointer(num)
        predict_image(net, im)
        dets = get_network_boxes(net, im.w, im.h, thresh, hier_thresh, None, 0, pnum)
        num = pnum[0]
        if nms:
            do_nms_obj(dets, num, meta.classes, nms)
        res = []
        for j in range(num):
            a = dets[j].prob[0:meta.classes]
            if any(a):
                ai = np.array(a).nonzero()[0]
                for i in ai:
                    b = dets[j].bbox
                    # res.append((meta.names[i], dets[j].prob[i], (b.x, b.y, b.w, b.h)))
                    # Convert to corner coordinates and clip to the image
                    left = (b.x - b.w / 2)
                    right = (b.x + b.w / 2)
                    top = (b.y - b.h / 2)
                    bot = (b.y + b.h / 2)
                    if left < 0:
                        left = 0
                    if right > im.w - 1:
                        right = im.w - 1
                    if top < 0:
                        top = 0
                    if bot > im.h - 1:
                        bot = im.h - 1
                    res.append((meta.names[i], dets[j].prob[i], left, top, right, bot))
        res = sorted(res, key=lambda x: -x[1])
        if isinstance(image, bytes):
            free_image(im)
        free_detections(dets, num)
        return res


if __name__ == "__main__":
    # load video here
    cap = cv2.VideoCapture("board0.mp4")
    fps = cap.get(cv2.CAP_PROP_FPS)
    detector = yolo()
    print("Frames per second using video.get(cv2.CAP_PROP_FPS) : {0}".format(fps))
    cv2.namedWindow("img", cv2.WINDOW_NORMAL)
    while True:
        ret, img = cap.read()
        if not ret:
            break  # end of the video or a failed read
        # detect() returns (name, prob, left, top, right, bot) with corners
        # already clipped, so the values are used directly instead of being
        # indexed as a nested (x, y, w, h) tuple.
        for name, prob, left, top, right, bot in detector.detect(img):
            pt1 = (int(round(left)), int(round(top)))
            pt2 = (int(round(right)), int(round(bot)))
            cv2.rectangle(img, pt1, pt2, (0, 255, 0), 2)
            cv2.putText(img, name.decode() + " [" + str(round(prob * 100, 2)) + "]",
                        (pt1[0], pt1[1] + 20), cv2.FONT_HERSHEY_SIMPLEX, 1, [0, 255, 0], 4)
        cv2.imshow("img", img)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    cv2.destroyAllWindows()
```
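The corner conversion and clipping done inside detect can be tried on its own, without loading darknet. This is a minimal sketch of that one step; `clip_box` is a hypothetical helper name, and the 416x416 image size in the example call is only an assumption for illustration.

```python
def clip_box(x, y, w, h, img_w, img_h):
    """Convert a center-format box to (left, top, right, bot), clipped to the image."""
    left = max(x - w / 2.0, 0.0)
    top = max(y - h / 2.0, 0.0)
    right = min(x + w / 2.0, img_w - 1.0)
    bot = min(y + h / 2.0, img_h - 1.0)
    return left, top, right, bot


# A box centered near the top-left corner that spills outside the image
print(clip_box(10, 10, 40, 40, 416, 416))
```

Keeping the step as a pure function makes it easy to check boundary cases before drawing rectangles on real frames.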
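Author: Mahxn0
Link: https://www.jianshu.com/p/9c87e039c949
Source: Jianshu
Copyright on Jianshu belongs to the author. For reproduction in any form, contact the author for permission and cite the source.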