Training YOLOv3 on Your Own Dataset
I have been doing object-detection work recently. I first trained Faster R-CNN, but the results were underwhelming; even after switching to a ResNet-101 feature extractor the mAP only reached 30+, so I decided to try a different model and went with YOLOv3.
YOLOv3 training uses its own label format, so I took the VOC-format dataset I built last time and converted it for training: https://blog.csdn.net/qq_36852276/article/details/100154097
Recording the process here. Reference links:
https://blog.csdn.net/helloworld1213800/article/details/79749359
https://blog.csdn.net/weixin_41813620/article/details/92799338
https://blog.csdn.net/xuanlang39/article/details/88642010
Common problems and annotated configs:
https://blog.csdn.net/maweifei/article/details/81137563
Notes
Write the absolute paths of all files in a folder to a txt file:
ls -R /home/datalab/work/datasets/test_7pilang/*.jpg > file.txt
Write just the file names in the current folder (no absolute paths) to a txt file:
ls -R *.jpg > file.txt
Workflow for training on your own dataset
Download the darknet project
git clone https://github.com/pjreddie/darknet
cd darknet

Modify the Makefile
Three things to change: first, enable GPU, CUDNN, and OPENCV; second, set the compute capability (ARCH) to match your own GPU; third, point the CUDA paths at your own installation.

GPU=1
CUDNN=1
OPENCV=1
OPENMP=0
DEBUG=0

ARCH= -gencode arch=compute_61,code=sm_61
# -gencode arch=compute_35,code=sm_35 \
# -gencode arch=compute_50,code=[sm_50,compute_50] \
# -gencode arch=compute_52,code=[sm_52,compute_52]
# -gencode arch=compute_20,code=[sm_20,sm_21] \ This one is deprecated?
# This is what I use, uncomment if you know your arch and want to specify
# ARCH= -gencode arch=compute_52,code=compute_52

ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif
 
 
Once the edits are done, compile:
make
Prepare your own dataset
I annotate with labelme, convert the generated json files to xml, and then run data augmentation; my own script is here:
 https://blog.csdn.net/qq_36852276/article/details/102539858
Once you have the images and labels, arrange them in VOC dataset format; for convenience I put everything under the scripts folder.
The folder structure follows the standard VOC layout that scripts/voc_label.py expects:

VOCdevkit
└── VOC2007
    ├── Annotations    (xml labels)
    ├── ImageSets
    │   └── Main
    └── JPEGImages     (images)
Under Main there are four files: train.txt, trainval.txt, val.txt, and test.txt.
Each lists the image file names without extensions.
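For example, with images named numerically (a hypothetical naming; use whatever names your files actually have), Main/train.txt would simply read:

000001
000002
000003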
Next, use the project's ./scripts/voc_label.py to generate the files YOLO needs.
Part of it must be edited to fit your own data; there are three places in total (the commented-out lines are the originals). I only need to detect one class, boat. The three places are:
sets
classes
os.system
 
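A sketch of the three edits, assuming a single class boat and only the 2007-style split (the commented-out lines paraphrase the stock script; adapt the year tags to your own folder names):

# sets = [('2012', 'train'), ('2012', 'val'), ('2007', 'train'), ('2007', 'val'), ('2007', 'test')]
sets = [('2007', 'train'), ('2007', 'val'), ('2007', 'test')]

# classes = ["aeroplane", "bicycle", ..., "tvmonitor"]   # the 20 stock VOC classes
classes = ["boat"]

# near the bottom of the script, drop the 2012 splits from the concatenation:
# os.system("cat 2007_train.txt 2007_val.txt 2012_*.txt > train.txt")
os.system("cat 2007_train.txt 2007_val.txt > train.txt")

The script converts each xml annotation into one txt label per image in YOLO format: class_index x_center y_center width height, all normalized to [0, 1].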
Prepare the pretrained weights
They can be downloaded from the official site:
wget https://pjreddie.com/media/files/darknet53.conv.74
For convenience I put the file under the model folder.
Modify the config file cfg/voc.data
This file must be adapted to your own data:
classes is the number of classes
train and valid point to the generated txt files that list absolute image paths for YOLO
names points to the txt file that lists the class names
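As a sketch for the single-class setup here (the paths are placeholders for wherever voc_label.py wrote its output):

classes = 1
train   = /home/xxx/darknet/scripts/2007_train.txt
valid   = /home/xxx/darknet/scripts/2007_test.txt
names   = data/voc.names
backup  = backup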
Modify the config file data/voc.names
Change it to your own label names, one per line.
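For the single boat class used in this post, the whole file is just one line:

boat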
 
Modify the config file cfg/yolov3-voc.cfg
This holds all the model settings; the main items are annotated below:

[net]
# Testing              ### test mode
# batch=1
# subdivisions=1
# Training             ### training mode; images per forward pass = batch/subdivisions
batch=64
subdivisions=16
width=416              ### network input width
height=416             ### network input height
channels=3             ### network input channels
momentum=0.9           ### momentum
decay=0.0005           ### weight decay
angle=0
saturation = 1.5       ### saturation augmentation
exposure = 1.5         ### exposure augmentation
hue=.1                 ### hue augmentation
learning_rate=0.001    ### learning rate
burn_in=1000           ### learning-rate warm-up iterations
max_batches = 5000     ### total training iterations
policy=steps           ### learning-rate policy
steps=40000,45000      ### iterations at which the rate changes
scales=.1,.1           ### factors applied at each step

[convolutional]
batch_normalize=1      ### batch normalization
filters=32             ### number of kernels
size=3                 ### kernel size
stride=1               ### stride
pad=1                  ### padding
activation=leaky       ### activation function

This file needs 6 edits in total.
In fact the same structure appears 3 times, and each occurrence needs 2 edits; searching the file for the keyword yolo will find all three spots.
It is the filters and classes values in that structure that need changing; a sketch of one edited occurrence follows.
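Here is a sketch of one of the three occurrences after editing for a single class (the other keys keep their stock values):

[convolutional]
size=1
stride=1
pad=1
# filters = 3 * (classes + 5); for 1 class: 3 * (1 + 5) = 18
filters=18
activation=linear

[yolo]
mask = 6,7,8
anchors = 10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
# classes = your number of classes
classes=1
num=9
jitter=.3
ignore_thresh = .5
truth_thresh = 1
random=1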
Beyond these, a few other key settings deserve attention: note that in the annotated [net] block above max_batches is 5000 while steps is still the stock 40000,45000, so the scheduled learning-rate drops would never fire; steps is normally set somewhat below max_batches.
Start training

# the three paths are the edited dataset file, the model config file, and the pretrained weights
./darknet detector train cfg/voc.data cfg/yolov3-voc.cfg ./model/darknet53.conv.74 -gpus 0

Testing the trained model
Method 1

# this one defaults to coco.names, i.e. 80 classes
./darknet detect cfg/yolov3-voc.cfg backup/yolov3-voc_final.weights data/dog.jpg

Method 2

./darknet detector test cfg/voc.data cfg/yolov3-voc.cfg backup/yolov3-voc_final.weights xxx/xxx.jpg

Calling the Python interface
The darknet project has a python folder, and the darknet.py inside it is the official Python interface. Move darknet.py into the darknet project root, then modify it as needed. Note that its results come back in (x, y, w, h) format: the box center coordinates plus width and height.
Below is an example of how I call it directly.
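A minimal sketch of calling the stock interface (paths are placeholders; note that they must be byte strings under Python 3):

import darknet as dn

net = dn.load_net(b"cfg/yolov3-voc.cfg", b"backup/yolov3-voc_final.weights", 0)
meta = dn.load_meta(b"cfg/voc.data")
res = dn.detect(net, meta, b"data/dog.jpg")
# each entry is (class_name, confidence, (x_center, y_center, w, h))
for name, conf, (x, y, w, h) in res:
    print(name, conf, x, y, w, h)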
Using the Python interface to process video and images
The stock interface only takes a file path as input, i.e. it can only test single offline images, which is inconvenient. The source can be modified so that the interface takes an image array instead of a path.
See this blog for the change; I tested it and it works:
https://blog.csdn.net/phinoo/article/details/83009061
Once modified you can do whatever you like. Below is an example of mine that reads an RTSP network video stream and runs detection on it.
from ctypes import *
import math
import random
import cv2
import os
import sys
import threading

# shared state between the two threads
count = 0
flag = 0

def sample(probs):
    s = sum(probs)
    probs = [a/s for a in probs]
    r = random.uniform(0, 1)
    for i in range(len(probs)):
        r = r - probs[i]
        if r <= 0:
            return i
    return len(probs)-1

def c_array(ctype, values):
    arr = (ctype*len(values))()
    arr[:] = values
    return arr

class BOX(Structure):
    _fields_ = [("x", c_float),
                ("y", c_float),
                ("w", c_float),
                ("h", c_float)]

class DETECTION(Structure):
    _fields_ = [("bbox", BOX),
                ("classes", c_int),
                ("prob", POINTER(c_float)),
                ("mask", POINTER(c_float)),
                ("objectness", c_float),
                ("sort_class", c_int)]

class IMAGE(Structure):
    _fields_ = [("w", c_int),
                ("h", c_int),
                ("c", c_int),
                ("data", POINTER(c_float))]

class METADATA(Structure):
    _fields_ = [("classes", c_int),
                ("names", POINTER(c_char_p))]

# lib = CDLL("/home/pjreddie/documents/darknet/libdarknet.so", RTLD_GLOBAL)
lib = CDLL("/home/xxx/darknet_1/darknet/libdarknet.so", RTLD_GLOBAL)
lib.network_width.argtypes = [c_void_p]
lib.network_width.restype = c_int
lib.network_height.argtypes = [c_void_p]
lib.network_height.restype = c_int

# ndarray_to_image is the function added to the darknet source (see the blog linked above)
ndarray_image = lib.ndarray_to_image
ndarray_image.argtypes = [POINTER(c_ubyte), POINTER(c_long), POINTER(c_long)]
ndarray_image.restype = IMAGE

predict = lib.network_predict
predict.argtypes = [c_void_p, POINTER(c_float)]
predict.restype = POINTER(c_float)

set_gpu = lib.cuda_set_device
set_gpu.argtypes = [c_int]

make_image = lib.make_image
make_image.argtypes = [c_int, c_int, c_int]
make_image.restype = IMAGE

get_network_boxes = lib.get_network_boxes
get_network_boxes.argtypes = [c_void_p, c_int, c_int, c_float, c_float, POINTER(c_int), c_int, POINTER(c_int)]
get_network_boxes.restype = POINTER(DETECTION)

make_network_boxes = lib.make_network_boxes
make_network_boxes.argtypes = [c_void_p]
make_network_boxes.restype = POINTER(DETECTION)

free_detections = lib.free_detections
free_detections.argtypes = [POINTER(DETECTION), c_int]

free_ptrs = lib.free_ptrs
free_ptrs.argtypes = [POINTER(c_void_p), c_int]

network_predict = lib.network_predict
network_predict.argtypes = [c_void_p, POINTER(c_float)]

reset_rnn = lib.reset_rnn
reset_rnn.argtypes = [c_void_p]

load_net = lib.load_network
load_net.argtypes = [c_char_p, c_char_p, c_int]
load_net.restype = c_void_p

do_nms_obj = lib.do_nms_obj
do_nms_obj.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

do_nms_sort = lib.do_nms_sort
do_nms_sort.argtypes = [POINTER(DETECTION), c_int, c_int, c_float]

free_image = lib.free_image
free_image.argtypes = [IMAGE]

letterbox_image = lib.letterbox_image
letterbox_image.argtypes = [IMAGE, c_int, c_int]
letterbox_image.restype = IMAGE

load_meta = lib.get_metadata
lib.get_metadata.argtypes = [c_char_p]
lib.get_metadata.restype = METADATA

load_image = lib.load_image_color
load_image.argtypes = [c_char_p, c_int, c_int]
load_image.restype = IMAGE

rgbgr_image = lib.rgbgr_image
rgbgr_image.argtypes = [IMAGE]

predict_image = lib.network_predict_image
predict_image.argtypes = [c_void_p, IMAGE]
predict_image.restype = POINTER(c_float)

def nparray_to_image(img):
    # wrap a numpy array (H, W, C, uint8) as a darknet IMAGE without going through disk
    data = img.ctypes.data_as(POINTER(c_ubyte))
    image = ndarray_image(data, img.ctypes.shape, img.ctypes.strides)
    return image

def classify(net, meta, im):
    out = predict_image(net, im)
    res = []
    for i in range(meta.classes):
        res.append((meta.names[i], out[i]))
    res = sorted(res, key=lambda x: -x[1])
    return res

def detect(net, meta, im, thresh=.5, hier_thresh=.5, nms=.45):
    num = c_int(0)
    pnum = pointer(num)
    predict_image(net, im)
    dets = get_network_boxes(net, im.w, im.h, thresh, hier_thresh, None, 0, pnum)
    num = pnum[0]
    if nms:
        do_nms_obj(dets, num, meta.classes, nms)
    res = []
    for j in range(num):
        for i in range(meta.classes):
            if dets[j].prob[i] > 0:
                b = dets[j].bbox
                res.append((meta.names[i], dets[j].prob[i], (b.x, b.y, b.w, b.h)))
    res = sorted(res, key=lambda x: -x[1])
    free_image(im)
    free_detections(dets, num)
    return res

def deal_img():
    # detection/display thread: always works on the latest frame grabbed by change_arr
    global arr
    global count
    global net
    global meta
    global flag
    # global video_writer
    while True:
        arr_temp = arr
        if flag == 1:
            # while saving is enabled, dump every 3rd frame to disk
            if count % 3 == 0:
                cv2.imwrite('/home/xxx/wurenting/dataset_10_21/' + str(int(count/3)) + '.jpg', arr_temp)
            count += 1
        img = nparray_to_image(arr_temp)
        r = detect(net, meta, img, thresh=.6, nms=.3)
        # draw the bounding boxes
        weight_img = arr_temp.shape[1]   # image width
        height_img = arr_temp.shape[0]   # image height
        for r_res in r:
            cls = r_res[0]
            score = r_res[1]
            # convert (center x, center y, w, h) to corner coordinates
            x = r_res[2][0]
            y = r_res[2][1]
            w = r_res[2][2]
            h = r_res[2][3]
            x_left_top = float("%.2f" % (x - w/2))
            y_left_top = float("%.2f" % (y - h/2))
            x_right_bottom = float("%.2f" % (x + w/2))
            y_right_bottom = float("%.2f" % (y + h/2))
            # clamp the box to the image boundary
            if x_left_top > weight_img: x_left_top = weight_img
            if x_right_bottom > weight_img: x_right_bottom = weight_img
            if y_left_top > height_img: y_left_top = height_img
            if y_right_bottom > height_img: y_right_bottom = height_img
            if x_left_top < 0: x_left_top = 0
            if x_right_bottom < 0: x_right_bottom = 0
            if y_left_top < 0: y_left_top = 0
            if y_right_bottom < 0: y_right_bottom = 0
            bbox = (x_left_top, y_left_top, x_right_bottom, y_right_bottom)
            # with open('/home/xxx/guangdong/test_b_yolo.txt', 'a') as xxx_obj:
            #     xxx_obj.write(filename + '_' + cls.decode('utf-8') + '_' + str(bbox) + '_' + str(score) + '_\n')
            cv2.rectangle(arr_temp, (int(x_left_top), int(y_left_top)),
                          (int(x_right_bottom), int(y_right_bottom)), (0, 204, 0), 2)
            # cls is a bytes object like b'boat'; str(cls)[2:-1] strips the b'' wrapper
            cv2.putText(arr_temp, '%s: %.3f' % (str(cls)[2:-1], score),
                        (int(x_left_top), int(y_left_top) + 15),
                        cv2.FONT_HERSHEY_PLAIN, 1.0, (0, 0, 255), thickness=2)
        cv2.namedWindow("res", cv2.WINDOW_NORMAL)
        cv2.imshow("res", arr_temp)
        # video_writer.write(arr_temp)
        key = cv2.waitKey(1)   # read the key once so a press is not swallowed by a second waitKey
        if key == ord('k'):
            flag = 1           # 'k' starts saving frames
        elif key == ord('q'):
            flag = 0           # 'q' stops saving frames
        # video_writer.release()
        # print(r)

def change_arr():
    # capture thread: keeps overwriting the shared frame with the newest one from the stream
    global count
    global arr
    global vid
    while True:
        return_value, arr = vid.read()
        # cv2.imwrite('/home/xxx/wurenting/dataset_10_17/' + str(count) + '.jpg', arr_temp)
        # count += 1

if __name__ == "__main__":
    # net = load_net("cfg/densenet201.cfg", "/home/pjreddie/trained/densenet201.weights", 0)
    # im = load_image("data/wolf.jpg", 0, 0)
    # meta = load_meta("cfg/imagenet1k.data")
    # r = classify(net, meta, im)
    # print(r[:10])

    # load the official weights
    # net = load_net(b"cfg/yolov3.cfg", b"yolov3.weights", 0)
    # meta = load_meta(b"cfg/coco.data")

    # load the boat-detection weights
    net = load_net(b"/home/xxx/darknet_2/darknet/cfg/yolov3-voc.cfg",
                   b"/home/xxx/darknet_2/darknet/backup/yolov3-voc_final.weights", 0)
    meta = load_meta(b"/home/xxx/darknet_2/darknet/cfg/voc.data")
    vid = cv2.VideoCapture("rtsp://admin:12345@192.168.1.113:554/")
    # vid = cv2.VideoCapture("rtsp://127.0.0.1:8554/test")
    # vid = cv2.VideoCapture("rtsp://192.168.1.146:8553/PSIA/Streaming/channels/0?videoCodecType=H.264")
    # vid = cv2.VideoCapture("/home/xxx/wurenting/2019_10_16_16_07_IMG_1522.mp4")
    return_value, arr = vid.read()
    # video_writer = cv2.VideoWriter('/home/xxx/wurenting/video_result_10_21/result.avi',
    #                                cv2.VideoWriter_fourcc('M','J','P','G'), 30, (1920, 1080))
    thread_1 = threading.Thread(target=deal_img)    # thread 1: detection and display
    thread_2 = threading.Thread(target=change_arr)  # thread 2: frame grabbing
    thread_1.start()
    thread_2.start()

For other issues, such as changing the iteration interval at which weights are saved, see the common-problems blog linked at the top.
Code for cropping objects out of images using existing json labels
import os
import json
import cv2

json_dir = '/home/xbw/wurenting/train_buoy_1117/red/json/'
img_dir = '/home/xbw/wurenting/train_buoy_1117/red/img/'
json_list = os.listdir(json_dir)
# file names are assumed to be numeric, e.g. 12.json
json_list.sort(key=lambda x: int(x[:-5]))
for image_name in json_list:
    print(image_name)
    img_source = cv2.imread(img_dir + image_name[:-5] + '.jpg')
    with open(json_dir + image_name) as obj:
        nums = json.load(obj)
    labels = []
    bboxes = []
    # each labelme shape holds two corner points; normalize them to (xmin, ymin, xmax, ymax)
    for i in nums['shapes']:
        labels.append(i['label'])
        bboxes.append([min(i['points'][0][0], i['points'][1][0]),
                       min(i['points'][0][1], i['points'][1][1]),
                       max(i['points'][0][0], i['points'][1][0]),
                       max(i['points'][0][1], i['points'][1][1])])
    bbox = bboxes[0]   # crop only the first labelled object in each image
    crop = img_source[int(bbox[1]):int(bbox[3]), int(bbox[0]):int(bbox[2])]
    cv2.imwrite("/home/xbw/wurenting/train_buoy_1117/red/crop/" + image_name[:-5] + '.jpg', crop)
    cv2.imshow("res", crop)
    cv2.waitKey(1)