YOLO v3: Analysis and Implementation

Published: 2025/3/15

Preface

It is exam week again. Last year I planned to implement deep-learning object detection but never got to it because of various problems, so I am taking this chance to do it now.

YOLOv3 makes a number of improvements over YOLOv2 that yield better results. On 320×320 images, YOLOv3 runs in 22 ms at 28.2 mAP, which is as accurate as SSD but three times faster; the figure below shows the comparison. This post summarizes the improvements in YOLO v3 and implements a Keras-based YOLOv3 detection model.
[Figure: inference]

Paper：YOLOv3: An Incremental Improvement
Official website：https://pjreddie.com/darknet/yolo
Github：https://github.com/xiaochus/YOLOv3

Environment

Python 3.6, Tensorflow-gpu 1.5.0, Keras 2.1.3, OpenCV 3.4

Improvements

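1. Darknet-53 feature extraction network

Unlike YOLO v2, which used Darknet-19, YOLO v3 uses a 53-layer convolutional network built by stacking residual units. According to the author's experiments, this model strikes a better balance between classification accuracy and efficiency than ResNet-101, ResNet-152, and Darknet-19.
[Figure: Darknet-53]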

2. Bounding box prediction

The basic coordinate offset formulas are the same as in YOLO v2.

[Figure: box]
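The offset formulas carried over from YOLO v2 are b_x = σ(t_x) + c_x, b_y = σ(t_y) + c_y, b_w = p_w · e^(t_w), and b_h = p_h · e^(t_h), where (c_x, c_y) is the grid cell offset and (p_w, p_h) is the size of the bounding box prior. A minimal NumPy sketch of that decoding follows; the function name decode_box and its argument layout are illustrative assumptions, not code from the repository.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def decode_box(t, cell_xy, prior_wh):
    """Turn raw network outputs into a box, in grid-cell units.

    t:        (tx, ty, tw, th), raw outputs for one box.
    cell_xy:  (cx, cy), top-left offset of the grid cell.
    prior_wh: (pw, ph), width and height of the bounding box prior.
    """
    tx, ty, tw, th = t
    cx, cy = cell_xy
    pw, ph = prior_wh
    bx = sigmoid(tx) + cx   # centre stays inside the cell
    by = sigmoid(ty) + cy
    bw = pw * np.exp(tw)    # prior size scaled by an exponential factor
    bh = ph * np.exp(th)
    return bx, by, bw, bh


# Zero offsets give the cell centre and the unscaled prior size.
box = decode_box((0.0, 0.0, 0.0, 0.0), cell_xy=(3, 4), prior_wh=(2.0, 5.0))
```

Because the sigmoid bounds the centre offsets to (0, 1), a cell can only predict boxes whose centre lies inside that cell.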
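YOLO v3 uses logistic regression to predict an objectness score for each bounding box. If a bounding box prior overlaps a ground-truth object more than any other prior does, its target score should be 1. If a prior is not the best but still overlaps a ground-truth object by more than a threshold (0.5 here), the prediction is ignored. YOLO v3 assigns only one bounding box prior to each ground-truth object; a prior that is not assigned to any ground-truth object incurs no coordinate or class prediction loss, only objectness loss.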
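The assignment rule can be sketched for a single ground-truth object as follows. This is a simplified illustration rather than the training code of any particular implementation: the helper names iou and objectness_targets, the corner-format boxes, and the use of -1 to mark ignored predictions are all assumptions made for this example.

```python
import numpy as np


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def objectness_targets(ious, ignore_thresh=0.5):
    """Objectness targets for one ground-truth object.

    Returns 1 for the best-overlapping box, -1 (ignored, no loss) for
    other boxes above the threshold, and 0 (background) for the rest.
    """
    ious = np.asarray(ious, dtype=float)
    targets = np.zeros(ious.shape, dtype=int)
    targets[ious > ignore_thresh] = -1
    targets[np.argmax(ious)] = 1
    return targets


targets = objectness_targets([0.1, 0.7, 0.6, 0.3])
```

Here the second box is the best match and gets target 1, the third exceeds the 0.5 threshold without being the best and is ignored, and the remaining two are treated as background.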

3. Class prediction

To support multi-label classification, the model no longer uses softmax as the final classifier; it uses independent logistic classifiers instead, trained with binary cross-entropy loss.
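The difference from softmax can be shown with a small NumPy sketch: each class gets its own sigmoid, so several classes can score high for the same box, and the loss is the mean binary cross-entropy over classes. The function names and the three-class example values are illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    """Logistic function applied element-wise."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))


def binary_cross_entropy(y_true, logits, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and raw logits."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(sigmoid(logits), eps, 1.0 - eps)  # avoid log(0)
    return float(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p)).mean())


# Hypothetical logits for three classes on one box; the first two
# labels are both positive, which softmax could not represent.
logits = [2.0, 1.5, -3.0]
labels = [1, 1, 0]
probs = sigmoid(logits)            # independent per-class probabilities
loss = binary_cross_entropy(labels, logits)
```

Unlike softmax outputs, the per-class probabilities here do not have to sum to 1.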

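4. Multi-scale prediction

Unlike earlier versions of YOLO, YOLO v3 makes predictions on feature maps at three different scales.

On top of the feature map produced by Darknet-53, seven convolutions yield the first feature map, on which the first prediction is made. Next, the output of the third-from-last convolutional layer is taken, passed through one convolution and one ×2 upsampling, and concatenated with the output of the 43rd convolutional layer; seven more convolutions yield the second feature map, on which the second prediction is made. The same step is then repeated: the output of the third-from-last convolutional layer is taken, passed through one convolution and one ×2 upsampling, and concatenated with the output of the 26th convolutional layer; seven more convolutions yield the third feature map, on which the third prediction is made.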
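The upsample-and-concatenate step can be illustrated with plain NumPy on dummy feature maps. This sketch assumes nearest-neighbour upsampling and channels-last arrays; the helper names and the channel counts (256 and 512) are chosen for illustration only and the convolutions themselves are omitted.

```python
import numpy as np


def upsample2x(x):
    """Nearest-neighbour x2 upsampling of an (H, W, C) feature map."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def merge(coarse, fine):
    """Upsample the coarse map and concatenate it with the finer map
    along the channel axis."""
    return np.concatenate([upsample2x(coarse), fine], axis=-1)


coarse = np.zeros((13, 13, 256), dtype=np.float32)  # deeper, lower resolution
fine = np.zeros((26, 26, 512), dtype=np.float32)    # earlier, higher resolution
merged = merge(coarse, fine)                        # shape (26, 26, 768)
```

The concatenation lets the finer scale combine semantic information from the deep layers with the spatial detail of the earlier layer.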

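Each prediction produces a tensor of size N × N × [3 × (4 + 1 + 80)], where N is the grid size, 3 is the number of bounding boxes per grid cell, 4 is the number of box coordinates, 1 is the objectness score, and 80 is the number of classes.

[Figure: out]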
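As a quick check of the arithmetic, the sketch below computes the three output shapes for a 416×416 input. The downsampling strides of 32, 16, and 8 are an assumption stated here for illustration (they are not given in the text above), and output_shapes is a hypothetical helper.

```python
def output_shapes(input_size=416, strides=(32, 16, 8),
                  boxes_per_cell=3, num_classes=80):
    """Return the (N, N, depth) shape of each prediction tensor.

    depth = boxes_per_cell * (4 box coordinates + 1 objectness + classes)
    """
    depth = boxes_per_cell * (4 + 1 + num_classes)
    return [(input_size // s, input_size // s, depth) for s in strides]


shapes = output_shapes()
# With the assumed strides: 13x13, 26x26 and 52x52 grids, depth 255 each.
```

With 80 classes the depth is 3 × 85 = 255 at every scale; only the grid size N changes.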

Experiment

I implemented a YOLO v3 detection model with an input size of (416, 416), using a weight file trained on COCO.

Weight file conversion

Following the conversion approach of the yad2k project, we added several new layers to it so that it can convert a Darknet network structure and weight file into a Keras 2 network structure and weight file.

First, download the weight file yolov3.weights.

Then run the following command to convert it:

python yad2k.py cfg\yolo.cfg yolov3.weights data\yolo.h5

Detection

The demo.py file provides an example of running detection with YOLO v3. Image detection results are written to the images\res folder.

    """Demo for using YOLO v3."""
    import os
    import time

    import cv2
    import numpy as np

    from model.yolo_model import YOLO


    def process_image(img):
        """Resize, scale and expand the image.

        # Argument:
            img: original image.

        # Returns
            image: ndarray(1, 416, 416, 3), processed image.
        """
        image = cv2.resize(img, (416, 416), interpolation=cv2.INTER_CUBIC)
        image = np.array(image, dtype='float32')
        image /= 255.  # scale pixel values to [0, 1]
        image = np.expand_dims(image, axis=0)  # add the batch dimension

        return image


    def get_classes(file):
        """Get class names.

        # Argument:
            file: file with one class name per line.

        # Returns
            class_names: List, class names.
        """
        with open(file) as f:
            class_names = f.readlines()
        class_names = [c.strip() for c in class_names]

        return class_names


    def draw(image, boxes, scores, classes, all_classes):
        """Draw the boxes on the image.

        # Argument:
            image: original image.
            boxes: ndarray, boxes of objects as (x, y, w, h).
            classes: ndarray, classes of objects.
            scores: ndarray, scores of objects.
            all_classes: all class names.
        """
        for box, score, cl in zip(boxes, scores, classes):
            x, y, w, h = box

            # Round to the nearest pixel and clip to the image bounds.
            left = max(0, np.floor(x + 0.5).astype(int))
            top = max(0, np.floor(y + 0.5).astype(int))
            right = min(image.shape[1], np.floor(x + w + 0.5).astype(int))
            bottom = min(image.shape[0], np.floor(y + h + 0.5).astype(int))

            cv2.rectangle(image, (left, top), (right, bottom), (255, 0, 0), 2)
            cv2.putText(image, '{0} {1:.2f}'.format(all_classes[cl], score),
                        (left, top - 6),
                        cv2.FONT_HERSHEY_SIMPLEX,
                        0.6, (0, 0, 255), 1,
                        cv2.LINE_AA)

            print('class: {0}, score: {1:.2f}'.format(all_classes[cl], score))
            print('box coordinate x,y,w,h: {0}'.format(box))

        print()


    def detect_image(image, yolo, all_classes):
        """Use YOLO v3 to detect objects in an image.

        # Argument:
            image: original image.
            yolo: YOLO, yolo model.
            all_classes: all class names.

        # Returns:
            image: image with the detections drawn on it.
        """
        pimage = process_image(image)

        start = time.time()
        boxes, classes, scores = yolo.predict(pimage, image.shape)
        end = time.time()

        print('time: {0:.2f}s'.format(end - start))

        if boxes is not None:
            draw(image, boxes, scores, classes, all_classes)

        return image


    def detect_vedio(video, yolo, all_classes):
        """Use YOLO v3 to detect objects in a video.

        # Argument:
            video: video file.
            yolo: YOLO, yolo model.
            all_classes: all class names.
        """
        camera = cv2.VideoCapture(video)
        cv2.namedWindow("detection", cv2.WINDOW_NORMAL)

        while True:
            res, frame = camera.read()

            if not res:
                break

            image = detect_image(frame, yolo, all_classes)
            cv2.imshow("detection", image)

            # Stop when Esc is pressed.
            if cv2.waitKey(110) & 0xff == 27:
                break

        camera.release()


    if __name__ == '__main__':
        yolo = YOLO(0.6, 0.5)
        file = 'data/coco_classes.txt'
        all_classes = get_classes(file)

        # Detect images in the test folder.
        for (root, dirs, files) in os.walk('images/test'):
            if files:
                for f in files:
                    print(f)
                    path = os.path.join(root, f)
                    image = cv2.imread(path)
                    image = detect_image(image, yolo, all_classes)
                    cv2.imwrite('images/res/' + f, image)

        # Detect video.
        video = 'E:/video/car.flv'
        detect_vedio(video, yolo, all_classes)

Results

Run python demo.py
This image serves as the showcase result.

總結(jié)

以上是生活随笔為你收集整理的YOLO v3解析与实现的全部內(nèi)容，希望文章能夠幫你解決所遇到的問題。

如果覺得生活随笔網(wǎng)站內(nèi)容還不錯(cuò)，歡迎將生活随笔推薦給好友。

Yolo

上一篇： 2021年的高考大约多久可以查询成绩,2
下一篇：计算机科学与技术在广西录取分数线,中国计