YunYang1994/tensorflow-yolov3 README translation

Published: 2025/3/19


TensorFlow2.0-Examples/4-Object_Detection/YOLOV3

Table of contents

TensorFlow2.0-Examples/4-Object_Detection/YOLOV3
- Please install tensorflow-gpu 1.11.0! Since Tensorflow is fucking ridiculous!
- part 1. Introduction [[code walkthrough]](https://github.com/YunYang1994/CodeFun/blob/master/002-deep_learning/YOLOv3.md)
- part 2. Quick start
- part 3. Train on your own dataset
  - 3.1 Train VOC dataset
    - how to train it?
      - (1) train from scratch:
      - (2) train from COCO weights (recommended):
    - how to test and evaluate it?
  - 3.2 Train other dataset
- part 4. Why it is so magical?
  - 4.1 Anchors clustering
  - 4.2 Architecture details
  - 4.3 Neural network io:
  - 4.4 Filtering with score threshold
  - 4.5 Do non-maximum suppression
- part 5. Other Implementations

Please install tensorflow-gpu 1.11.0! Since Tensorflow is fucking ridiculous!

part 1. Introduction [code walkthrough]

An implementation of the YOLO v3 object detector in TensorFlow. The full details are in the YOLO v3 paper. This project covers the following parts:

- YOLO v3 architecture
- Training tensorflow-yolov3 with the GIOU loss function
- Basic working demo
- Training pipeline
- Multi-scale training method
- Computing VOC mAP (mean average precision on the VOC dataset)

The YOLO paper is quite hard to understand on its own. Read alongside the paper, this repo helps you get a quick understanding of the YOLO algorithm.

part 2. Quick start

Clone this repository:

$ git clone https://github.com/YunYang1994/tensorflow-yolov3.git

Install the dependencies before working with the code:

$ cd tensorflow-yolov3
$ pip install -r ./docs/requirements.txt

Export the loaded COCO weights as a TF checkpoint (yolov3_coco.ckpt):

$ cd checkpoint
$ wget https://github.com/YunYang1994/tensorflow-yolov3/releases/download/v1.0/yolov3_coco.tar.gz
$ tar -xvf yolov3_coco.tar.gz
$ cd ..
$ python convert_weight.py
$ python freeze_graph.py

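(Translator's note: the generated .pb file appears to be what the final detection step needs. Generating it correctly requires a correct .names file and the right number of classes, and you also have to set __C.YOLO.CLASSES and __C.YOLO.ORIGINAL_WEIGHT in config.py. Follow the steps strictly; later on, all of this requires consulting the TensorFlow tutorials. I am not sure whether __C.YOLO.DEMO_WEIGHT needs to be changed; running the commands above may update it automatically.)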

You will then get some .pb files in the root path. Run the demo scripts:

$ python image_demo.py
$ python video_demo.py # if using a camera, set video_path = 0 in video_demo.py (this opens the computer's built-in camera by default)

part 3. Train on your own dataset

Two files are required, as follows:

dataset.txt:

xxx/xxx.jpg 18.19,6.32,424.13,421.83,20 323.86,2.65,640.0,421.94,20
xxx/xxx.jpg 48,240,195,371,11 8,12,352,498,14
# image_path x_min, y_min, x_max, y_max, class_id x_min, y_min ,..., class_id

class.names:

person
bicycle
car
...
toothbrush
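To make the dataset.txt layout concrete, here is a minimal sketch of a parser for one annotation line. The function name `parse_annotation_line` is hypothetical and is not taken from the repo; it also assumes image paths contain no spaces, since fields are separated by whitespace.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max, class_id)
Box = Tuple[float, float, float, float, int]


def parse_annotation_line(line: str) -> Tuple[str, List[Box]]:
    """Split one dataset.txt line into an image path and its boxes."""
    fields = line.strip().split()
    image_path, box_fields = fields[0], fields[1:]
    boxes: List[Box] = []
    for field in box_fields:
        parts = field.split(",")
        if len(parts) != 5:
            raise ValueError(f"expected 5 comma-separated values, got {field!r}")
        x_min, y_min, x_max, y_max = (float(v) for v in parts[:4])
        boxes.append((x_min, y_min, x_max, y_max, int(parts[4])))
    return image_path, boxes


# One of the example lines from dataset.txt above.
path, boxes = parse_annotation_line("xxx/xxx.jpg 48,240,195,371,11 8,12,352,498,14")
print(path, boxes)
```

Each box keeps its coordinates as floats because the format allows fractional values such as 18.19, while the class id is an index into class.names.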

3.1 Train VOC dataset

To help you understand my training process, I made this demo of training on the PASCAL VOC (Visual Object Classes) dataset.

how to train it?

Download the PASCAL VOC trainval and test data:

$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
$ wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

Extract all of these tars into one directory and rename them so that the directory has the following basic structure, then generate the annotation files:

VOC           # path: /home/yang/test/VOC/
├── test
|    └──VOCdevkit
|        └──VOC2007 (from VOCtest_06-Nov-2007.tar)
└── train
     └──VOCdevkit
         ├──VOC2007 (from VOCtrainval_06-Nov-2007.tar)
         └──VOC2012 (from VOCtrainval_11-May-2012.tar)

$ python scripts/voc_annotation.py --data_path /home/yang/test/VOC

Then edit your ./core/config.py to make the necessary configuration:

__C.YOLO.CLASSES = "./data/classes/voc.names"
__C.TRAIN.ANNOT_PATH = "./data/dataset/voc_train.txt"
__C.TEST.ANNOT_PATH = "./data/dataset/voc_test.txt"

There are two training methods:

(1) train from scratch:

This trains without a pretrained model.

$ python train.py
$ tensorboard --logdir ./data

(2) train from COCO weights (recommended):

This uses the COCO weights as a pretrained model.

$ cd checkpoint
$ wget https://github.com/YunYang1994/tensorflow-yolov3/releases/download/v1.0/yolov3_coco.tar.gz
$ tar -xvf yolov3_coco.tar.gz

After extraction, the checkpoint folder contains three files.

$ cd ..
$ python convert_weight.py --train_from_coco

Running this generates four new files.

$ python train.py

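(Translator's note: once the training command is started, it runs for a very long time. On a 1080 Ti it ran for several days and stopped at around iteration 45, as far as I recall. The checkpoint folder ended up holding tens of GB of weight files. Each weight file name includes its loss value; I kept the files with lower loss and deleted the rest, keeping only yolov3_test_loss=8.4732.ckpt-5 and yolov3_test_loss=7.8837.ckpt-12. The checkpoint file is updated automatically during training; I am not sure exactly what it is for. Note that even with the same number of iterations, the loss values of the generated files may differ from run to run. The generated weight files are used in the later detection steps.)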

how to test and evaluate it?

Edit your ./core/config.py to make the necessary configuration. The weight file path should point to whichever of the weight files generated in the previous step you want to test.

__C.TEST.WEIGHT_FILE = "./checkpoint/yolov3_test_loss=8.4732.ckpt-5"

$ python evaluate.py
$ cd mAP
$ python main.py -na

[Figure: output of the evaluation run]

If you are still unfamiliar with the training pipeline, you can join here to discuss it with us.

3.2 Train other dataset

Download the COCO trainval and test data:

$ wget http://images.cocodataset.org/zips/train2017.zip
$ wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
$ wget http://images.cocodataset.org/zips/test2017.zip
$ wget http://images.cocodataset.org/annotations/image_info_test2017.zip

part 4. Why it is so magical?

YOLO stands for You Only Look Once (translator's note: in other words, detection is very fast). It is an object detector that uses features learned by a deep convolutional neural network to detect objects. Although we have successfully run the code, we should also understand how YOLO works.

4.1 Anchors clustering

The paper suggests clustering the bounding box shapes to find anchor boxes that are well suited to the data. For more details, see here.
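The clustering idea can be sketched as k-means over box widths and heights, using IoU instead of Euclidean distance to decide which anchor a box belongs to. This is a minimal illustration under stated assumptions, not the repo's own script: the names `iou_wh` and `kmeans_anchors` are hypothetical, cluster centres are updated with the mean (some implementations use the median), and the input here is synthetic data.

```python
import numpy as np


def iou_wh(boxes: np.ndarray, anchors: np.ndarray) -> np.ndarray:
    """IoU between (N, 2) boxes and (K, 2) anchors given as (width, height),
    with every box and anchor aligned at a shared corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    box_area = (boxes[:, 0] * boxes[:, 1])[:, None]
    anchor_area = (anchors[:, 0] * anchors[:, 1])[None, :]
    return inter / (box_area + anchor_area - inter)


def kmeans_anchors(boxes: np.ndarray, k: int, iters: int = 100, seed: int = 0) -> np.ndarray:
    """Cluster (width, height) pairs into k anchors, sorted by area."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign every box to the anchor it overlaps most (highest IoU).
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)
        new_anchors = anchors.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):  # an empty cluster keeps its previous anchor
                new_anchors[j] = members.mean(axis=0)
        if np.allclose(new_anchors, anchors):
            break
        anchors = new_anchors
    return anchors[np.argsort(anchors[:, 0] * anchors[:, 1])]


# Synthetic (width, height) pairs standing in for real annotation boxes.
data_rng = np.random.default_rng(1)
wh = np.abs(data_rng.normal(loc=[50, 80], scale=[20, 30], size=(200, 2))) + 1.0
anchors = kmeans_anchors(wh, k=3)
print(anchors)
```

Using IoU as the similarity measure keeps large boxes from dominating the clustering the way they would under Euclidean distance.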

4.2 Architecture details

In this project, I use the pretrained weights, which cover the 80 classes of the COCO dataset, for recognition. The class label is represented as c, an integer from 1 to 80, where each number stands for one class; for example, if c = 3, the classified object is a car. The image features learned by the deep convolutional layers are passed to a classifier and a regressor, which make the detection prediction (coordinates of the bounding boxes, the class label, etc.). See also the picture below. (Thanks Levio for your great image!)

4.3 Neural network io:

input: [None, 416, 416, 3]
output: the confidence that an object is present in each rectangle, together with the position, size, and class of each detected object. Each bounding box is represented by 5 + n numbers (Rx, Ry, Rw, Rh, Pc, C1..Cn). In this case n = 80, so the class part is an 80-dimensional vector and each bounding box is represented by 85 numbers in total. The first four numbers Rx, Ry, Rw, Rh describe the bounding box, Pc is the confidence that an object is present, and each of the last 80 numbers is the predicted probability of the class with the corresponding index.
(Translator's question: is Pc one of the last 80 numbers? It is not: 4 box values + 1 confidence + 80 class probabilities = 85.)
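As a small illustration of the 85-number layout, the sketch below splits one prediction vector into its box, confidence, and class parts. It assumes the order (Rx, Ry, Rw, Rh, Pc, C1..C80) given in the text; the values are made up, and combining Pc with the best class probability into a single score is a common convention rather than something this section states.

```python
import numpy as np

NUM_CLASSES = 80

# A made-up prediction vector: 4 box values, 1 confidence, 80 class probabilities.
pred = np.zeros(5 + NUM_CLASSES, dtype=np.float32)
pred[:4] = [208.0, 208.0, 120.0, 60.0]  # Rx, Ry, Rw, Rh
pred[4] = 0.9                           # Pc
pred[5 + 2] = 0.8                       # probability of the class at zero-based index 2

box, conf, class_probs = pred[:4], pred[4], pred[5:]
class_id = int(np.argmax(class_probs))           # zero-based index of the best class
score = float(conf * class_probs[class_id])      # assumed scoring convention
print(box, class_id, round(score, 2))
```

The zero-based index 2 corresponds to c = 3 in the one-based numbering used in section 4.2.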

4.4 Filtering with score threshold

The output may contain many rectangles that are false positives or that overlap. If your input image size is [416, 416, 3], you will get (52x52 + 26x26 + 13x13) x 3 = 10647 boxes, since YOLO v3 uses 9 anchor boxes in total (three for each scale). So it is time to find a way to reduce them. The first step is to filter the rectangles by a score threshold.
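The box count can be checked with a few lines of arithmetic. The strides 8, 16, and 32 are not stated in the text; they are the downsampling factors that turn a 416-pixel input into the 52, 26, and 13 grids mentioned above.

```python
input_size = 416
strides = (8, 16, 32)      # assumed downsampling factor of each detection scale
anchors_per_scale = 3      # 9 anchors in total, three per scale

grid_sizes = [input_size // s for s in strides]
total_boxes = sum(g * g for g in grid_sizes) * anchors_per_scale
print(grid_sizes, total_boxes)  # [52, 26, 13] 10647
```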

Input arguments:

boxes: tensor of shape [10647, 4]
scores: tensor of shape [10647, 80] containing the detection scores for 80 classes
score_thresh: float value; boxes whose score falls below it are discarded

# Step 1: Create a filtering mask based on "box_class_scores" by using "threshold".
score_thresh = 0.4
mask = tf.greater_equal(scores, tf.constant(score_thresh))
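For readers who want to try the filtering step without a TensorFlow session, here is a rough NumPy sketch of the same idea. It is not the repo's code: the function name `filter_by_score` is hypothetical, and it assumes each box is judged by the score of its best class.

```python
import numpy as np


def filter_by_score(boxes: np.ndarray, scores: np.ndarray, score_thresh: float = 0.4):
    """Keep boxes whose best class score reaches score_thresh.

    boxes:  (N, 4) array of box coordinates
    scores: (N, num_classes) array of per-class scores
    """
    classes = np.argmax(scores, axis=1)                    # best class per box
    best_scores = scores[np.arange(len(scores)), classes]  # score of that class
    keep = best_scores >= score_thresh
    return boxes[keep], best_scores[keep], classes[keep]


# Three toy boxes with two classes instead of 10647 boxes with 80 classes.
toy_boxes = np.array([[0, 0, 10, 10], [5, 5, 20, 20], [30, 30, 40, 40]], dtype=float)
toy_scores = np.array([[0.9, 0.05], [0.1, 0.3], [0.2, 0.6]])
kept_boxes, kept_scores, kept_classes = filter_by_score(toy_boxes, toy_scores)
print(kept_boxes, kept_scores, kept_classes)
```

The second toy box is dropped because its best score, 0.3, is below the 0.4 threshold.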

4.5 Do non-maximum suppression

Even after filtering by score threshold, we still have a lot of overlapping boxes. The second filtering step is the non-maximum suppression (NMS) algorithm.

Discard all boxes with Pc <= 0.4
While there are any remaining boxes:
- Pick the box with the largest Pc
- Output that as a prediction
- Discard any remaining boxes with IOU >= 0.5 with the box output in the previous step

In TensorFlow, the non-maximum suppression algorithm can be implemented simply like this. For more details, see here.

for i in range(num_classes):
    # max_boxes (an assumed name, not in the original snippet) caps the boxes kept per class
    selected_indices = tf.image.non_max_suppression(boxes, scores[:, i], max_output_size=max_boxes, iou_threshold=0.5)

Non-max suppression relies on a very important function called “Intersection over Union”, or IoU: the area where two boxes overlap divided by the area of their union. Here is an example of the non-maximum suppression algorithm: the algorithm receives 4 overlapping bounding boxes as input and returns only one as output.
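The four-boxes-in, one-box-out behaviour can be reproduced with a small NumPy sketch of IoU and greedy NMS. This is an illustration with made-up boxes, not the repo's implementation; the helper names are hypothetical, and boxes are assumed to be in (x_min, y_min, x_max, y_max) form.

```python
import numpy as np


def iou_one_to_many(box: np.ndarray, others: np.ndarray) -> np.ndarray:
    """IoU of one (4,) box against (M, 4) boxes in (x_min, y_min, x_max, y_max) form."""
    x1 = np.maximum(box[0], others[:, 0])
    y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2])
    y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter)


def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        best, rest = order[0], order[1:]
        keep.append(int(best))
        # Discard remaining boxes with IoU >= iou_threshold against the kept box.
        order = rest[iou_one_to_many(boxes[best], boxes[rest]) < iou_threshold]
    return keep


# Four heavily overlapping boxes around the same object (made-up values).
demo_boxes = np.array([[100, 100, 200, 200],
                       [105, 105, 205, 205],
                       [95, 98, 198, 202],
                       [102, 96, 204, 199]], dtype=float)
demo_scores = np.array([0.9, 0.8, 0.75, 0.6])
kept = nms(demo_boxes, demo_scores)
print(kept)
```

Every lower-scoring box overlaps the top box with an IoU well above 0.5, so only the index of the highest-scoring box survives.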

If you want more details, read the fucking source code and the original paper, or contact me!

part 5. Other Implementations

- YOLOv3 object detection now has a TensorFlow implementation that can be trained on your own data
- Implementing YOLO v3 in Tensorflow (TF-Slim)
- YOLOv3_TensorFlow
- Object Detection using YOLOv2 on Pascal VOC2012
- Understanding YOLO
