Training a cascade classifier with OpenCV for object detection

Published: 2025/3/20

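Contents

- 0. Create the training directory
- 1. Prepare positive and negative samples
- 2. Generate the txt files for the positive and negative samples
- 3. Generate the pos.vec description file
- 4. Train the cascade classifier
- 5. Object detection
- Summary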

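0. Create the training directory

Folder: train/

1. Prepare positive and negative samples

Create three folders: train/pos/, train/neg/ and train/xml/.

pos folder:
Holds the positive samples. All of them should have the same size, for example $20 \times 20$ (commonly used with Haar features) or $24 \times 24$ (LBP features).

neg folder:
Holds the negative samples. Each negative image must be at least as large as the positive sample size.

xml folder:
Output directory for the cascade classifier XML files.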
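
The folder layout above can be created in one step. This is only a minimal sketch: the helper name `make_train_dirs` is made up for illustration, and the root folder is assumed to be `train` as in this article.

```python
import os


def make_train_dirs(root="train"):
    """Create root/pos, root/neg and root/xml and return the three paths."""
    paths = [os.path.join(root, name) for name in ("pos", "neg", "xml")]
    for p in paths:
        # exist_ok=True makes the call safe to repeat on an existing tree.
        os.makedirs(p, exist_ok=True)
    return paths
```

Calling `make_train_dirs()` from the parent directory of `train/` should leave the three empty folders ready for the samples.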

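2. Generate the txt files for the positive and negative samples

Positive sample file train/pos/pos.txt:

Format: image name, number of objects in the image, then for each object the top-left x, top-left y, width and height
pos_image1.png 1 0 0 30 30
pos_image2.png 1 0 0 30 30
… …

Negative sample file train/neg/neg.txt:

Format: image path
neg/neg_image1.png
neg/neg_image2.png
… …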

The following script generates both files:

import os


def _get_directory_files(path, fileType, filePaths):
    """Recursively collect the names of files under `path` whose extension is `fileType`."""
    if not os.path.exists(path):
        return
    for f in os.listdir(path):
        npath = os.path.join(path, f)
        if os.path.isfile(npath):
            if os.path.splitext(npath)[1] == fileType:
                filePaths.append(f)
        elif os.path.isdir(npath) and not f.startswith('.'):
            # Skip hidden directories, recurse into the others.
            _get_directory_files(npath, fileType, filePaths)


def _write_txt(txt_path, img_dir, is_pos, img_size, neg_dir='neg'):
    imgpaths = []
    _get_directory_files(img_dir, '.png', imgpaths)
    # Mode "w" (the original used "a") so that re-running does not append duplicate entries.
    with open(txt_path, "w") as f:
        for img_p in imgpaths:
            if is_pos:
                # <image name> <object count> <x> <y> <width> <height>
                f.write('%s 1 0 0 %d %d\n' % (img_p, img_size[0], img_size[1]))
            else:
                f.write('%s/%s\n' % (neg_dir, img_p))


if __name__ == "__main__":
    txt_pos = './train/pos/pos.txt'
    txt_neg = './train/neg/neg.txt'
    pos_img_dir = './train/pos'
    neg_img_dir = './train/neg'
    _write_txt(txt_pos, pos_img_dir, True, (30, 30))
    _write_txt(txt_neg, neg_img_dir, False, (30, 30))
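
Before building the vec file it can help to sanity-check the positive description file. The sketch below parses one line of pos.txt into a file name and a list of boxes. The function name `parse_info_line` is hypothetical, and it assumes file names contain no spaces.

```python
def parse_info_line(line):
    """Parse '<image name> <count> x y w h ...' into (name, [(x, y, w, h), ...])."""
    parts = line.split()
    filename, count = parts[0], int(parts[1])
    values = [int(v) for v in parts[2:]]
    # Each object contributes exactly four integers.
    if len(values) != 4 * count:
        raise ValueError("expected %d box values, got %d" % (4 * count, len(values)))
    boxes = [tuple(values[i:i + 4]) for i in range(0, len(values), 4)]
    return filename, boxes
```

Running it over every line of pos.txt is a cheap way to catch a wrong object count or a missing coordinate before opencv_createsamples is invoked.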

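3. Generate the pos.vec description file

In the train/ directory, run:

opencv_createsamples -vec pos.vec -info pos/pos.txt -bg neg/neg.txt -num 10000 -w 30 -h 30

The command-line parameters are:

-info: input description file of the positive samples, default NULL
-img: input image file name, default NULL
-bg: description file of the negative samples, listing the image files that are randomly chosen as object backgrounds, default NULL
-num: number of positive samples to generate, default 1000
-bgcolor: background color, which denotes the transparent color, default 0
-bgthresh: color tolerance; all pixels between bgcolor-bgthresh and bgcolor+bgthresh are treated as transparent, default 80. White noise is then added to the intensities of the foreground.
-inv: if specified, the foreground colors are inverted, default 0 (not inverted)
-randinv: if specified, the colors are inverted randomly, default 0
-maxidev: maximal intensity deviation of the pixels in the foreground samples, default 40
-maxxangle: maximal rotation angle around the X axis, in radians, default 1.1
-maxyangle: maximal rotation angle around the Y axis, in radians, default 1.1
-maxzangle: maximal rotation angle around the Z axis, in radians, default 0.5
The input image is rotated around all three axes, with the angles limited by the three values above.
-show: if specified, every sample is displayed; after Esc is pressed the program keeps creating samples without displaying them, default 0 (not displayed)
-scale: scale factor of the displayed image, default 4.0
-w: width of the output samples, default 24
-h: height of the output samples, default 24
-vec: output .vec file used for training, default NULL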
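
After the command finishes, it is worth confirming how many samples pos.vec really contains, since the training step depends on that number. The sketch below reads the file header. Note the assumption: the 12-byte layout (32-bit sample count, 32-bit sample size, two unused 16-bit fields, little-endian) is based on a reading of the opencv_createsamples source and is not a documented format, so treat the result as a hint. `read_vec_header` is a hypothetical helper name.

```python
import struct


def read_vec_header(path):
    """Return (sample_count, sample_size) read from the start of a .vec file."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError("file too short to contain a .vec header")
    # Assumed layout: int32 count, int32 size, then two unused int16 fields.
    count, size, _, _ = struct.unpack("<iihh", header)
    return count, size
```

For the command above, the sample size would be expected to equal 30 * 30 = 900 if the assumed layout holds.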

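4. Train the cascade classifier

In the train/ directory, run:

opencv_traincascade -data xml -vec pos.vec -bg neg/neg.txt -numPos 8000 -numNeg 16000 -numStages 20 -featureType LBP -w 30 -h 30

The command-line parameters are:

-data: directory name (xml here) where the trained classifier is stored; the training program creates it if it does not exist
-vec: the pos.vec file generated by opencv_createsamples
-bg: description file of the negative samples, neg/neg.txt
-numPos: number of positive samples used to train each stage.
Note that this value must be smaller than the total number of positive samples in the vec file, otherwise training fails with "can not get new positive sample". The reason is minHitRate, which sets the threshold of each strong classifier: with minHitRate = 0.95 and 10000 positive training samples, about 500 of them are likely to be classified as negative, so the next stage has to draw 500 fresh samples from the rest of the vec file. Following this pattern, each later stage needs up to numPos * (1 - minHitRate) additional positive samples, which over all stages gives:
$numPos + (numStages - 1) \times numPos \times (1 - minHitRate) \le \text{number of prepared positive samples}$
-numNeg: number of negative samples used to train each stage; it may be larger than the number of images listed in -bg
-numStages: number of cascade stages to train, default 20; values between 14 and 25 are typical.
With more stages the false alarm rate of the classifier drops, but training takes longer, the hit rate drops as well, and detection becomes slower. With few positive and negative samples there is no need for many stages.
-precalcValBufSize: size of the buffer for precalculated feature values, in MB
-precalcIdxBufSize: size of the buffer for precalculated feature indices, in MB
-baseFormatSave: only meaningful for Haar features; if specified, the cascade is saved in the old format
-stageType: type of the stages; only BOOST is supported
-featureType: feature type; only LBP, HOG and Haar are supported. Haar training is very slow because Haar features require floating-point computation, while LBP features are integer-based and train much faster. Haar is usually somewhat more accurate, but LBP can largely match its results, so LBP is recommended.
-w, -h: size of the training samples; it must match the size used with opencv_createsamples, and the ratio of -w to -h must match the aspect ratio of the real object.
-bt: type of the boosted classifier, {DAB: Discrete AdaBoost, RAB: Real AdaBoost, LB: LogitBoost, GAB: Gentle AdaBoost}
-minHitRate: minimal desired hit rate for each stage; the overall hit rate is approximately min_hit_rate^number_of_stages
-maxFalseAlarmRate: maximal desired false alarm rate for each stage; the overall false alarm rate is approximately max_false_rate^number_of_stages
-weightTrimRate: specifies whether trimming should be used and its weight; a decent value is 0.95
-maxDepth: maximal depth of a weak tree; a decent value is 1, which gives a stump (a single binary split)
-maxWeakCount: maximal number of weak classifiers in each stage
-mode: type of the Haar feature set used in training, one of BASIC, CORE or ALL, default BASIC. BASIC uses only upright features, while ALL uses the full set of upright and 45-degree rotated features.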
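
The numPos rule and the overall-rate approximations above are easy to check numerically before starting a long training run. The following is a minimal sketch with hypothetical helper names. Since the training command in this article does not set -minHitRate or -maxFalseAlarmRate, the usage note assumes the opencv_traincascade defaults of 0.995 and 0.5.

```python
def required_positives(num_pos, num_stages, min_hit_rate):
    """Lower bound on the number of positive samples the vec file should hold."""
    return num_pos + (num_stages - 1) * num_pos * (1.0 - min_hit_rate)


def overall_rates(num_stages, min_hit_rate, max_false_alarm_rate):
    """Approximate (hit rate, false alarm rate) of the whole cascade."""
    return min_hit_rate ** num_stages, max_false_alarm_rate ** num_stages
```

With numPos = 8000 and numStages = 20, `required_positives(8000, 20, 0.995)` comes to about 8760, which fits within the 10000 samples requested from opencv_createsamples. Under the same assumed defaults, `overall_rates(20, 0.995, 0.5)` gives an overall hit rate of roughly 0.90 and an overall false alarm rate below one in a million.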

5. Object detection

Parameters of the detectMultiScale() function:

cv2.CascadeClassifier.detectMultiScale(image[, scaleFactor[, minNeighbors[, flags[, minSize[, maxSize]]]]]) → objects
cv2.CascadeClassifier.detectMultiScale(image, rejectLevels, levelWeights[, scaleFactor[, minNeighbors[, flags[, minSize[, maxSize[, outputRejectLevels]]]]]]) → objects

The parameters are:

image: Matrix of the type CV_8U containing an image where objects are detected. A grayscale image.
objects: Vector of rectangles where each rectangle contains the detected object.
scaleFactor: Parameter specifying how much the image size is reduced at each image scale. Default 1.1.
minNeighbors: Parameter specifying how many neighbors each candidate rectangle should have to retain it. Default 3, meaning a candidate is accepted as an object only if it is detected at least 3 times.
flags: Parameter with the same meaning for an old cascade as in the function cvHaarDetectObjects. It is not used for a new cascade. Default 0.
CV_HAAR_DO_CANNY_PRUNING: uses Canny edge detection to skip image regions that contain too few or too many edges
CV_HAAR_SCALE_IMAGE: scales the image rather than the detector
CV_HAAR_FIND_BIGGEST_OBJECT: detects only the biggest object
CV_HAAR_DO_ROUGH_SEARCH: performs only a rough search
minSize: Minimum possible object size. Objects smaller than that are ignored.
maxSize: Maximum possible object size. Objects larger than that are ignored.
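
To get a feel for how scaleFactor, minSize and maxSize interact, the sketch below lists the window sizes a multi-scale search would visit. It is a conceptual illustration only: the real detectMultiScale implementation differs in detail (for example, it may rescale the image instead of the window), and `candidate_window_sizes` is a hypothetical helper, not an OpenCV function.

```python
def candidate_window_sizes(base, scale_factor, min_size, max_size):
    """List square window sizes base * scale_factor**k that lie in [min_size, max_size]."""
    sizes = []
    size = float(base)
    while size <= max_size:
        if size >= min_size:
            sizes.append(int(round(size)))
        # Each step enlarges the search window by scale_factor.
        size *= scale_factor
    return sizes
```

With the 30-pixel training window from this article, scaleFactor = 1.3, minSize = 60 and a 400-pixel frame height, `candidate_window_sizes(30, 1.3, 60, 400)` yields seven sizes. A scaleFactor of 1.1 yields many more, which is why smaller values make detection slower but less likely to miss an object between scales.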

import os.path

import cv2
import numpy as np


def get_hw_by_short_size(im_height, im_width, resize):
    """Scale (height, width) so the short side equals short_size, capping the long side at max_size."""
    short_size, max_size = resize
    im_size_min = np.min([im_height, im_width])
    im_size_max = np.max([im_height, im_width])
    scale = (short_size + 0.0) / im_size_min
    if scale * im_size_max > max_size:
        scale = (max_size + 0.0) / im_size_max
    resized_height, resized_width = int(round(im_height * scale)), int(round(im_width * scale))
    return resized_height, resized_width


class car_detector:
    def __init__(self, cascade_file, max_detect_hw=(400, 600)):
        if not os.path.isfile(cascade_file):
            raise RuntimeError("%s: not found" % cascade_file)
        self._cascade = cv2.CascadeClassifier(cascade_file)
        self._max_detect_hw = max_detect_hw

    def detect_image(self, image):
        # The cascade expects a grayscale image; equalize the histogram to reduce lighting effects.
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        gray = cv2.equalizeHist(gray)
        cars = self._cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=15, minSize=(60, 60))
        for (x, y, w, h) in cars:
            cv2.rectangle(image, (x, y), (x + w, y + h), (0, 0, 255), 2)
        return image

    def detect_video(self, video_path, start_frame, end_frame):
        cap = cv2.VideoCapture(video_path)
        cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)
        org_w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
        org_h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
        h, w = get_hw_by_short_size(org_h, org_w, self._max_detect_hw)
        while start_frame < end_frame:
            start_frame += 1
            ret, image = cap.read()
            if not ret:
                break
            resized_img = cv2.resize(image, (w, h), interpolation=cv2.INTER_CUBIC)
            result = self.detect_image(resized_img)
            cv2.imshow("Detect", result)
            cv2.waitKey(1)
        # Release the capture and close the preview window when done.
        cap.release()
        cv2.destroyAllWindows()


if __name__ == "__main__":
    car_cascade_lbp_21 = './train/xml/cascade.xml'
    video_path = "./test.mp4"
    start_frame = 0
    end_frame = 300
    detect = car_detector(car_cascade_lbp_21)
    detect.detect_video(video_path, start_frame, end_frame)

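Summary

Vehicle detection, observations from the training stage:

HOG features:
Positive sample size $30 \times 30$: training is very fast, but the result does not converge.
Positive sample size $64 \times 64$: training is fairly fast and the result converges.
However, in OpenCV 3.x the CascadeClassifier does not support HOG features.

Haar features:
Positive sample size $20 \times 20$: training is very slow and the result does not converge.

LBP features:
Positive sample size $30 \times 30$: training is fairly fast and the result converges.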
