
Building an Automatic License Plate Recognition System

Published: 2024/3/26


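This article explains how to develop a license plate object detection model from scratch. The overall project also includes an API built with Flask. Here we focus on how to train a custom object detection model from the ground up.

Project Architecture

Now, let's look at the architecture of the license plate recognition and OCR project we are going to build.

In the architecture above, there are six modules: labeling, training, saving the model, OCR, the model pipeline, and a RESTful API. This article covers only the first three in detail. The process is as follows. First, we collect images. Then we label the license plates (number plates) in the images using an open-source image annotation tool with a Python GUI. After labeling, we preprocess the data and build and train a deep learning object detection model (Inception ResNet V2) in TensorFlow 2. Once the object detection model is trained, we use it to crop the part of the image that contains the license plate, also called the region of interest (ROI), and pass that ROI to the Tesseract API in Python. Using PyTesseract, we extract the text from the image. Finally, we put all of this together and build the deep learning model pipeline. In the last module, we create a web application project with Flask in Python so that we can publish the application for others to use.

Labeling

To build license plate recognition, we need data: images of vehicles in which the license plate is visible. For image labeling, I used the LabelImg image annotation tool. Download labelImg from GitHub and follow the instructions to install the package. Once it opens, follow the prompts in the GUI, click CreateRectBox, draw a rectangular box around the plate, and save the output as XML.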

pip install pyqt5
pip install lxml
pyrcc5 -o libs/resources.py resources.qrc
python labelImg.py
python labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]

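This is a manual process, and you need to do it for every image. Label carefully, because this step directly affects the accuracy of the model.

Parsing Information from XML

With labeling complete, we now need to do some data preprocessing.

Because the labeling output is XML, we have to convert it before we can use it for training. We take the useful information from each label, namely the diagonal points of its bounding box, xmin, ymin, xmax, ymax, as shown in Figure 3. We need to extract this information and save it in a convenient format. Here, I convert the bounding box information to CSV, and later I use Pandas to turn it into an array. Now, let's see how to parse the information with Python.

I use the xml.etree Python library to parse the data from XML, and import pandas and glob. First, glob collects all the XML files generated during labeling.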

import pandas as pd
from glob import glob
import xml.etree.ElementTree as xet

# Collect every XML annotation file produced during labeling
path = glob('./images/*.xml')
labels_dict = dict(filepath=[], xmin=[], xmax=[], ymin=[], ymax=[])
for filename in path:
    # Parse the file and locate object -> bndbox
    info = xet.parse(filename)
    root = info.getroot()
    member_object = root.find('object')
    labels_info = member_object.find('bndbox')
    # Read the diagonal points of the bounding box
    xmin = int(labels_info.find('xmin').text)
    xmax = int(labels_info.find('xmax').text)
    ymin = int(labels_info.find('ymin').text)
    ymax = int(labels_info.find('ymax').text)
    # print(xmin, xmax, ymin, ymax)
    labels_dict['filepath'].append(filename)
    labels_dict['xmin'].append(xmin)
    labels_dict['xmax'].append(xmax)
    labels_dict['ymin'].append(ymin)
    labels_dict['ymax'].append(ymax)

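In the code above, we take each file in turn and parse it with xml.etree, then locate object -> bndbox; this happens in the first lines of the loop body. We then extract xmin, xmax, ymin, ymax and append those values to the dictionary in the rest of the loop body. Next, we convert the dictionary to a pandas DataFrame and save it to a CSV file, as shown below.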

df = pd.DataFrame(labels_dict)
df.to_csv('labels.csv', index=False)
df.head()

With the code above, we have extracted the diagonal points for each image and converted the data from an unstructured format to a structured one.
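The parsing loop reads only the first `object` element of each file. As a minimal sketch, assuming an annotation may contain more than one labeled plate, the same extraction can be written with `findall`. The XML string, the image size and the `number_plate` class name below are made-up illustration values in the Pascal VOC layout, and `parse_boxes` is a hypothetical helper, not part of the original project.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout; all values are made up.
SAMPLE_XML = """<annotation>
    <filename>N1.jpeg</filename>
    <size><width>1920</width><height>1080</height><depth>3</depth></size>
    <object>
        <name>number_plate</name>
        <bndbox><xmin>1093</xmin><ymin>645</ymin><xmax>1396</xmax><ymax>727</ymax></bndbox>
    </object>
    <object>
        <name>number_plate</name>
        <bndbox><xmin>200</xmin><ymin>300</ymin><xmax>420</xmax><ymax>360</ymax></bndbox>
    </object>
</annotation>"""


def parse_boxes(xml_text):
    """Return one (xmin, xmax, ymin, ymax) tuple per <object> element."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects that carry no bounding box
        boxes.append(tuple(int(bndbox.find(tag).text)
                           for tag in ("xmin", "xmax", "ymin", "ymax")))
    return boxes


boxes = parse_boxes(SAMPLE_XML)
print(boxes)
```

The model trained later in this article predicts a single box per image, so a dataset with several plates per image would need a different output design; the sketch only shows how the extra annotations could be read.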

Now, let's extract the image file name that corresponds to each XML file.

import os

def getFilename(filename):
    # Read the <filename> tag and build the path to the image
    filename_image = xet.parse(filename).getroot().find('filename').text
    filepath_image = os.path.join('./images', filename_image)
    return filepath_image

image_path = list(df['filepath'].apply(getFilename))
image_path

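Verifying the Data

Everything so far has been done by hand, so it is important to check that the information we obtained is valid. We simply verify that the bounding box is drawn in the right place on a given image.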

import cv2

file_path = "N1.jpeg"
xmin, xmax, ymin, ymax = 1093, 1396, 645, 727
img = cv2.imread(file_path)
# Draw the box from its top-left corner to its bottom-right corner
cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 255, 0), 3)
cv2.namedWindow('example', cv2.WINDOW_NORMAL)
cv2.imshow('example', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
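Alongside the visual check, a quick programmatic sanity check can flag boxes that cannot be valid. The following is a minimal sketch: `box_problems` is a hypothetical helper, and the 1920 x 1080 image size is an assumption for illustration, since the article does not state the size of N1.jpeg.

```python
def box_problems(box, width, height):
    """Return a list of problems found in one (xmin, xmax, ymin, ymax) box."""
    xmin, xmax, ymin, ymax = box
    problems = []
    if xmin >= xmax:
        problems.append("xmin must be smaller than xmax")
    if ymin >= ymax:
        problems.append("ymin must be smaller than ymax")
    if xmin < 0 or xmax > width:
        problems.append("box exceeds the image width")
    if ymin < 0 or ymax > height:
        problems.append("box exceeds the image height")
    return problems


# Assumed image size; replace with the real width and height of each image.
good = box_problems((1093, 1396, 645, 727), width=1920, height=1080)
bad = box_problems((1396, 1093, 645, 1200), width=1920, height=1080)
print(good)
print(bad)
```

Running a check like this over every row of the CSV catches swapped corners or out-of-range values before training, while the OpenCV window remains useful for confirming that the box actually covers the plate.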

Data Processing

This is a very important step. Here we take each image, read it into an array with OpenCV to obtain its dimensions, and resize it to 224 x 224, a size commonly used with pretrained transfer learning models.

from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import load_img, img_to_array
import cv2
import numpy as np

labels = df.iloc[:, 1:].values
data = []
output = []
for ind in range(len(image_path)):
    image = image_path[ind]
    img_arr = cv2.imread(image)
    h, w, d = img_arr.shape
    # preprocessing
    load_image = load_img(image, target_size=(224, 224))
    load_image_arr = img_to_array(load_image)
    norm_load_image_arr = load_image_arr / 255.0  # normalization
    # normalization to labels
    xmin, xmax, ymin, ymax = labels[ind]
    nxmin, nxmax = xmin / w, xmax / w
    nymin, nymax = ymin / h, ymax / h
    label_norm = (nxmin, nxmax, nymin, nymax)  # normalized output
    # -------------- append
    data.append(norm_load_image_arr)
    output.append(label_norm)

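We normalize each image by dividing it by the maximum pixel value, which for an 8-bit image is 255.

We also need to normalize the labels, because the model's output should lie in the range 0 to 1. To normalize the labels, we divide the x coordinates of the diagonal points by the image width and the y coordinates by the image height.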

X = np.array(data, dtype=np.float32)
y = np.array(output, dtype=np.float32)
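The label normalization described above can be illustrated in isolation. This is a minimal sketch in plain Python; `normalize_box` is a hypothetical helper and the 1920 x 1080 image size is an assumed value. It also shows why the labels stay valid after the image is resized to 224 x 224: normalized coordinates do not depend on the pixel resolution.

```python
def normalize_box(box, width, height):
    """Divide x coordinates by the width and y coordinates by the height."""
    xmin, xmax, ymin, ymax = box
    return (xmin / width, xmax / width, ymin / height, ymax / height)


# Box from the verification example, on an assumed 1920 x 1080 image.
norm = normalize_box((1093, 1396, 645, 727), width=1920, height=1080)

# Scale the same box to a 224 x 224 resize of the image and normalize again.
sx, sy = 224 / 1920, 224 / 1080
resized_box = (1093 * sx, 1396 * sx, 645 * sy, 727 * sy)
norm_resized = normalize_box(resized_box, width=224, height=224)

print(norm)
print(norm_resized)  # the same values, up to floating-point rounding
```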

sklearn's train_test_split function conveniently splits the data into training and test sets.

x_train, x_test, y_train, y_test = train_test_split(X, y, train_size=0.8, random_state=0)
x_train.shape, x_test.shape, y_train.shape, y_test.shape

Training

We are now ready to train a deep learning model for object detection. In this article we use the InceptionResNetV2 model with pretrained weights and train it on our data. First, import the necessary libraries from TensorFlow 2.3.0.

from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.layers import Dense, Dropout, Flatten, Input
from tensorflow.keras.models import Model
import tensorflow as tf

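What we need is an object detection model whose expected number of outputs is 4 (the coordinates of the diagonal points). We add a few neural network layers on top of the transfer learning model, as shown in the head section of the code below.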

inception_resnet = InceptionResNetV2(weights="imagenet", include_top=False,
                                     input_tensor=Input(shape=(224, 224, 3)))
inception_resnet.trainable = False
# --------------------- head
headmodel = inception_resnet.output
headmodel = Flatten()(headmodel)
headmodel = Dense(500, activation="relu")(headmodel)
headmodel = Dense(250, activation="relu")(headmodel)
headmodel = Dense(4, activation='sigmoid')(headmodel)
# ---------- model
model = Model(inputs=inception_resnet.input, outputs=headmodel)
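The final Dense layer uses a sigmoid activation, which squashes each of the four outputs into the range 0 to 1 and therefore matches the normalized labels. The sketch below shows the function itself in plain Python; the `raw_outputs` values are hypothetical pre-activation numbers chosen for illustration, not values produced by the model.

```python
import math


def sigmoid(z):
    """Logistic function written so that math.exp never overflows."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


raw_outputs = [-3.0, 0.0, 0.5, 4.0]  # hypothetical pre-activation values
activated = [sigmoid(z) for z in raw_outputs]
print(activated)
```

Because the outputs are bounded this way, multiplying them by the image width and height at prediction time always gives coordinates inside the image.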

Now compile and train the model.

# compile model
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4))
model.summary()

from tensorflow.keras.callbacks import TensorBoard

tfb = TensorBoard('object_detection')
history = model.fit(x=x_train, y=y_train, batch_size=10, epochs=200,
                    validation_data=(x_test, y_test), callbacks=[tfb])

Training the model usually takes 3 to 4 hours, depending on the speed of your computer. Here, we use TensorBoard to log the loss during training.
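The loss being logged is the mean squared error between the predicted and labeled normalized coordinates. As a rough illustration of what that number measures, here is a plain-Python sketch; `mse` is a hypothetical helper written for this example, and the two boxes are made-up values.

```python
def mse(y_true, y_pred):
    """Mean squared error over all coordinates of all boxes."""
    total, count = 0.0, 0
    for true_box, pred_box in zip(y_true, y_pred):
        for t, p in zip(true_box, pred_box):
            total += (t - p) ** 2
            count += 1
    return total / count


# One labeled box and one prediction that misses ymin by 0.25.
y_true = [(0.50, 0.75, 0.50, 0.75)]
y_pred = [(0.50, 0.75, 0.25, 0.75)]
loss = mse(y_true, y_pred)
print(loss)  # 0.015625
```

Because the error is squared and the coordinates are already between 0 and 1, even a clearly visible miss yields a small loss value, which is worth keeping in mind when reading the TensorBoard curves.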

Making Bounding Box Predictions

This is the final step. Here we put everything together and obtain the prediction for a given image.

import matplotlib.pyplot as plt

# create pipeline
def object_detection(path):
    # read image
    image = load_img(path)  # PIL object
    image = np.array(image, dtype=np.uint8)  # 8 bit array (0,255)
    image1 = load_img(path, target_size=(224, 224))
    # data preprocessing
    image_arr_224 = img_to_array(image1) / 255.0  # convert into array and get the normalized output
    h, w, d = image.shape
    test_arr = image_arr_224.reshape(1, 224, 224, 3)
    # make predictions
    coords = model.predict(test_arr)
    # denormalize the values
    denorm = np.array([w, w, h, h])
    coords = coords * denorm
    coords = coords.astype(np.int32)
    # draw bounding box on top of the image
    xmin, xmax, ymin, ymax = coords[0]
    pt1 = (int(xmin), int(ymin))  # plain ints for OpenCV
    pt2 = (int(xmax), int(ymax))
    print(pt1, pt2)
    cv2.rectangle(image, pt1, pt2, (0, 255, 0), 3)
    return image, coords

# ------ get prediction
path = './test_images/N207.jpeg'
image, cods = object_detection(path)

plt.figure(figsize=(10, 8))
plt.imshow(image)
plt.show()
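The denormalization step, and the ROI crop that the OCR stage would need next, can be sketched without any libraries. Both `denormalize_box` and `crop_roi` are hypothetical helpers, and the tiny 4 x 8 "image" is a stand-in whose pixels simply record their own position; with a real NumPy image the crop would be `image[ymin:ymax, xmin:xmax]`.

```python
def denormalize_box(norm_box, width, height):
    """Scale (xmin, xmax, ymin, ymax) in [0, 1] back to integer pixels."""
    nxmin, nxmax, nymin, nymax = norm_box
    return (int(nxmin * width), int(nxmax * width),
            int(nymin * height), int(nymax * height))


def crop_roi(rows, box):
    """Crop a row-major image given as a list of rows."""
    xmin, xmax, ymin, ymax = box
    return [row[xmin:xmax] for row in rows[ymin:ymax]]


# Stand-in image: 4 rows by 8 columns, each pixel stores its (row, col).
height, width = 4, 8
image = [[(r, c) for c in range(width)] for r in range(height)]

box = denormalize_box((0.25, 0.75, 0.5, 1.0), width, height)
roi = crop_roi(image, box)
print(box)
print(len(roi), len(roi[0]))
```

Note that the x range selects columns and the y range selects rows, which is why the crop indexes rows with ymin and ymax first.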

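This article covers only 50% of the project architecture. The next part involves extracting the text from the license plate and developing a RESTful API in Flask.

[Figure: output of the complete project]

Author: DEVI GUSKRA

Translated by the deephub translation team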
