
Day 04: Classic Convolutional Neural Networks Explained

Published: 2025/3/21


Contents

Day 04: Classic Convolutional Neural Networks Explained
- Assignment Description
- Sample Code
  - 1. Environment Setup
  - 2. Data Preparation
  - 3. Model Configuration
  - 4. Model Training
  - 5. Model Validation
  - 6. Model Prediction
- Completing the Assignment

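Assignment Description

Today's hands-on project is "mask classification" based on the classic convolutional neural network VGG.

Mask recognition means detecting every face in a densely crowded area, masked or not, and judging whether each person is wearing a mask. It usually consists of two functional units: one detects faces, the other classifies each face as masked or unmasked.

Compared with mask recognition in a production environment, this exercise is simplified: it implements only the classification model, which decides whether a given face is wearing a mask. The goal is to use this mask-recognition case to help you understand and master how to build a classic convolutional neural network with PaddlePaddle's dynamic graph (dygraph) mode.

Note: all datasets used in this exercise come from the Internet; please do not use them for commercial purposes.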

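Assignment requirements:

1. Based on what was covered in class, build a VGGNet and get it running. On that basis, you may try constructing other networks.
2. Think about and experiment with parameter tuning and optimization to raise the accuracy on the test set.

For links to the courseware and the dataset, see the introduction to this series: "Introduction Before Formal Study".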

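The day04 folder contains all the data we need. The post names characterData.zip as the dataset and CarID.png as the picture used for the final test, although the sample code below reads maskDetect.zip and runs prediction on infer_mask01.jpg.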

Sample Code

1. Environment Setup

# Import the required packages
import os
import zipfile
import random
import json
import paddle
import sys
import numpy as np
from PIL import Image
from PIL import ImageEnhance
import paddle.fluid as fluid
from multiprocessing import cpu_count
import matplotlib.pyplot as plt

# Parameter configuration
train_parameters = {
    "input_size": [3, 224, 224],                          # shape of the input image
    "class_dim": -1,                                      # number of classes (filled in later)
    "src_path": "/home/aistudio/work/maskDetect.zip",     # path of the raw dataset
    "target_path": "/home/aistudio/data/",                # directory to unzip into
    "train_list_path": "/home/aistudio/data/train.txt",   # path of train.txt
    "eval_list_path": "/home/aistudio/data/eval.txt",     # path of eval.txt
    "readme_path": "/home/aistudio/data/readme.json",     # path of readme.json
    "label_dict": {},                                     # label dictionary (filled in later)
    "num_epochs": 1,                                      # number of training epochs
    "train_batch_size": 8,                                # batch size during training
    "learning_strategy": {                                # optimizer-related settings
        "lr": 0.001                                       # learning rate
    }
}

2. Data Preparation

Unzip the raw dataset.

Split it into a training set and a validation set by a fixed ratio.

Shuffle the samples and generate the data lists.

Build the training data reader and the validation data reader.

def unzip_data(src_path, target_path):
    '''Unzip the raw dataset: extract the zip file at src_path into target_path.'''
    if not os.path.isdir(target_path + "maskDetect"):
        z = zipfile.ZipFile(src_path, 'r')
        z.extractall(path=target_path)
        z.close()


def get_data_list(target_path, train_list_path, eval_list_path):
    '''Generate the data lists.'''
    # Details of every class
    class_detail = []
    # Names of the folders that hold each class
    data_list_path = target_path + "maskDetect/"
    class_dirs = os.listdir(data_list_path)
    # Total number of images
    all_class_images = 0
    # Label of the current class
    class_label = 0
    # Number of classes
    class_dim = 0
    # Lines to be written to eval.txt and train.txt
    trainer_list = []
    eval_list = []
    # Walk through every class: ['maskimages', 'nomaskimages']
    for class_dir in class_dirs:
        if class_dir != ".DS_Store":
            class_dim += 1
            # Details of this class
            class_detail_list = {}
            eval_sum = 0
            trainer_sum = 0
            # Number of images in this class
            class_sum = 0
            # Path of the class folder
            path = data_list_path + class_dir
            # All images in the folder
            img_paths = os.listdir(path)
            for img_path in img_paths:
                name_path = path + '/' + img_path  # path of one image
                if class_sum % 10 == 0:  # every tenth image goes to validation
                    eval_sum += 1  # number of validation images
                    eval_list.append(name_path + "\t%d" % class_label + "\n")
                else:
                    trainer_sum += 1  # number of training images
                    trainer_list.append(name_path + "\t%d" % class_label + "\n")
                class_sum += 1  # images in this class
                all_class_images += 1  # images in all classes
            # class_detail entry for the descriptive json file
            class_detail_list['class_name'] = class_dir  # class name (folder name)
            class_detail_list['class_label'] = class_label  # class label
            class_detail_list['class_eval_images'] = eval_sum  # validation images in this class
            class_detail_list['class_trainer_images'] = trainer_sum  # training images in this class
            class_detail.append(class_detail_list)
            # Fill in the label dictionary
            train_parameters['label_dict'][str(class_label)] = class_dir
            class_label += 1
    # Fill in the number of classes
    train_parameters['class_dim'] = class_dim
    # Shuffle and write the lists
    random.shuffle(eval_list)
    with open(eval_list_path, 'a') as f:
        for eval_image in eval_list:
            f.write(eval_image)
    random.shuffle(trainer_list)
    with open(train_list_path, 'a') as f2:
        for train_image in trainer_list:
            f2.write(train_image)
    # Contents of the descriptive json file
    readjson = {}
    readjson['all_class_name'] = data_list_path  # parent directory
    readjson['all_class_images'] = all_class_images
    readjson['class_detail'] = class_detail
    jsons = json.dumps(readjson, sort_keys=True, indent=4, separators=(',', ': '))
    with open(train_parameters['readme_path'], 'w') as f:
        f.write(jsons)
    print('Data lists generated!')


def custom_reader(file_list):
    '''Custom reader.'''
    def reader():
        with open(file_list, 'r') as f:
            lines = [line.strip() for line in f]
            for line in lines:
                img_path, lab = line.strip().split('\t')
                img = Image.open(img_path)
                if img.mode != 'RGB':
                    img = img.convert('RGB')
                img = img.resize((224, 224), Image.BILINEAR)
                img = np.array(img).astype('float32')
                img = img.transpose((2, 0, 1))  # HWC to CHW
                img = img / 255  # scale pixel values to [0, 1]
                yield img, int(lab)
    return reader


# Initialize parameters
src_path = train_parameters['src_path']
target_path = train_parameters['target_path']
train_list_path = train_parameters['train_list_path']
eval_list_path = train_parameters['eval_list_path']
batch_size = train_parameters['train_batch_size']

# Unzip the raw data into the target directory
unzip_data(src_path, target_path)

# Split into training and validation sets, shuffle, generate the data lists.
# Before generating the lists, empty train.txt and eval.txt.
with open(train_list_path, 'w') as f:
    f.seek(0)
    f.truncate()
with open(eval_list_path, 'w') as f:
    f.seek(0)
    f.truncate()

# Generate the data lists
get_data_list(target_path, train_list_path, eval_list_path)

# Build the data readers
train_reader = paddle.batch(custom_reader(train_list_path),
                            batch_size=batch_size,
                            drop_last=True)
eval_reader = paddle.batch(custom_reader(eval_list_path),
                           batch_size=batch_size,
                           drop_last=True)
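The split rule inside get_data_list (every tenth image of a class goes to validation, the rest to training) can be looked at in isolation. The following is only a minimal sketch: `split_every_nth` is a hypothetical helper that does not appear in the sample code, and the file names are made up.

```python
import random


def split_every_nth(paths, label, n=10):
    """Send items at indices 0, n, 2n, ... to eval and the rest to train."""
    train, evaluation = [], []
    for index, path in enumerate(paths):
        line = "%s\t%d\n" % (path, label)  # same "path<TAB>label" format as train.txt
        if index % n == 0:
            evaluation.append(line)
        else:
            train.append(line)
    return train, evaluation


# Made-up file names standing in for the contents of one class folder.
fake_paths = ["maskimages/img_%03d.jpg" % i for i in range(25)]
train_lines, eval_lines = split_every_nth(fake_paths, label=0)
random.shuffle(train_lines)  # shuffle only after the split, as the sample code does
print(len(train_lines), len(eval_lines))
```

Because the rule depends on the order in which files are listed, and os.listdir returns entries in arbitrary order, the exact split may differ from one machine to another.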

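3. Model Configuration

The core of VGG is five groups of convolutions, with max-pooling between consecutive groups to reduce the spatial resolution. Within a group, several 3x3 convolutions are applied in a row; the number of filters grows from 64 in the shallowest group to 512 in the deepest, and all convolutions in the same group use the same number of filters. The convolutional part is followed by two fully connected layers and then the classification layer. Because the number of convolutions per group varies, there are 11-, 13-, 16- and 19-layer variants; the figure that accompanies this section shows the 16-layer structure.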
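To make the layer counting concrete, here is a small sketch that tabulates the groups for each variant. The per-group convolution counts are the commonly cited VGG configurations rather than something stated in this post, and the size calculation assumes padded 3x3 convolutions (size unchanged) followed by 2x2 max-pooling with stride 2 (size halved). `VGG_GROUPS` and `describe` are illustrative names.

```python
# Number of 3x3 convolutions in each of the five groups, per VGG variant.
VGG_GROUPS = {
    11: [1, 1, 2, 2, 2],
    13: [2, 2, 2, 2, 2],
    16: [2, 2, 3, 3, 3],
    19: [2, 2, 4, 4, 4],
}
GROUP_FILTERS = [64, 128, 256, 512, 512]
NUM_FC_LAYERS = 3  # two hidden fully connected layers plus the classifier


def describe(depth, input_size=224):
    """Return (filters, conv_count, output_size) for each of the five groups."""
    size = input_size
    rows = []
    for filters, convs in zip(GROUP_FILTERS, VGG_GROUPS[depth]):
        size //= 2  # the convolutions keep the size, the pooling halves it
        rows.append((filters, convs, size))
    return rows


for depth in sorted(VGG_GROUPS):
    # The "depth" in the name counts convolutional plus fully connected layers.
    assert sum(VGG_GROUPS[depth]) + NUM_FC_LAYERS == depth

print(describe(16))
```

For a 224x224 input the last group ends at 7x7 with 512 channels, which is where the 512*7*7 input size of the first fully connected layer comes from.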

class ConvPool(fluid.dygraph.Layer):
    '''Convolutions followed by pooling.'''
    def __init__(self,
                 num_channels,
                 num_filters,
                 filter_size,
                 pool_size,
                 pool_stride,
                 groups,
                 pool_padding=1,
                 pool_type='max',
                 conv_stride=1,
                 conv_padding=0,
                 act=None):
        super(ConvPool, self).__init__()
        self._conv2d_list = []
        for i in range(groups):
            # add_sublayer registers the layer under the given name and returns it
            conv2d = self.add_sublayer(
                'bb_%d' % i,
                fluid.dygraph.Conv2D(
                    num_channels=num_channels,  # input channels
                    num_filters=num_filters,    # number of filters
                    filter_size=filter_size,    # kernel size
                    stride=conv_stride,         # stride
                    padding=conv_padding,       # padding, 0 by default
                    act=act))
            # Later convolutions in the group take the previous output as
            # input, so their input channel count is num_filters.
            num_channels = num_filters
            self._conv2d_list.append(conv2d)
        self._pool2d = fluid.dygraph.Pool2D(
            pool_size=pool_size,        # pooling kernel size
            pool_type=pool_type,        # pooling type, max pooling by default
            pool_stride=pool_stride,    # pooling stride
            pool_padding=pool_padding)  # padding

    def forward(self, inputs):
        x = inputs
        for conv in self._conv2d_list:
            x = conv(x)
        x = self._pool2d(x)
        return x
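The padding arguments of ConvPool decide what spatial size reaches the fully connected layers, so it is worth checking them by hand. The sketch below uses the standard output-size formula, floor((size + 2*padding - kernel) / stride) + 1, for both convolution and pooling, and assumes the pooling layer rounds down (no ceil mode). `conv_out`, `pool_out` and `final_size` are illustrative names, and the (2, 2, 3, 3, 3) grouping is the 16-layer configuration.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output size of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2, padding=0):
    """Output size of a pooling layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def final_size(conv_padding, pool_padding, groups=(2, 2, 3, 3, 3), size=224):
    """Spatial size after five conv+pool groups of a 16-layer VGG."""
    for convs in groups:
        for _ in range(convs):
            size = conv_out(size, padding=conv_padding)
        size = pool_out(size, padding=pool_padding)
    return size


# Padded convolutions and unpadded pooling: each group exactly halves the size.
print(final_size(conv_padding=1, pool_padding=0))
# The defaults in the ConvPool signature: every 3x3 convolution shrinks the map by 2.
print(final_size(conv_padding=0, pool_padding=1))
```

Under these assumptions, conv_padding=1 with pool_padding=0 yields the 7x7 map that a 512*7*7 fully connected input expects, while the signature defaults (conv_padding=0, pool_padding=1) end at 3x3, so a network built on ConvPool needs to pass the paddings explicitly or size its first fully connected layer accordingly.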

Please complete the definition of the VGG network:

class VGGNet(fluid.dygraph.Layer):
    '''VGG network.'''
    def __init__(self):
        super(VGGNet, self).__init__()

    def forward(self, inputs, label=None):
        """Forward computation."""

4. Model Training

all_train_iter = 0
all_train_iters = []
all_train_costs = []
all_train_accs = []


def draw_train_process(title, iters, costs, accs, label_cost, label_acc):
    plt.title(title, fontsize=24)
    plt.xlabel("iter", fontsize=20)
    plt.ylabel("cost/acc", fontsize=20)
    plt.plot(iters, costs, color='red', label=label_cost)
    plt.plot(iters, accs, color='green', label=label_acc)
    plt.legend()
    plt.grid()
    plt.show()


def draw_process(title, color, iters, data, label):
    plt.title(title, fontsize=24)
    plt.xlabel("iter", fontsize=20)
    plt.ylabel(label, fontsize=20)
    plt.plot(iters, data, color=color, label=label)
    plt.legend()
    plt.grid()
    plt.show()


# Model training
# with fluid.dygraph.guard(place=fluid.CUDAPlace(0)):
with fluid.dygraph.guard():
    print(train_parameters['class_dim'])
    print(train_parameters['label_dict'])
    vgg = VGGNet()
    optimizer = fluid.optimizer.AdamOptimizer(
        learning_rate=train_parameters['learning_strategy']['lr'],
        parameter_list=vgg.parameters())
    for epoch_num in range(train_parameters['num_epochs']):
        for batch_id, data in enumerate(train_reader()):
            dy_x_data = np.array([x[0] for x in data]).astype('float32')
            y_data = np.array([x[1] for x in data]).astype('int64')
            y_data = y_data[:, np.newaxis]

            # Convert the NumPy arrays into dygraph variables
            img = fluid.dygraph.to_variable(dy_x_data)
            label = fluid.dygraph.to_variable(y_data)

            out, acc = vgg(img, label)
            loss = fluid.layers.cross_entropy(out, label)
            avg_loss = fluid.layers.mean(loss)

            # backward() runs the backward pass
            avg_loss.backward()
            optimizer.minimize(avg_loss)
            # Clear the gradients so the next step starts from zero
            vgg.clear_gradients()

            all_train_iter = all_train_iter + train_parameters['train_batch_size']
            all_train_iters.append(all_train_iter)
            # Record the mean loss of the batch rather than the first sample's loss
            all_train_costs.append(avg_loss.numpy()[0])
            all_train_accs.append(acc.numpy()[0])

            if batch_id % 1 == 0:
                print("Loss at epoch {} step {}: {}, acc: {}".format(
                    epoch_num, batch_id, avg_loss.numpy(), acc.numpy()))

    draw_train_process("training", all_train_iters, all_train_costs, all_train_accs,
                       "training cost", "training acc")
    draw_process("training loss", "red", all_train_iters, all_train_costs, "training loss")
    draw_process("training acc", "green", all_train_iters, all_train_accs, "training acc")

    # Save the model parameters
    fluid.save_dygraph(vgg.state_dict(), "vgg")
    print("Final loss: {}".format(avg_loss.numpy()))

5. Model Validation

# Model validation
with fluid.dygraph.guard():
    model, _ = fluid.load_dygraph("vgg")
    vgg = VGGNet()
    vgg.load_dict(model)
    vgg.eval()
    accs = []
    for batch_id, data in enumerate(eval_reader()):
        dy_x_data = np.array([x[0] for x in data]).astype('float32')
        y_data = np.array([x[1] for x in data]).astype('int64')
        y_data = y_data[:, np.newaxis]

        img = fluid.dygraph.to_variable(dy_x_data)
        label = fluid.dygraph.to_variable(y_data)

        out, acc = vgg(img, label)
        lab = np.argsort(out.numpy())
        accs.append(acc.numpy()[0])
    # Mean of the per-batch accuracies
    print(np.mean(accs))
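The number printed by the validation loop is the mean of per-batch accuracies over full batches only, because eval_reader is built with drop_last=True. The sketch below mirrors that computation with NumPy on made-up softmax outputs; `batch_accuracy` and `evaluated_samples` are hypothetical helpers, not part of the sample code.

```python
import numpy as np


def batch_accuracy(probs, labels):
    """Fraction of rows whose highest-probability class equals the label."""
    predicted = np.argmax(probs, axis=1)
    return float(np.mean(predicted == labels.reshape(-1)))


def evaluated_samples(num_samples, batch_size):
    """Samples that are actually scored when the last partial batch is dropped."""
    return (num_samples // batch_size) * batch_size


# Made-up softmax outputs for four images and their labels (shape [N, 1]).
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]], dtype="float32")
labels = np.array([[0], [1], [1], [1]], dtype="int64")

print(batch_accuracy(probs, labels))
print(evaluated_samples(19, 8))
```

With a small validation set, the dropped tail and the coarse granularity of per-batch accuracy both make the reported figure noisy, which is worth keeping in mind when comparing runs.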

6. Model Prediction

def load_image(img_path):
    '''Preprocess an image for prediction.'''
    img = Image.open(img_path)
    if img.mode != 'RGB':
        img = img.convert('RGB')
    img = img.resize((224, 224), Image.BILINEAR)
    img = np.array(img).astype('float32')
    img = img.transpose((2, 0, 1))  # HWC to CHW
    img = img / 255  # scale pixel values to [0, 1]
    return img


label_dic = train_parameters['label_dict']

# Model prediction
with fluid.dygraph.guard():
    model, _ = fluid.dygraph.load_dygraph("vgg")
    vgg = VGGNet()
    vgg.load_dict(model)
    vgg.eval()

    # Show the image to be predicted
    infer_path = '/home/aistudio/data/data23615/infer_mask01.jpg'
    img = Image.open(infer_path)
    plt.imshow(img)  # draw the image from the array
    plt.show()       # display it

    # Preprocess the image to be predicted
    infer_imgs = []
    infer_imgs.append(load_image(infer_path))
    infer_imgs = np.array(infer_imgs)

    for i in range(len(infer_imgs)):
        data = infer_imgs[i]
        dy_x_data = np.array(data).astype('float32')
        dy_x_data = dy_x_data[np.newaxis, :, :, :]
        img = fluid.dygraph.to_variable(dy_x_data)
        out = vgg(img)
        lab = np.argmax(out.numpy())  # argmax(): index of the largest value
        print("Sample {} was predicted as: {}".format(i + 1, label_dic[str(lab)]))

print("Done")
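The array handling in load_image and in the prediction loop (HWC to CHW, scaling to [0, 1], adding a batch axis, then argmax and a label lookup) can be checked without Paddle or a real picture. This is a sketch on random data: `to_model_input` is an illustrative name, and the `label_dict` contents and `fake_output` are assumed values, since the real mapping depends on the folder listing order and the real output comes from the trained network.

```python
import numpy as np


def to_model_input(hwc_uint8):
    """HWC uint8 image -> float32 batch of shape [1, C, H, W] with values in [0, 1]."""
    chw = hwc_uint8.astype("float32").transpose((2, 0, 1)) / 255
    return chw[np.newaxis, :, :, :]


# Random pixels standing in for a resized 224x224 RGB photo.
fake_image = np.random.randint(0, 256, size=(224, 224, 3), dtype=np.uint8)
batch = to_model_input(fake_image)
print(batch.shape, batch.dtype)

# Assumed label mapping and a made-up softmax output for one image.
label_dict = {"0": "maskimages", "1": "nomaskimages"}
fake_output = np.array([[0.83, 0.17]], dtype="float32")
predicted = label_dict[str(int(np.argmax(fake_output)))]
print(predicted)
```

The preprocessing at prediction time has to match the training reader exactly (same size, channel order and scaling), which is why load_image repeats the steps of custom_reader.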

Completing the Assignment

Defining the VGG network:

The code is similar to that of the previous days, but today's sample code gathers the model's many parameters in one place and adds a ConvPool class that combines convolution and pooling. That is more convenient, so we use it as is.

class VGGNet(fluid.dygraph.Layer):
    '''VGG network.'''
    def __init__(self):
        super(VGGNet, self).__init__()
        # Arguments: input channels, number of filters, kernel size,
        # pooling size, pooling stride, number of consecutive convolutions.
        # conv_padding=1 and pool_padding=0 are passed explicitly so that each
        # group halves the feature map: 224 -> 112 -> 56 -> 28 -> 14 -> 7.
        self.convpool01 = ConvPool(3, 64, 3, 2, 2, 2, pool_padding=0, conv_padding=1, act='relu')
        self.convpool02 = ConvPool(64, 128, 3, 2, 2, 2, pool_padding=0, conv_padding=1, act='relu')
        self.convpool03 = ConvPool(128, 256, 3, 2, 2, 3, pool_padding=0, conv_padding=1, act='relu')
        self.convpool04 = ConvPool(256, 512, 3, 2, 2, 3, pool_padding=0, conv_padding=1, act='relu')
        self.convpool05 = ConvPool(512, 512, 3, 2, 2, 3, pool_padding=0, conv_padding=1, act='relu')
        self.pool_5_shape = 512 * 7 * 7
        self.fc01 = fluid.dygraph.Linear(self.pool_5_shape, 4096, act='relu')
        self.fc02 = fluid.dygraph.Linear(4096, 4096, act='relu')
        self.fc03 = fluid.dygraph.Linear(4096, 2, act='softmax')

    def forward(self, inputs, label=None):
        """Forward computation."""
        out = self.convpool01(inputs)
        out = self.convpool02(out)
        out = self.convpool03(out)
        out = self.convpool04(out)
        out = self.convpool05(out)

        out = fluid.layers.reshape(out, shape=[-1, self.pool_5_shape])
        out = self.fc01(out)
        out = self.fc02(out)
        out = self.fc03(out)

        if label is not None:
            acc = fluid.layers.accuracy(input=out, label=label)
            return out, acc
        else:
            return out
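To get a feel for how large this network is, the trainable parameters can be counted by hand from the layer sizes in the definition above. The sketch assumes every convolution and fully connected layer has a bias term; `conv_params` and `fc_params` are illustrative helper names.

```python
def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus biases of one convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def fc_params(in_features, out_features):
    """Weights plus biases of one fully connected layer."""
    return in_features * out_features + out_features


# (input channels, filters, number of convolutions) for the five groups.
groups = [(3, 64, 2), (64, 128, 2), (128, 256, 3), (256, 512, 3), (512, 512, 3)]

conv_total = 0
for in_channels, filters, convs in groups:
    for i in range(convs):
        # Only the first convolution of a group sees the group's input channels.
        conv_total += conv_params(in_channels if i == 0 else filters, filters)

fc_total = fc_params(512 * 7 * 7, 4096) + fc_params(4096, 4096) + fc_params(4096, 2)

print(conv_total, fc_total, conv_total + fc_total)
```

Most of the parameters sit in the first fully connected layer, which is one reason a network of this size can overfit a small dataset quickly.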

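We initially trained for 10 epochs, and the training accuracy kept jumping up and down.

The plotted curves show the same thing: the training accuracy oscillates the whole time.

Accuracy on the test set is about 0.6.

Mask recognition is a binary classification problem with only two outcomes, masked and unmasked. When we ran prediction on a picture of a masked face, the model only just managed to get it right.

An accuracy of about 0.6 clearly calls for optimization, so we tuned the following three hyperparameters.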

Number of training epochs (num_epochs)
Learning rate (lr)
Batch size during training (train_batch_size)

We raised the number of epochs to 20, lowered the learning rate to 0.0001, and increased the training batch size to 16. All of these can be changed in the train_parameters dictionary from the first step, Environment Setup.
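Written out against the configuration dictionary, the change looks roughly like this. Only the three tuned keys are shown; `updates_per_run` is a hypothetical helper, and the figure of 160 training images is made up purely to illustrate the arithmetic.

```python
import copy

# Baseline values described in the post: 10 epochs, batch size 8, lr 0.001.
baseline = {
    "num_epochs": 10,
    "train_batch_size": 8,
    "learning_strategy": {"lr": 0.001},
}

# Deep copy so that editing the nested "learning_strategy" dict
# does not silently change the baseline as well.
tuned = copy.deepcopy(baseline)
tuned["num_epochs"] = 20
tuned["train_batch_size"] = 16
tuned["learning_strategy"]["lr"] = 0.0001


def updates_per_run(num_train_images, params):
    """Optimizer steps in a full run when the last partial batch is dropped."""
    return params["num_epochs"] * (num_train_images // params["train_batch_size"])


print(updates_per_run(160, baseline), updates_per_run(160, tuned))
```

Doubling both the epochs and the batch size leaves the number of optimizer steps unchanged in this made-up case, so the smoother training reported afterwards would come from the smaller learning rate and the less noisy gradients of larger batches rather than from more updates.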

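One point worth stressing: raising the batch size a little can improve accuracy, but batch_size should not be picked arbitrarily. It is usually set to a multiple of 8, which tends to make the best use of the GPU's parallel hardware.

After retraining, the accuracy on the training set gradually converges to 1.0.

Accuracy on the test set also reaches 1.0, which is very nice.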

Because there is relatively little training data this time, the model does not generalize well. Experienced members of the study group suggested using data augmentation to raise the prediction success rate in real-world scenes.
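As one possible starting point, the sketch below applies two simple augmentations, a random horizontal flip and a random brightness change, to an image in the CHW, [0, 1] format that custom_reader produces. It is not the method used in the post: `augment`, the flip probability and the 0.8 to 1.2 brightness range are all illustrative assumptions, and it works on NumPy arrays instead of the PIL ImageEnhance module imported in the sample code.

```python
import numpy as np


def augment(chw, rng):
    """Randomly flip a CHW image left-right and jitter its brightness."""
    out = chw
    if rng.random() < 0.5:
        out = out[:, :, ::-1]  # reverse the width axis: horizontal flip
    factor = rng.uniform(0.8, 1.2)  # brightness multiplier (assumed range)
    out = np.clip(out * factor, 0.0, 1.0)  # keep values inside [0, 1]
    return out.astype("float32")


rng = np.random.default_rng(0)
image = rng.random((3, 224, 224)).astype("float32")  # stand-in for one training image
augmented = augment(image, rng)
print(augmented.shape, augmented.dtype)
```

Such a function would be applied only inside the training reader, so that the validation and prediction images stay unmodified.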
