【Neural Networks】(6) Convolutional Neural Network (VGG16), Case Study: 4-Class Bird Image Classification
Hello everyone. Today I'd like to share the VGG16 convolutional neural network model in TensorFlow 2.0. The case study: given 200 images of each of four bird species, build a convolutional network that predicts which class an image belongs to.
1. Data Loading
Store the bird images in one folder per class and read them in batches with tf.keras.preprocessing.image_dataset_from_directory(), resizing every image to 224x224 on load. The label_mode parameter controls how the target y is encoded: 'int' means integer class indices (0, 1, 2, 3); 'categorical' means one-hot encoding, where the element at the class index is 1, so an image of the second class is encoded as [0, 1, 0, 0]; 'binary' is for binary classification.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import Sequential, optimizers, layers, Model

#(1) Data loading: returns the training, validation, and test sets
def get_data(height, width, batchsz):
    # Training set
    filepath1 = 'C:/Users/admin/.spyder-py3/test/數據集/4種鳥分類/new_data/train'
    train_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath1,                    # training-set directory
        label_mode='categorical',     # one-hot encode the labels
        image_size=(height, width),   # resize every image on load
        batch_size=batchsz)           # 32 images per batch
    # Validation set
    filepath2 = 'C:/Users/admin/.spyder-py3/test/數據集/4種鳥分類/new_data/val'
    val_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath2,                    # validation-set directory
        label_mode='categorical',
        image_size=(height, width),
        batch_size=batchsz)
    # Test set
    filepath3 = 'C:/Users/admin/.spyder-py3/test/數據集/4種鳥分類/new_data/test'
    test_ds = tf.keras.preprocessing.image_dataset_from_directory(
        filepath3,                    # test-set directory
        label_mode='categorical',
        image_size=(height, width),
        batch_size=batchsz)
    # Return the datasets
    return train_ds, val_ds, test_ds

# Read the data: training, validation, and test sets
train_ds, val_ds, test_ds = get_data(height=224, width=224, batchsz=32)
# Check the class names
class_names = train_ds.class_names
print('Classes:', class_names)
# Classes: ['Bananaquit', 'Black Skimmer', 'Black Throated Bushtiti', 'Cockatoo']

# Inspect one batch of the dataset
sample = next(iter(train_ds))  # take one batch of training data
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:', sample[1].shape)
# x_batch.shape: (32, 224, 224, 3) y_batch.shape: (32, 4)
print('y[:5]:', sample[1][:5])  # first five labels
2. Data Preprocessing
Use .map() to cast x and y in every dataset and scale each image's pixel values to [-1, 1]. .shuffle() shuffles the order of the training samples without breaking the pairing between features x and labels y. iter() creates an iterator, and next() pulls one batch of training data at a time.
#(2) Display some images
import matplotlib.pyplot as plt

for i in range(15):
    plt.subplot(3, 5, i+1)
    plt.imshow(sample[0][i] / 255.0)  # sample[0] holds the batch's images; scale to [0,1] for display
    plt.xticks([])  # hide the axis ticks
    plt.yticks([])
plt.show()

#(3) Data preprocessing
# Preprocessing function
def processing(x, y):
    x = 2 * tf.cast(x, dtype=tf.float32) / 255.0 - 1  # scale pixels to [-1, 1]
    y = tf.cast(y, dtype=tf.int32)  # cast the labels
    return x, y

# Apply it to every dataset; shuffle only the training set
train_ds = train_ds.map(processing).shuffle(10000)
val_ds = val_ds.map(processing)
test_ds = test_ds.map(processing)

# Inspect the data again
sample = next(iter(train_ds))  # take one batch of training data
print('x_batch.shape:', sample[0].shape, 'y_batch.shape:', sample[1].shape)
# x_batch.shape: (32, 224, 224, 3) y_batch.shape: (32, 4)
print('y[:5]:', sample[1][:5])  # first five labels
# [[0 0 1 0], [1 0 0 0], [0 1 0 0], [0 1 0 0], [1 0 0 0]]
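As a quick sanity check on the scaling used in processing() (an illustrative helper, not part of the original script), the mapping x -> 2*x/255 - 1 sends pixel value 0 to -1, 127.5 to 0, and 255 to +1:

```python
def normalize(pixel):
    # same mapping as in processing(): scale [0, 255] to [-1, 1]
    return 2 * pixel / 255.0 - 1

print(normalize(0), normalize(127.5), normalize(255))  # -1.0 0.0 1.0
```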
The bird images look like this:
3. Building the VGG16 Network
The VGG16 architecture is shown in the figure below; for the underlying theory, see the earlier post 深度學習-VGG16原理詳解.
1) The input image is 224x224x3. It passes through two 3x3 convolutions, each with 64 filters over the 3 input channels, stride 1, padding='same', followed by ReLU; the output is 224x224x64.
2) Max pooling with a 2x2 window and stride 2 halves the spatial size to 112x112x64.
3) Two 3x3 convolutions with 128 filters, with ReLU: 112x112x128.
4) Max pooling: 56x56x128.
5) Three 3x3 convolutions with 256 filters, with ReLU: 56x56x256.
6) Max pooling: 28x28x256.
7) Three 3x3 convolutions with 512 filters, with ReLU: 28x28x512.
8) Max pooling: 14x14x512.
9) Three more 3x3 convolutions with 512 filters, with ReLU: 14x14x512.
10) Max pooling: 7x7x512.
11) Flatten() unrolls the feature map into a vector of length 512x7x7 = 25088.
12) Three fully connected layers follow, two of size 4096 and one of size 1000, each with ReLU.
13) Finally, softmax produces the 1000 class predictions.
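The layer sizes above can be cross-checked against the usual parameter-count formula for a convolutional layer: k x k x in_channels weights per filter, out_channels filters, plus one bias per filter. A small illustrative helper (not part of the original script):

```python
def conv2d_params(k, in_ch, out_ch):
    # k*k*in_ch weights per filter, out_ch filters, plus one bias per filter
    return k * k * in_ch * out_ch + out_ch

print(conv2d_params(3, 3, 64))   # 1792: the first conv layer (RGB input)
print(conv2d_params(3, 64, 64))  # 36928: the second conv layer
```

These match the Param # column in the model summary further below.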
The code below implements this network. Since we need 4 classes here, we simply change the final 1000 outputs to 4.
#(4) Build the CNN: VGG16
def VGG16(input_shape=(224, 224, 3), output_shape=4):
    # Input layer
    input_tensor = keras.Input(shape=input_shape)
    # unit1: two convolutions, then pooling (spatial size halves)
    x = layers.Conv2D(64, (3, 3), activation='relu', strides=1, padding='same')(input_tensor)  # [224,224,64]
    x = layers.Conv2D(64, (3, 3), activation='relu', strides=1, padding='same')(x)  # [224,224,64]
    x = layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)  # [112,112,64]
    # unit2
    x = layers.Conv2D(128, (3, 3), activation='relu', strides=1, padding='same')(x)  # [112,112,128]
    x = layers.Conv2D(128, (3, 3), activation='relu', strides=1, padding='same')(x)  # [112,112,128]
    x = layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)  # [56,56,128]
    # unit3
    x = layers.Conv2D(256, (3, 3), activation='relu', strides=1, padding='same')(x)  # [56,56,256]
    x = layers.Conv2D(256, (3, 3), activation='relu', strides=1, padding='same')(x)  # [56,56,256]
    x = layers.Conv2D(256, (3, 3), activation='relu', strides=1, padding='same')(x)  # [56,56,256]
    x = layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)  # [28,28,256]
    # unit4
    x = layers.Conv2D(512, (3, 3), activation='relu', strides=1, padding='same')(x)  # [28,28,512]
    x = layers.Conv2D(512, (3, 3), activation='relu', strides=1, padding='same')(x)  # [28,28,512]
    x = layers.Conv2D(512, (3, 3), activation='relu', strides=1, padding='same')(x)  # [28,28,512]
    x = layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)  # [14,14,512]
    # unit5
    x = layers.Conv2D(512, (3, 3), activation='relu', strides=1, padding='same')(x)  # [14,14,512]
    x = layers.Conv2D(512, (3, 3), activation='relu', strides=1, padding='same')(x)  # [14,14,512]
    x = layers.Conv2D(512, (3, 3), activation='relu', strides=1, padding='same')(x)  # [14,14,512]
    x = layers.MaxPool2D(pool_size=(2, 2), strides=(2, 2))(x)  # [7,7,512]
    # unit6
    x = layers.Flatten()(x)  # flatten to [None, 25088]
    x = layers.Dense(4096, activation='relu')(x)  # [None,4096]
    x = layers.Dense(4096, activation='relu')(x)  # [None,4096]
    # Output layer; no softmax is applied to the output
    output_tensor = layers.Dense(output_shape)(x)  # [None,4]
    # Build the model
    model = Model(inputs=input_tensor, outputs=output_tensor)
    return model

# Build the VGG16 model
model = VGG16()
# Print the model structure
model.summary()
The network architecture is as follows:
Model: "model"
_________________________________________________________________
 Layer (type)                  Output Shape              Param #
=================================================================
 input_1 (InputLayer)          [(None, 224, 224, 3)]     0
 conv2d (Conv2D)               (None, 224, 224, 64)      1792
 conv2d_1 (Conv2D)             (None, 224, 224, 64)      36928
 max_pooling2d (MaxPooling2D)  (None, 112, 112, 64)      0
 conv2d_2 (Conv2D)             (None, 112, 112, 128)     73856
 conv2d_3 (Conv2D)             (None, 112, 112, 128)     147584
 max_pooling2d_1 (MaxPooling2D) (None, 56, 56, 128)      0
 conv2d_4 (Conv2D)             (None, 56, 56, 256)       295168
 conv2d_5 (Conv2D)             (None, 56, 56, 256)       590080
 conv2d_6 (Conv2D)             (None, 56, 56, 256)       590080
 max_pooling2d_2 (MaxPooling2D) (None, 28, 28, 256)      0
 conv2d_7 (Conv2D)             (None, 28, 28, 512)       1180160
 conv2d_8 (Conv2D)             (None, 28, 28, 512)       2359808
 conv2d_9 (Conv2D)             (None, 28, 28, 512)       2359808
 max_pooling2d_3 (MaxPooling2D) (None, 14, 14, 512)      0
 conv2d_10 (Conv2D)            (None, 14, 14, 512)       2359808
 conv2d_11 (Conv2D)            (None, 14, 14, 512)       2359808
 conv2d_12 (Conv2D)            (None, 14, 14, 512)       2359808
 max_pooling2d_4 (MaxPooling2D) (None, 7, 7, 512)        0
 flatten (Flatten)             (None, 25088)             0
 dense (Dense)                 (None, 4096)              102764544
 dense_1 (Dense)               (None, 4096)              16781312
 dense_2 (Dense)               (None, 4)                 16388
=================================================================
Total params: 134,276,932
Trainable params: 134,276,932
Non-trainable params: 0
_________________________________________________________________
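The total of 134,276,932 parameters can be reproduced from the architecture alone. A sketch (the layer tuples below are transcribed from the summary above):

```python
# (kernel, in_channels, out_channels) for the 13 conv layers
convs = [(3, 3, 64), (3, 64, 64),
         (3, 64, 128), (3, 128, 128),
         (3, 128, 256), (3, 256, 256), (3, 256, 256),
         (3, 256, 512), (3, 512, 512), (3, 512, 512),
         (3, 512, 512), (3, 512, 512), (3, 512, 512)]
# (in, out) for the 3 dense layers; 512*7*7 = 25088 inputs after Flatten
denses = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 4)]
total = (sum(k * k * i * o + o for k, i, o in convs)
         + sum(i * o + o for i, o in denses))
print(total)  # 134276932
```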
4. Compiling the Network
When compiling with .compile(), use cross-entropy as the loss and set from_logits=True. Since the output layer applies no softmax to turn its raw outputs into probabilities, setting this parameter to True makes the loss function convert the logits to probabilities internally before comparing them with the ground truth; here the ground-truth y is the one-hot encoded label.
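To make the from_logits=True behavior concrete, here is a minimal pure-Python sketch (using only the math module, not TensorFlow): cross-entropy on logits is just softmax followed by the usual cross-entropy. The logit values are made up for illustration.

```python
import math

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def cross_entropy(onehot, probs):
    return -sum(t * math.log(p) for t, p in zip(onehot, probs))

logits = [1.2, 3.4, 0.5, -0.7]  # raw network outputs (no softmax applied)
y_true = [0, 1, 0, 0]           # one-hot label: class 2
# from_logits=True means: apply softmax first, then ordinary cross-entropy
loss = cross_entropy(y_true, softmax(logits))
print(loss)
```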
#(5) Model configuration
# Optimizer
opt = optimizers.Adam(learning_rate=1e-4)  # learning rate
model.compile(optimizer=opt,
              loss=keras.losses.CategoricalCrossentropy(from_logits=True),  # loss
              metrics=['accuracy'])  # evaluation metric

# Train on the training set, validating on the validation set
history = model.fit(train_ds, validation_data=val_ds, epochs=30)  # 30 epochs

#(6) After training, plot the loss and accuracy curves
# ==1== Accuracy
train_acc = history.history['accuracy']      # training accuracy
val_acc = history.history['val_accuracy']    # validation accuracy
# ==2== Loss
train_loss = history.history['loss']         # training loss
val_loss = history.history['val_loss']       # validation loss
# ==3== Plot
epochs_range = range(len(train_acc))
plt.figure(figsize=(10, 5))
# Accuracy curves
plt.subplot(1, 2, 1)
plt.plot(epochs_range, train_acc, label='train_acc')
plt.plot(epochs_range, val_acc, label='val_acc')
plt.legend()
# Loss curves
plt.subplot(1, 2, 2)
plt.plot(epochs_range, train_loss, label='train_loss')
plt.plot(epochs_range, val_loss, label='val_loss')
plt.legend()
plt.show()
5. Results
As the curves show, the network predicts well: by around epoch 25 the accuracy reaches roughly 99%. When training for many epochs, consider adding early stopping that saves the best weights and halts training once the validation metrics stop improving, which saves training time.
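A sketch of what that could look like with Keras's standard EarlyStopping callback; the patience value of 5 is an arbitrary illustrative choice, not from the original article:

```python
from tensorflow import keras

# Stop when val_loss has not improved for 5 epochs, and roll back to the best weights
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss',
                                           patience=5,
                                           restore_best_weights=True)
history = model.fit(train_ds, validation_data=val_ds, epochs=30,
                    callbacks=[early_stop])
```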
The loss and accuracy during training:
Epoch 1/30
13/13 [==============================] - 7s 293ms/step - loss: 1.3627 - accuracy: 0.3116 - val_loss: 1.3483 - val_accuracy: 0.5075
Epoch 2/30
13/13 [==============================] - 3s 173ms/step - loss: 1.1267 - accuracy: 0.5251 - val_loss: 1.0235 - val_accuracy: 0.5226
------------------------------------------------------------------------------------------
(N lines omitted)
------------------------------------------------------------------------------------------
Epoch 26/30
13/13 [==============================] - 2s 174ms/step - loss: 0.1184 - accuracy: 0.9874 - val_loss: 0.1093 - val_accuracy: 0.9774
Epoch 27/30
13/13 [==============================] - 2s 174ms/step - loss: 0.3208 - accuracy: 0.9196 - val_loss: 0.2678 - val_accuracy: 0.9347
Epoch 28/30
13/13 [==============================] - 2s 172ms/step - loss: 0.2366 - accuracy: 0.9322 - val_loss: 0.1247 - val_accuracy: 0.9648
Epoch 29/30
13/13 [==============================] - 3s 173ms/step - loss: 0.1027 - accuracy: 0.9648 - val_loss: 0.0453 - val_accuracy: 0.9849
Epoch 30/30
13/13 [==============================] - 3s 171ms/step - loss: 0.0491 - accuracy: 0.9849 - val_loss: 0.0250 - val_accuracy: 0.9925
6. An Alternative Approach
For more flexible control over the loss and accuracy computation, you can skip .compile() and .fit(): after building the model, write the forward pass yourself and get the same training behavior. The code below can replace step (5) in section 4.
# Optimizer
optimizer = optimizers.Adam(learning_rate=1e-5)

# Record each epoch's accuracy and loss for training and validation
train_acc = []
train_loss = []
val_acc = []
val_loss = []

# Note: this loop treats y as integer class indices, so the datasets
# must be loaded with label_mode='int' rather than 'categorical'.
for epoch in range(30):  # 30 epochs
    train_total_sum = 0
    train_total_loss = 0
    train_total_correct = 0
    val_total_sum = 0
    val_total_loss = 0
    val_total_correct = 0

    #(5) Training
    for step, (x, y) in enumerate(train_ds):  # one batch of training data at a time
        # Track gradients
        with tf.GradientTape() as tape:
            # Forward pass: raw scores for each class
            logits = model(x)
            # Accuracy
            prob = tf.nn.softmax(logits, axis=1)  # class probabilities
            predict = tf.argmax(prob, axis=1, output_type=tf.int32)  # index of the largest probability
            correct = tf.cast(tf.equal(predict, y), dtype=tf.int32)  # 1 where prediction matches label
            correct = tf.reduce_sum(correct)  # number of correct predictions in the batch
            total = x.shape[0]  # batch size
            train_total_sum += total  # samples seen this epoch
            train_total_correct += correct  # correct predictions this epoch
            acc = correct / total  # batch accuracy
            # Loss: one-hot encode the labels (4 classes), softmax applied inside the loss
            y_onehot = tf.one_hot(y, depth=4)
            loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)
            loss_avg = tf.reduce_mean(loss)  # mean loss over the batch
            train_total_loss += tf.reduce_sum(loss)  # accumulate the epoch loss
        # Gradients of the loss w.r.t. all trainable weights and biases
        grads = tape.gradient(loss_avg, model.trainable_variables)
        # Apply the gradient update
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        # Print the loss and accuracy every 20 batches
        if step % 20 == 0:
            print('train', 'step:', step, 'loss:', float(loss_avg), 'acc:', float(acc))
    # Record the epoch's averages: correct predictions / samples, total loss / samples
    train_acc.append(train_total_correct / train_total_sum)
    train_loss.append(train_total_loss / train_total_sum)

    #(6) Validation
    for step, (x, y) in enumerate(val_ds):  # one batch of validation data at a time
        # Forward pass
        logits = model(x)
        # Accuracy
        prob = tf.nn.softmax(logits, axis=1)
        predict = tf.argmax(prob, axis=1, output_type=tf.int32)
        correct = tf.reduce_sum(tf.cast(tf.equal(predict, y), dtype=tf.int32))
        val_total_correct += correct
        total = x.shape[0]
        val_total_sum += total
        acc = correct / total
        # Loss
        y_onehot = tf.one_hot(y, depth=4)
        loss = tf.losses.categorical_crossentropy(y_onehot, logits, from_logits=True)
        loss_avg = tf.reduce_mean(loss)
        val_total_loss += tf.reduce_sum(loss)
        # Print every 10 batches
        if step % 10 == 0:
            print('val', 'step:', step, 'loss:', float(loss_avg), 'acc:', float(acc))
    # Record the epoch's average accuracy and loss
    val_acc.append(val_total_correct / val_total_sum)
    val_loss.append(val_total_loss / val_total_sum)
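The accuracy bookkeeping in the loop (argmax over class probabilities, compare with integer labels, count matches) can be sketched in pure Python; the probability values below are made up for illustration:

```python
def batch_accuracy(probs, labels):
    # argmax over each row of class probabilities, compare with the integer labels
    predictions = [row.index(max(row)) for row in probs]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

probs = [[0.1, 0.7, 0.1, 0.1],   # predicted class 1
         [0.6, 0.2, 0.1, 0.1],   # predicted class 0
         [0.2, 0.2, 0.5, 0.1]]   # predicted class 2
labels = [1, 0, 3]
print(batch_accuracy(probs, labels))  # 2 of 3 correct
```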