VGG-16 Network Structure
1. Introduction to the VGG-16 Network Architecture
VGGNet is a deep convolutional neural network developed jointly by the Visual Geometry Group at the University of Oxford and researchers at Google DeepMind.
VGGNet explored the relationship between a convolutional network's depth and its performance: by repeatedly stacking 3*3 convolution kernels and 2*2 max-pooling layers, it successfully built networks 16 to 19 layers deep. Compared with the previous state-of-the-art architectures, VGGNet reduced the error rate substantially.
The VGGNet paper uses only 3*3 convolution kernels and 2*2 max-pooling kernels throughout, improving performance purely by making the network deeper.
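Why small kernels suffice: two stacked 3*3, stride-1 convolutions cover the same 5*5 window as a single 5*5 convolution, and three cover a 7*7 window, while using fewer weights and adding extra non-linearities in between. A quick check of the parameter arithmetic (illustrative sketch only; C is an assumed channel count, biases ignored):

    C = 512
    two_3x3 = 2 * (3 * 3 * C * C)    # stacked pair of 3x3 convs -> 5x5 receptive field
    one_5x5 = 5 * 5 * C * C          # single 5x5 conv, same receptive field
    three_3x3 = 3 * (3 * 3 * C * C)  # stacked triple of 3x3 convs -> 7x7 receptive field
    one_7x7 = 7 * 7 * C * C          # single 7x7 conv, same receptive field
    print(two_3x3 < one_5x5, three_3x3 < one_7x7)  # True True (18C^2 vs 25C^2, 27C^2 vs 49C^2)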
The VGG-16 and VGG-19 structures are as follows: [figure: VGG-16 and VGG-19 layer configurations]
Summary:
The "16" in VGG-16 means that the network has 16 layers with trainable parameters (13 convolutional and 3 fully connected), about 138 million parameters in total.
The VGG-16 structure is very regular and has few hyperparameters to tune. It focuses on building a simple network in which a few convolution layers are always followed by a pooling layer that shrinks the image; in other words, it uses only 3*3 convolution kernels and 2*2 max-pooling layers throughout.
Convolution layers: CONV = 3*3 filters, s = 1, same-padding convolution.
Pooling layers: MAX_POOL = 2*2, s = 2.
Advantage: the network structure is simple and uniform. Drawback: the number of parameters to train is very large.
As the network deepens, the image width and height shrink by a regular pattern, halving after each pooling layer, while the number of channels doubles (until it reaches 512); the sketch after this list verifies these shapes and the parameter count.
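These numbers are easy to verify. The sketch below (an illustrative script, not from the original post) pushes a 224*224*3 input through the five conv blocks and the three fully connected layers, printing the shape after each pool and totaling the trainable parameters; it arrives at about 138.36 million:

    # Each entry is (number of 3x3 conv layers in the block, output channels).
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    h = w = 224
    c = 3
    params = 0
    for n_convs, n_out in blocks:
        for _ in range(n_convs):
            params += 3 * 3 * c * n_out + n_out   # 3x3 SAME conv, stride 1: h, w unchanged
            c = n_out
        h, w = h // 2, w // 2                     # 2x2 max pool, stride 2: halves h and w
        print("after pool: %dx%dx%d" % (h, w, c))
    n_in = h * w * c                              # 7*7*512 = 25088 after flattening
    for n_out in (4096, 4096, 1000):              # fc6, fc7, fc8
        params += n_in * n_out + n_out
        n_in = n_out
    print("total parameters: %.2f million" % (params / 1e6))  # ~138.36 million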
2. Implementing the Overall Architecture in Code
The file VGG_16.py implements the forward-propagation pass and the network parameters:
#!/usr/bin/env python
# -*- coding:utf-8 -*-
# Author: ZhengzhengLiu
import tensorflow as tf

# VGG-16 uses 3*3 convolution kernels and 2*2 pooling kernels throughout.

# Helper that creates a convolution layer
def conv_op(input_op, name, kh, kw, n_out, dh, dw, p):
    """
    param:
        input_op -- input tensor
        name -- name of this layer
        kh -- kernel height
        kw -- kernel width
        n_out -- number of kernels / output channels
        dh -- vertical stride
        dw -- horizontal stride
        p -- parameter dict
    return:
        A -- output of the convolution layer
    """
    n_in = input_op.get_shape()[-1].value  # number of input channels
    with tf.variable_scope(name):
        weights = tf.get_variable(name="w", shape=[kh, kw, n_in, n_out], dtype=tf.float32,
                                  initializer=tf.contrib.layers.xavier_initializer_conv2d())
        biases = tf.get_variable(name="b", shape=[n_out], dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.0), trainable=True)
        conv = tf.nn.conv2d(input=input_op, filter=weights, strides=[1, dh, dw, 1], padding="SAME")
        Z = tf.nn.bias_add(conv, biases)
        A = tf.nn.relu(Z, name="relu")
        p[name + "w"] = weights
        p[name + "b"] = biases
        return A

# Helper that creates a max-pooling layer
def maxpool_op(input_op, name, kh, kw, dh, dw):
    """
    param:
        input_op -- input tensor
        name -- name of this layer
        kh -- pooling-kernel height
        kw -- pooling-kernel width
        dh -- vertical stride
        dw -- horizontal stride
    return:
        pool -- the pooling op of this layer
    """
    pool = tf.nn.max_pool(input_op, ksize=[1, kh, kw, 1], strides=[1, dh, dw, 1],
                          padding="SAME", name=name)
    return pool

# Helper that creates a fully connected layer
def fc_op(input_op, name, n_out, p, relu=True):
    """
    param:
        input_op -- input tensor
        name -- name of this layer
        n_out -- number of output channels
        p -- parameter dict
        relu -- whether to apply the ReLU non-linearity (False for the logits layer)
    return:
        A -- output of the fully connected layer
    """
    n_in = input_op.get_shape()[-1].value
    with tf.variable_scope(name):
        weights = tf.get_variable(name="w", shape=[n_in, n_out], dtype=tf.float32,
                                  initializer=tf.contrib.layers.xavier_initializer())
        # biases are initialized to a small positive value rather than 0 to avoid dead neurons
        biases = tf.get_variable(name="b", shape=[n_out], dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        if relu:
            # tf.nn.relu_layer multiplies input_op by weights, adds biases, then applies ReLU
            A = tf.nn.relu_layer(input_op, weights, biases, name="relu")
        else:
            A = tf.nn.bias_add(tf.matmul(input_op, weights), biases, name="linear")
        p[name + "w"] = weights
        p[name + "b"] = biases
        return A

# Build the VGG-16 network
def VGG_16(input_op, keep_prob):
    """
    param:
        input_op -- input tensor
        keep_prob -- placeholder controlling the dropout keep probability
    return:
        prediction -- predicted class
        softmax -- softmax classification output
        fc8 -- last fully connected layer (logits)
        p -- parameter dict
    """
    p = {}  # initialize the parameter dict

    # Conv block 1: two convolution layers and one max-pooling layer.
    # Both conv layers use 3*3 kernels, 64 kernels each, stride s=1; output: 224*224*64
    conv1_1 = conv_op(input_op, name="conv1_1", kh=3, kw=3, n_out=64, dh=1, dw=1, p=p)
    conv1_2 = conv_op(conv1_1, name="conv1_2", kh=3, kw=3, n_out=64, dh=1, dw=1, p=p)
    # 2*2 max pooling with stride s=2; output: 112*112*64
    pool1 = maxpool_op(conv1_2, name="pool1", kh=2, kw=2, dh=2, dw=2)

    # Conv block 2: two convolution layers and one max-pooling layer.
    # Both conv layers use 3*3 kernels, 128 kernels each, stride s=1; output: 112*112*128
    conv2_1 = conv_op(pool1, name="conv2_1", kh=3, kw=3, n_out=128, dh=1, dw=1, p=p)
    conv2_2 = conv_op(conv2_1, name="conv2_2", kh=3, kw=3, n_out=128, dh=1, dw=1, p=p)
    # 2*2 max pooling with stride s=2; output: 56*56*128
    pool2 = maxpool_op(conv2_2, name="pool2", kh=2, kw=2, dh=2, dw=2)

    # Conv block 3: three convolution layers and one max-pooling layer.
    # All three conv layers use 3*3 kernels, 256 kernels each, stride s=1; output: 56*56*256
    conv3_1 = conv_op(pool2, name="conv3_1", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
    conv3_2 = conv_op(conv3_1, name="conv3_2", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
    conv3_3 = conv_op(conv3_2, name="conv3_3", kh=3, kw=3, n_out=256, dh=1, dw=1, p=p)
    # 2*2 max pooling with stride s=2; output: 28*28*256
    pool3 = maxpool_op(conv3_3, name="pool3", kh=2, kw=2, dh=2, dw=2)

    # Conv block 4: three convolution layers and one max-pooling layer.
    # All three conv layers use 3*3 kernels, 512 kernels each, stride s=1; output: 28*28*512
    conv4_1 = conv_op(pool3, name="conv4_1", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv4_2 = conv_op(conv4_1, name="conv4_2", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv4_3 = conv_op(conv4_2, name="conv4_3", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    # 2*2 max pooling with stride s=2; output: 14*14*512
    pool4 = maxpool_op(conv4_3, name="pool4", kh=2, kw=2, dh=2, dw=2)

    # Conv block 5: three convolution layers and one max-pooling layer.
    # All three conv layers use 3*3 kernels, 512 kernels each, stride s=1; output: 14*14*512
    conv5_1 = conv_op(pool4, name="conv5_1", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv5_2 = conv_op(conv5_1, name="conv5_2", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    conv5_3 = conv_op(conv5_2, name="conv5_3", kh=3, kw=3, n_out=512, dh=1, dw=1, p=p)
    # 2*2 max pooling with stride s=2; output: 7*7*512
    pool5 = maxpool_op(conv5_3, name="pool5", kh=2, kw=2, dh=2, dw=2)

    # Blocks 6 and 7: fully connected layers with 4096 hidden units each, followed by dropout
    pool5_shape = pool5.get_shape().as_list()
    flattened_shape = pool5_shape[1] * pool5_shape[2] * pool5_shape[3]
    dense = tf.reshape(pool5, shape=[-1, flattened_shape], name="dense")  # flatten
    fc6 = fc_op(dense, name="fc6", n_out=4096, p=p)
    fc6_drop = tf.nn.dropout(fc6, keep_prob=keep_prob, name="fc6_drop")
    fc7 = fc_op(fc6_drop, name="fc7", n_out=4096, p=p)
    fc7_drop = tf.nn.dropout(fc7, keep_prob=keep_prob, name="fc7_drop")

    # Output layer with 1000 units for softmax classification
    # (relu=False: the logits must not be clipped by ReLU before the softmax)
    fc8 = fc_op(fc7_drop, name="fc8", n_out=1000, p=p, relu=False)
    softmax = tf.nn.softmax(fc8)
    prediction = tf.argmax(softmax, 1)
    return prediction, softmax, fc8, p
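To check that the graph builds and a forward pass runs, here is a minimal smoke-test sketch (not part of the original post; it assumes TensorFlow 1.x and feeds random data, with keep_prob=1.0 to disable dropout at inference):

    import numpy as np

    with tf.Graph().as_default():
        images = tf.placeholder(tf.float32, [None, 224, 224, 3], name="images")
        keep_prob = tf.placeholder(tf.float32, name="keep_prob")
        prediction, softmax, fc8, p = VGG_16(images, keep_prob)
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            batch = np.random.rand(2, 224, 224, 3).astype(np.float32)  # two fake images
            pred = sess.run(prediction, feed_dict={images: batch, keep_prob: 1.0})
            print(pred.shape)  # (2,) -- one predicted class index per image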
3. Implementing VGG-16 with TF-Slim

The same network in TF-Slim (tf.contrib.slim in TensorFlow 1.x), with a flatten inserted between pool5 and fc6 so the first fully connected layer sees a 2-D tensor:

import tensorflow as tf
slim = tf.contrib.slim

def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.flatten(net, scope='flatten5')  # flatten 7*7*512 before the FC layers
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
        return net
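Two slim features keep this version compact: slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3') expands into three consecutive slim.conv2d calls under the variable scopes conv3/conv3_1 through conv3/conv3_3, and slim.arg_scope makes the ReLU activation, the truncated-normal weight initializer, and the L2 regularizer the defaults for every conv2d and fully_connected call inside the block, so each layer only has to state what differs (for example, fc8 overrides activation_fn=None to emit raw logits).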