This post works through a basic application of deep learning, the Convolutional Neural Network (CNN), drawing on LeCun's tutorial document (v0.1) with some extensions, and shows the results (in Python).
It is organized into the following parts:
1. Convolution
2. Pooling (downsampling)
3. CNN structure
4. Running the experiment
Each part is introduced below.
PS: this post is reference material for the ESE machine-learning short course (session of 2014-05-16). It only sketches the most naive, simplest ideas and focuses on the hands-on part; the theory is covered in detail in class.
1. Convolution
Much like a Gaussian convolution, we convolve every image in an image batch. For a single image, all of its feature maps are convolved by one filter into one output feature map. In the code below, an image batch containing two images is processed; each image starts with 3 feature maps (R, G, B) and is convolved with two 9x9 filters, so each image ends up with two feature maps.
The convolution is done by Theano's conv.conv2d; here we use random parameters W and b. The results look a bit like an edge detector, don't they?
Code (see the comments for details):
```python
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano

rng = numpy.random.RandomState(23455)

# symbolic 4D tensor: (batch size, channels, height, width)
input = T.tensor4(name='input')

# 2 filters over 3 input channels (R, G, B), 9x9 kernels;
# random init scaled by the filter fan-in
w_shape = (2, 3, 9, 9)
w_bound = numpy.sqrt(3 * 9 * 9)
W = theano.shared(numpy.asarray(rng.uniform(low=-1.0/w_bound, high=1.0/w_bound, size=w_shape),
                                dtype=input.dtype), name='W')

b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low=-.5, high=.5, size=b_shape),
                                dtype=input.dtype), name='b')

conv_out = conv.conv2d(input, W)

# add one bias per output feature map, then squash with a sigmoid
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))
f = theano.function([input], output)

import pylab
from PIL import Image

# load two images and rescale pixel values to [0, 1)
img1 = Image.open(open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg'))
width1, height1 = img1.size
img1 = numpy.asarray(img1, dtype='float32') / 256.

# (height, width, 3) -> (1, 3, height, width)
img1_rgb = img1.swapaxes(0, 2).swapaxes(1, 2).reshape(1, 3, height1, width1)

img2 = Image.open(open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel1.jpg'))
width2, height2 = img2.size
img2 = numpy.asarray(img2, dtype='float32') / 256.
img2_rgb = img2.swapaxes(0, 2).swapaxes(1, 2).reshape(1, 3, height2, width2)

# stack the two images into one minibatch and run the convolution
minibatch_img = numpy.concatenate((img1_rgb, img2_rgb), axis=0)
filtered_img = f(minibatch_img)

# originals in the first column, the two feature maps of each image beside them
pylab.subplot(2, 3, 1); pylab.axis('off')
pylab.imshow(img1)
pylab.subplot(2, 3, 4); pylab.axis('off')
pylab.imshow(img2)

pylab.gray()
pylab.subplot(2, 3, 2); pylab.axis('off')
pylab.imshow(filtered_img[0, 0, :, :])
pylab.subplot(2, 3, 3); pylab.axis('off')
pylab.imshow(filtered_img[0, 1, :, :])
pylab.subplot(2, 3, 5); pylab.axis('off')
pylab.imshow(filtered_img[1, 0, :, :])
pylab.subplot(2, 3, 6); pylab.axis('off')
pylab.imshow(filtered_img[1, 1, :, :])
pylab.show()
```
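As a quick sanity check on the shapes (a small made-up example, not part of the original post): conv.conv2d defaults to a 'valid' convolution, so each 9x9 filter shrinks an HxW input to (H-8)x(W-8).

```python
import numpy

# hypothetical input size, chosen only for illustration
H, W_in = 120, 160
batch = numpy.random.rand(2, 3, H, W_in).astype('float32')

# with 2 filters over a batch of 2 images, f returns shape
# (2, 2, H-8, W-8): one shrunken map per filter per image
out = f(batch)
print(out.shape)   # (2, 2, 112, 152)
```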
2. Pooling (downsampling)
Max pooling is the most commonly used. It addresses two problems:
1. It reduces the amount of computation.
2. It provides a degree of rotation invariance (work out the reason for yourself).
PS: on rotation invariance, recall SIFT and LBP, which use a dominant orientation, and HOG, which uses templates at different orientations.
Max pooling's downsampling halves both the height and the width of each feature map. (This is not visible in the result figure below because Python automatically stretches the images to the same display size, but the pixel count really is halved.)
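Before the full Theano example, a minimal NumPy sketch (on a made-up 4x4 feature map, not part of the original code) shows what non-overlapping 2x2 max pooling does:

```python
import numpy

# a toy 4x4 feature map; 2x2 max pooling keeps the largest value in
# each non-overlapping 2x2 block, halving each spatial dimension
fmap = numpy.array([[1, 2, 0, 1],
                    [3, 0, 1, 2],
                    [0, 1, 4, 0],
                    [2, 2, 0, 3]], dtype='float32')

h, w = fmap.shape
pooled = fmap.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
print(pooled)
# [[ 3.  2.]
#  [ 2.  4.]]
```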
Code (see the comments for details):
```python
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano

rng = numpy.random.RandomState(23455)

# symbolic 4D tensor: (batch size, channels, height, width)
input = T.tensor4(name='input')

# 2 filters over 3 input channels, 9x9 kernels, fan-in scaled init
w_shape = (2, 3, 9, 9)
w_bound = numpy.sqrt(3 * 9 * 9)
W = theano.shared(numpy.asarray(rng.uniform(low=-1.0/w_bound, high=1.0/w_bound, size=w_shape),
                                dtype=input.dtype), name='W')

b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low=-.5, high=.5, size=b_shape),
                                dtype=input.dtype), name='b')

conv_out = conv.conv2d(input, W)

output = T.nnet.sigmoid(conv_out + b.dimshuffle('x', 0, 'x', 'x'))
f = theano.function([input], output)

import pylab
from PIL import Image
from matplotlib.pyplot import *

# load one image and rescale pixel values to [0, 1)
img = Image.open(open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg'))
width, height = img.size
img = numpy.asarray(img, dtype='float32') / 256.

# (height, width, 3) -> (1, 3, height, width): a minibatch of one image
img_rgb = img.swapaxes(0, 2).swapaxes(1, 2)
minibatch_img = img_rgb.reshape(1, 3, height, width)
filtered_img = f(minibatch_img)

# original image and the two convolved feature maps
pylab.figure(1)
pylab.subplot(1, 3, 1); pylab.axis('off')
pylab.imshow(img)
title('origin image')

pylab.gray()
pylab.subplot(2, 3, 2); pylab.axis('off')
pylab.imshow(filtered_img[0, 0, :, :])
title('convolution 1')

pylab.subplot(2, 3, 3); pylab.axis('off')
pylab.imshow(filtered_img[0, 1, :, :])
title('convolution 2')

# 2x2 max pooling of the convolved feature maps
from theano.tensor.signal import downsample

input = T.tensor4('input')
maxpool_shape = (2, 2)
pooled_img = downsample.max_pool_2d(input, maxpool_shape, ignore_border=False)

maxpool = theano.function(inputs=[input],
                          outputs=[pooled_img])

pooled_res = numpy.squeeze(maxpool(filtered_img))

pylab.subplot(235); pylab.axis('off')
pylab.imshow(pooled_res[0, :, :])
title('down sampled 1')

pylab.subplot(236); pylab.axis('off')
pylab.imshow(pooled_res[1, :, :])
title('down sampled 2')

pylab.show()
```
3. CNN structure
If you Google for CNN diagrams you will find them everywhere; here I dug out a figure from back when I was learning CNNs myself, which I think is fairly easy to follow when paired with the explanation.
Without further ado, here is the LeNet structure diagram (read it from bottom to top, following the arrows; the bottom layer is the original input):
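To make the figure concrete, here is the shape arithmetic for the LeNet variant used in the next section (assuming 5x5 'valid' convolutions and 2x2 max pooling, which matches the code below):

```python
def conv_pool_out(side, filter_side=5, pool=2):
    """Output side length after one 'valid' conv + 2x2 max-pool stage."""
    return (side - filter_side + 1) // pool

s0 = conv_pool_out(28)   # 28 -> 24 (conv) -> 12 (pool)
s1 = conv_pool_out(s0)   # 12 ->  8 (conv) ->  4 (pool)
print(s0, s1)            # 12 4, which is why n_in = 50*4*4 below
```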
4. CNN code
Download it from the resources section; I have uploaded it there (in Python).
Only a small part of the code is shown here, just the part that builds the NN model:
```python
rng = numpy.random.RandomState(23455)

# layer0: convolution + max pooling on the raw 28x28 MNIST images
N_filters_0 = 20
D_features_0 = 1
layer0_input = x.reshape((batch_size, D_features_0, 28, 28))
layer0 = LeNetConvPoolLayer(rng, input=layer0_input,
                            filter_shape=(N_filters_0, D_features_0, 5, 5),
                            image_shape=(batch_size, 1, 28, 28))
# 5x5 'valid' conv: 28 -> 24; 2x2 pooling: 24 -> 12

# layer1: second convolution + max pooling stage
N_filters_1 = 50
D_features_1 = N_filters_0
layer1 = LeNetConvPoolLayer(rng, input=layer0.output,
                            filter_shape=(N_filters_1, D_features_1, 5, 5),
                            image_shape=(batch_size, N_filters_0, 12, 12))
# 12 -> 8 -> 4, so layer1 outputs 50 feature maps of size 4x4

# layer2: fully connected hidden layer on the flattened feature maps
layer2_input = layer1.output.flatten(2)
layer2 = HiddenLayer(rng, layer2_input, n_in=50*4*4, n_out=500, activation=T.tanh)

# layer3: softmax classifier over the 10 digit classes
layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)
```
layer0, layer1: each is a convolution + downsampling stage.
layer2 + layer3: together they form an MLP (ANN).
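LeNetConvPoolLayer, HiddenLayer, and LogisticRegression come from the deeplearning.net LeNet tutorial that this post follows. For readers who have not downloaded the resource yet, here is a minimal sketch of such a conv + pool layer; the initialization bound and other details are assumptions, not the exact downloadable code:

```python
import numpy, theano
import theano.tensor as T
from theano.tensor.nnet import conv
from theano.tensor.signal import downsample

class LeNetConvPoolLayer(object):
    """Minimal sketch of a convolution + 2x2 max-pooling layer, in the
    style of the deeplearning.net LeNet tutorial (details assumed)."""
    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        # scale the random init by the filter fan-in (an assumption;
        # the tutorial uses a slightly different bound)
        fan_in = numpy.prod(filter_shape[1:])
        W_bound = numpy.sqrt(6. / fan_in)
        self.W = theano.shared(numpy.asarray(
            rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
            dtype=theano.config.floatX), name='W')
        self.b = theano.shared(numpy.zeros((filter_shape[0],),
                                           dtype=theano.config.floatX), name='b')
        # convolve, then downsample each resulting feature map
        conv_out = conv.conv2d(input, self.W, filter_shape=filter_shape,
                               image_shape=image_shape)
        pooled_out = downsample.max_pool_2d(conv_out, poolsize, ignore_border=True)
        # one bias per feature map, then a tanh nonlinearity
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
        self.params = [self.W, self.b]
```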
Training the model:
```python
cost = layer3.negative_log_likelihood(y)
params = layer3.params + layer2.params + layer1.params + layer0.params
gparams = T.grad(cost, params)

# plain SGD: move each parameter against its gradient
updates = []
for par, gpar in zip(params, gparams):
    updates.append((par, par - learning_rate * gpar))

train_model = theano.function(inputs=[minibatch_index],
                              outputs=[cost],
                              updates=updates,
                              givens={x: train_set_x[minibatch_index * batch_size : (minibatch_index + 1) * batch_size],
                                      y: train_set_y[minibatch_index * batch_size : (minibatch_index + 1) * batch_size]})
```
The cost (the NLL output of the top MLP) drives the training of the parameters in every layer.
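A minimal driver loop (an assumption here; the downloadable code adds validation checks and early stopping) then simply calls train_model over all minibatch indices:

```python
# n_epochs is an illustrative assumption; the tutorial trains much longer
n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
n_epochs = 10

for epoch in range(n_epochs):
    for minibatch_index in range(n_train_batches):
        batch_cost = train_model(minibatch_index)  # returns [cost] for this batch
```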
The rest is covered by the code and its comments.
PS: the data is the full MNIST dataset.
Final result: Optimization complete. Best validation score of 0.990000 % obtained at iteration 122500, with test performance 0.950000 % (following the deeplearning.net tutorial's reporting, these percentages are error rates, i.e., roughly 99% accuracy).
from: http://blog.csdn.net/abcjennifer/article/details/25912675