FROM: http://blog.csdn.net/u012162613/article/details/43225445
DeepLearning tutorial (4) CNN Convolutional Neural Networks: Introduction to the Principles + Detailed Code Walkthrough
@author:wepon
@blog:http://blog.csdn.net/u012162613/article/details/43225445
This article introduces the convolutional neural network (CNN) algorithm, with a particular focus on a detailed walkthrough of its implementation in Python/Theano. The code comes from: Convolutional Neural Networks (LeNet). Both a fully commented version of the code and the original code are available for download from my GitHub.
1. A Brief Introduction to How CNNs Work
Explaining convolutional neural networks properly would take a long article in itself, and many existing blog posts already do it well, so this post will not repeat them. If you already know CNNs, read on; the focus here is a detailed walkthrough of a CNN implementation. If you have never studied CNNs, I recommend Zhou Xiaoyi's post Deep Learning(深度学习)学习笔记整理系列之(七), as well as the UFLDL sections on convolutional feature extraction and pooling.
The defining features of a CNN are sparse connectivity (local receptive fields) and weight sharing, as shown in the two figures below: the left one illustrates sparse connectivity, the right one weight sharing. Together they reduce the number of parameters to be trained and therefore the computational complexity.
[Figure: left, sparse connectivity; right, weight sharing]
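To make the savings concrete, here is a back-of-the-envelope comparison (a plain-Python sketch with hypothetical sizes chosen for illustration, not taken from the original post):

# Hypothetical sizes: a 32x32 input layer mapped to a 28x28 feature map.
n_in = 32 * 32                    # input units
n_out = 28 * 28                   # output units
fully_connected = n_in * n_out    # every output sees every input: 802,816 weights
conv_weight_sharing = 5 * 5       # one shared 5x5 kernel: 25 weights (plus 1 bias)
print('%d weights fully connected vs. %d with a shared kernel'
      % (fully_connected, conv_weight_sharing))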
As for the structure of a CNN, the classic LeNet5 serves as the example:
This figure is everywhere; whenever CNNs come up, LeNet5 gets mentioned. It comes from the paper Gradient-Based Learning Applied to Document Recognition. The paper is long; the LeNet5 architecture is described starting around page 7, and that part is well worth reading.
Briefly: reading the LeNet5 figure from left to right, the first stage is the input layer, i.e. the input image. From the input layer to C1 is a convolutional layer (a convolution operation), and from C1 to S2 is a subsampling layer (a pooling operation); for the concrete steps of convolution and subsampling, see the illustration in the original post.
Then S2 to C3 is another convolution, and C3 to S4 another subsampling. Notice that convolution and subsampling come in pairs: a convolution is typically followed by a subsampling. S4 to C5 is fully connected, which amounts to the hidden layer of an MLP (if MLPs are unfamiliar, see 《DeepLearning tutorial(3)MLP多層感知機原理簡介+代碼詳解》). C5 to F6 is again fully connected, another MLP-style hidden layer. Finally, F6 to the output is simply a classifier, so this layer is called the classification layer.
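Before diving into the code, it helps to trace the feature-map sizes layer by layer. The short sketch below (mine, not part of the tutorial code) does this for the 28x28 MNIST input, 5x5 kernels, and 2x2 pooling that the code in section 2 uses:

def trace_shapes(size=28, n_pairs=2, kernel=5, pool=2):
    # each conv+pool pair: a 'valid' convolution shrinks the map, pooling halves it
    for i in range(n_pairs):
        size = size - kernel + 1   # valid convolution: 28 -> 24, then 12 -> 8
        size = size // pool        # non-overlapping 2x2 max-pooling: 24 -> 12, then 8 -> 4
        print('after conv+pool pair %d: %dx%d' % (i + 1, size, size))

trace_shapes()   # ends at 4x4, which is why the hidden layer below uses n_in = nkerns[1]*4*4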
OK, that is roughly the basic structure of a CNN: input, convolutional layers, subsampling layers, fully connected layers, a classification layer, and output. For a concrete application or problem, you decide how many convolutional and subsampling layers to use and which classifier to adopt. Once the structure is fixed, how are the connection weights between layers learned? The usual approach is forward propagation (FP) plus backpropagation (BP); see the links given above for details.
2. A Detailed Walkthrough of the CNN Code (Python + Theano)
The code comes from the deep learning tutorial Convolutional Neural Networks (LeNet) and implements a simplified LeNet5, specifically:
- It does not implement the location-specific gain and bias parameters.
- It uses max-pooling rather than average pooling.
- The classifier is softmax, whereas LeNet5 uses an RBF classifier.
- The second convolutional layer in LeNet5 is not fully connected to the first; this program makes it fully connected.
In addition, the code merges the convolutional layer and the subsampling layer into one class, "LeNetConvPoolLayer" (convolution + pooling layer), which is natural since they always appear in pairs. One thing to note, though: the code feeds the convolution output directly into the subsampling step without adding a bias b and mapping through a sigmoid. That is, the bx and the sigmoid mapping that follow fx in the subsampling diagram of the original post are dropped, and Cx is obtained directly from fx; a schematic of the difference follows below.
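Schematically, the difference can be written as follows (pseudocode with schematic names w, b, pool, conv; these are not identifiers from the code):

# Classic LeNet5 subsampling stage (schematic):
#     S = sigmoid(w * pool(C) + b)       # trainable gain w and bias b per feature map
# This code's LeNetConvPoolLayer (defined below):
#     output = tanh(pool(conv(x)) + b)   # pooled result goes straight into tanh, bias only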
Finally, the first convolutional layer in the code uses 20 kernels and the second uses 50, rather than the 6 and 16 shown in the LeNet5 figure above.
With that background, let's look at the code:
(1) Import the necessary modules
import cPickle
import gzip
import os
import sys
import time

import numpy

import theano
import theano.tensor as T
from theano.tensor.signal import downsample
from theano.tensor.nnet import conv
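A note for anyone running this today: the code is written for Python 2 and an old Theano release. On newer Theano versions the downsample and conv modules above are deprecated; if the imports fail, the replacements are approximately the following (check against your installed version; later versions also rename the pooling function's ds argument to ws):

# Newer-Theano equivalents of the deprecated imports (verify on your version):
from theano.tensor.signal.pool import pool_2d   # replaces downsample.max_pool_2d
from theano.tensor.nnet import conv2d           # replaces conv.conv2d
# On Python 3, cPickle and urllib.urlretrieve become:
#   import pickle as cPickle
#   from urllib.request import urlretrieve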
(2) Define the CNN's basic building blocks
The CNN's basic building blocks are the convolution+pooling layer, the hidden layer, and the classifier, as follows:
- Define LeNetConvPoolLayer (convolution + pooling layer)
See the code comments:
class LeNetConvPoolLayer(object):
    """Convolution + max-pooling layer, the pair that always appears together."""

    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        rng:          numpy.random.RandomState, used to initialize W
        input:        4D tensor with the shape given by image_shape
        filter_shape: (num filters, num input feature maps, filter height, filter width)
        image_shape:  (batch size, num input feature maps, image height, image width)
        poolsize:     the downsampling (pooling) factor (#rows, #cols)
        """
        # the number of input feature maps must agree between image and filters
        assert image_shape[1] == filter_shape[1]
        self.input = input

        # each hidden unit receives fan_in = num input feature maps * filter height * filter width inputs
        fan_in = numpy.prod(filter_shape[1:])
        # each unit in the lower layer receives gradients from:
        # num output feature maps * filter height * filter width / pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                   numpy.prod(poolsize))
        # initialize W with values drawn uniformly from [-W_bound, W_bound]
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            numpy.asarray(
                rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX
            ),
            borrow=True
        )

        # one bias per output feature map, initialized to zero
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve the input feature maps with the filters
        conv_out = conv.conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            image_shape=image_shape
        )

        # downsample each feature map individually using max-pooling
        pooled_out = downsample.max_pool_2d(
            input=conv_out,
            ds=poolsize,
            ignore_border=True
        )

        # add the bias (broadcast over the batch and spatial dimensions) and
        # apply tanh; note there is no extra gain/sigmoid stage as in LeNet5
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store the parameters of this layer
        self.params = [self.W, self.b]
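As a quick sanity check, the layer can be instantiated on a symbolic 4D input like this (a hypothetical usage sketch; the shapes follow the (batch, channels, height, width) convention the class expects):

rng = numpy.random.RandomState(1234)
x = T.tensor4('x')                    # (batch, channels, height, width)
layer = LeNetConvPoolLayer(
    rng,
    input=x,
    image_shape=(500, 1, 28, 28),     # 500 single-channel 28x28 images
    filter_shape=(20, 1, 5, 5),       # 20 kernels of size 5x5
    poolsize=(2, 2)
)
# layer.output is a (500, 20, 12, 12) tensor: conv gives 24x24, pooling gives 12x12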
- Define HiddenLayer (the hidden layer). This is identical to the HiddenLayer in the previous article 《DeepLearning tutorial(3)MLP多層感知機原理簡介+代碼詳解》, so it is taken over directly:
class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                 activation=T.tanh):
        """Fully connected hidden layer: output = activation(input.W + b)."""
        self.input = input

        # W is initialized uniformly from [-sqrt(6/(n_in+n_out)), sqrt(6/(n_in+n_out))],
        # the interval suggested by Glorot & Bengio for tanh units; for sigmoid
        # activations the interval is scaled by 4
        if W is None:
            W_values = numpy.asarray(
                rng.uniform(
                    low=-numpy.sqrt(6. / (n_in + n_out)),
                    high=numpy.sqrt(6. / (n_in + n_out)),
                    size=(n_in, n_out)
                ),
                dtype=theano.config.floatX
            )
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4
            W = theano.shared(value=W_values, name='W', borrow=True)

        if b is None:
            b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)

        self.W = W
        self.b = b

        # linear output, passed through the activation unless activation is None
        lin_output = T.dot(input, self.W) + self.b
        self.output = (
            lin_output if activation is None
            else activation(lin_output)
        )
        # parameters of this layer
        self.params = [self.W, self.b]
- Define the classifier, using softmax. This is the same as the LogisticRegression in 《DeepLearning tutorial(1)Softmax回歸原理簡介+代碼詳解》, so it is taken over directly:
class LogisticRegression(object):
    def __init__(self, input, n_in, n_out):
        # initialize the weight matrix W (n_in x n_out) with zeros
        self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )
        # initialize the bias vector b (n_out) with zeros
        self.b = theano.shared(
            value=numpy.zeros(
                (n_out,),
                dtype=theano.config.floatX
            ),
            name='b',
            borrow=True
        )

        # class-membership probabilities: softmax(input.W + b)
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

        # predicted class: the one with the highest probability
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)

        # parameters of the model
        self.params = [self.W, self.b]
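The snippet above shows only __init__, but evaluate_lenet5 below also calls negative_log_likelihood (the training cost) and errors (the zero-one error on a minibatch). In the Theano tutorial these are methods of LogisticRegression, essentially as follows:

    def negative_log_likelihood(self, y):
        # mean negative log-probability of the correct label under the softmax
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

    def errors(self, y):
        # fraction of misclassified examples in the minibatch
        if y.ndim != self.y_pred.ndim:
            raise TypeError('y should have the same shape as self.y_pred')
        if y.dtype.startswith('int'):
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()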
At this point all of the CNN's basic building blocks are in place. Next they are assembled into LeNet5 (the simplified version described above), namely: LeNet5 = input + LeNetConvPoolLayer_1 + LeNetConvPoolLayer_2 + HiddenLayer + LogisticRegression + output.
The model is then applied to the MNIST dataset and trained with the BP algorithm to obtain the optimal parameters.
(3) Load the MNIST dataset (mnist.pkl.gz)
def load_data(dataset):
    """Load the MNIST dataset; download it first if it is not found locally."""
    data_dir, data_file = os.path.split(dataset)
    if data_dir == "" and not os.path.isfile(dataset):
        # check whether the dataset is in the ../data directory
        new_path = os.path.join(
            os.path.split(__file__)[0],
            "..",
            "data",
            dataset
        )
        if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
            dataset = new_path

    if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
        import urllib
        origin = (
            'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
        )
        print 'Downloading data from %s' % origin
        urllib.urlretrieve(origin, dataset)

    print '... loading data'

    # train_set, valid_set, test_set are each an (input, target) pair
    f = gzip.open(dataset, 'rb')
    train_set, valid_set, test_set = cPickle.load(f)
    f.close()

    def shared_dataset(data_xy, borrow=True):
        """Store the dataset in shared variables so Theano can copy it to the GPU."""
        data_x, data_y = data_xy
        shared_x = theano.shared(numpy.asarray(data_x,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        shared_y = theano.shared(numpy.asarray(data_y,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        # the labels are needed as ints, so cast the float shared variable
        return shared_x, T.cast(shared_y, 'int32')

    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)

    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
            (test_set_x, test_set_y)]
    return rval
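For orientation, calling it looks like this (a usage sketch; the split sizes are those of the standard mnist.pkl.gz):

datasets = load_data('mnist.pkl.gz')
train_set_x, train_set_y = datasets[0]   # 50,000 training examples
valid_set_x, valid_set_y = datasets[1]   # 10,000 validation examples
test_set_x, test_set_y = datasets[2]     # 10,000 test examples
# each image is a flattened 784-dim vector (28x28), stored in a Theano shared variable
print(train_set_x.get_value(borrow=True).shape)   # (50000, 784)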
(4) Implement and test LeNet5
def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
                    dataset='mnist.pkl.gz',
                    nkerns=[20, 50], batch_size=500):
    """Train the simplified LeNet5 on MNIST.

    learning_rate: step size for stochastic gradient descent
    n_epochs:      maximum number of passes over the training set
    nkerns:        number of kernels in each convolutional layer
    """
    rng = numpy.random.RandomState(23455)

    # load the data and unpack the train/valid/test sets
    datasets = load_data(dataset)
    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute the number of minibatches in each set
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches /= batch_size
    n_valid_batches /= batch_size
    n_test_batches /= batch_size

    index = T.lscalar()  # index of a minibatch
    x = T.matrix('x')    # the rasterized images
    y = T.ivector('y')   # the labels, as integers

    print '... building the model'

    # reshape the (batch_size, 784) input into a 4D tensor of 28x28 images
    layer0_input = x.reshape((batch_size, 1, 28, 28))

    # first conv+pool layer:
    # conv: (28-5+1, 28-5+1) = (24, 24); pool: (24/2, 24/2) = (12, 12)
    layer0 = LeNetConvPoolLayer(
        rng,
        input=layer0_input,
        image_shape=(batch_size, 1, 28, 28),
        filter_shape=(nkerns[0], 1, 5, 5),
        poolsize=(2, 2)
    )

    # second conv+pool layer:
    # conv: (12-5+1, 12-5+1) = (8, 8); pool: (8/2, 8/2) = (4, 4)
    layer1 = LeNetConvPoolLayer(
        rng,
        input=layer0.output,
        image_shape=(batch_size, nkerns[0], 12, 12),
        filter_shape=(nkerns[1], nkerns[0], 5, 5),
        poolsize=(2, 2)
    )

    # fully connected hidden layer: flatten to (batch_size, nkerns[1]*4*4)
    layer2_input = layer1.output.flatten(2)
    layer2 = HiddenLayer(
        rng,
        input=layer2_input,
        n_in=nkerns[1] * 4 * 4,
        n_out=500,
        activation=T.tanh
    )

    # classify with softmax
    layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

    # the cost we minimize: the negative log-likelihood
    cost = layer3.negative_log_likelihood(y)

    # functions computing the errors on a test/validation minibatch
    test_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    validate_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    # all trainable parameters, their gradients, and the SGD updates
    params = layer3.params + layer2.params + layer1.params + layer0.params
    grads = T.grad(cost, params)
    updates = [
        (param_i, param_i - learning_rate * grad_i)
        for param_i, grad_i in zip(params, grads)
    ]
    train_model = theano.function(
        [index],
        cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    print '... training'
    patience = 10000              # look at this many minibatches regardless
    patience_increase = 2         # how much longer to wait after an improvement
    improvement_threshold = 0.995 # a relative improvement this large counts
    validation_frequency = min(n_train_batches, patience / 2)

    best_validation_loss = numpy.inf
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):
            iter = (epoch - 1) * n_train_batches + minibatch_index
            if iter % 100 == 0:
                print 'training @ iter = ', iter
            cost_ij = train_model(minibatch_index)

            if (iter + 1) % validation_frequency == 0:
                # compute the zero-one loss on the validation set
                validation_losses = [validate_model(i) for i
                                     in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                if this_validation_loss < best_validation_loss:
                    # raise patience if the improvement is good enough
                    if this_validation_loss < best_validation_loss * \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)
                    best_validation_loss = this_validation_loss
                    best_iter = iter
                    # also evaluate on the test set
                    test_losses = [
                        test_model(i)
                        for i in xrange(n_test_batches)
                    ]
                    test_score = numpy.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of '
                           'best model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = time.clock()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i, '
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print >> sys.stderr, ('The code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm' % ((end_time - start_time) / 60.))
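To run the whole experiment end to end, the original tutorial script simply calls the function from a main guard:

if __name__ == '__main__':
    evaluate_lenet5()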
That concludes the article. Both the fully commented code and the original code are available for download from my GitHub.
If you spot any mistakes, or anything is unclear, feel free to leave a comment.