Caffe Study Notes: Training Log for a Custom Network on CIFAR-10
Reposted from:
http://blog.csdn.net/linj_m/article/details/49428601
This part of the experiments records the process of tuning the network, along with the results of each run. — Jeremy
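The post does not give the solver settings. For orientation, here is a minimal solver.prototxt sketch consistent with the logs below (training runs for 60,000 iterations); everything except max_iter is an assumption, loosely following Caffe's stock cifar10_quick solver:

```
# Hypothetical solver; only max_iter = 60000 is taken from the logs below.
net: "cifar10_custom_train_test.prototxt"  # assumed file name
test_iter: 100          # assumes a test batch size of 100 (10,000 test images)
test_interval: 1000
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
lr_policy: "fixed"
max_iter: 60000
display: 100
snapshot: 10000
snapshot_prefix: "cifar10_custom"
solver_mode: GPU
```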
Model 1

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| ip1 | output: 200 |
| ip2 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.801631
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.5287
- Test net output #1: loss = 1.47174 (* 1 = 1.47174 loss)
These results show that the current model performs poorly, reaching only 52.87% accuracy. Moreover, the large gap between the training loss and the test loss indicates overfitting.
In the next model, we add a fully connected (FC) layer to increase the number of learnable parameters.
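The post does not include the prototxt files themselves. For reference, here is a minimal sketch of how Model 1's stack maps onto Caffe layer definitions (data, SoftmaxWithLoss, and Accuracy layers omitted; the bottom/top blob names are my assumptions based on the table order):

```
# Model 1 body, following the table above.
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    kernel_size: 5
    stride: 1
    pad: 2
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  # Named relu1 in the table, but the type is Sigmoid at this stage.
  name: "relu1"
  type: "Sigmoid"
  bottom: "pool1"
  top: "pool1"
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "norm1"
  top: "ip1"
  inner_product_param { num_output: 200 }
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param { num_output: 10 }
}
```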
Model 2 (add an FC layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| ip1 | output: 200 |
| ip2 | output: 100 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.609769
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.5467
- Test net output #1: loss = 1.40912 (* 1 = 1.40912 loss)
Compared with Model 1, accuracy improves from 52.87% to 54.67%, and both the training loss and the test loss decrease. So next we increase the output counts of two FC layers to see whether accuracy improves further.
Model 3 (increase the output counts of two FC layers)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| ip1 | output: 400 |
| ip2 | output: 200 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.603354
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.5423
- Test net output #1: loss = 1.45162
The results of Model 3 show no improvement in accuracy, so simply increasing the FC layers' output counts is not an effective approach.
Therefore, in the next step I consider how the convolutional layers affect accuracy, and add a convolutional layer on top of Model 3 (wired in as shown in the sketch below).
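Adding a layer in Caffe means splicing it into the bottom/top chain. A hedged sketch of the new conv2 as it might appear in Model 4's prototxt (blob names are assumptions based on the table order):

```
# New second convolution, inserted after norm1; ip1's bottom would
# change from "norm1" to "conv2" accordingly.
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 16
    kernel_size: 5
    stride: 1
    pad: 2
  }
}
```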
Model 4 (add a convolutional layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| ip1 | output: 400 |
| ip2 | output: 200 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.690654
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.5815
- Test net output #1: loss = 1.24061
Compared with Model 3, adding the convolutional layer raised the training loss by 0.0873 (0.603354 → 0.690654) while lowering the test loss by 0.21101 (1.45162 → 1.24061). Overall this is a good sign: it reduces overfitting to some extent.
In the next model, we add a max pooling layer after the second convolutional layer and reduce the FC layers' output counts.
Model 5 (add a max pooling layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| ip1 | output: 200 |
| ip2 | output: 100 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.73147
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.6335
- Test net output #1: loss = 1.06225
After adding the pooling layer, the gap between the training loss and the test loss shrinks further, so overfitting is reduced, but accuracy is still unsatisfactory. In the next step I therefore add a Sigmoid layer after the new pooling layer.
Model 6 (add a Sigmoid layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | Sigmoid |
| ip1 | output: 200 |
| ip2 | output: 100 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 2.29934
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.1
- Test net output #1: loss = 2.30292
Clearly, adding the Sigmoid layer does not work in this model: the loss sits at about 2.30 ≈ ln(10) and accuracy is exactly 10%, so the network never does better than random guessing over the 10 classes; the stacked sigmoids saturate and block the gradient, and training stalls.
Model 7 (replace the new Sigmoid with a ReLU layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | Sigmoid |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | ReLU |
| ip1 | output: 200 |
| ip2 | output: 100 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.620338
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.6391
- Test net output #1: loss = 1.05354

With ReLU in place of the added Sigmoid, training proceeds normally and accuracy reaches 63.91%. Next, we replace the remaining Sigmoid activation as well.
Model 8 (replace all activations with ReLU layers)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | ReLU |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | ReLU |
| ip1 | output: 200 |
| ip2 | output: 100 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.416507
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.6794
- Test net output #1: loss = 1.15119
With a ReLU layer after each of the two convolutional layers, accuracy improves considerably, to 67.94%.
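For reference, the swap is a one-word type change in the prototxt. A minimal sketch of the second activation as used from Model 8 onward (blob names assumed):

```
# In-place ReLU: top == bottom overwrites the blob, saving memory.
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "pool2"
  top: "pool2"
}
```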
Model 9 (add a Dropout layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | ReLU |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | ReLU |
| ip1 | output: 200 |
| dropout | Dropout, ratio: 0.5 |
| ip2 | output: 100 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.563472
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.6728
- Test net output #1: loss = 1.03333
The results show that although the Dropout layer does not raise accuracy, it does reduce overfitting (the test loss falls from 1.15119 to 1.03333). So next we increase the FC layers' output counts; the Dropout layer itself is sketched below.
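A minimal sketch of how such a Dropout layer is declared in Caffe (placed after ip1 per the table; blob names are assumptions):

```
# Dropout after ip1, applied in place; active only during training.
layer {
  name: "dropout"
  type: "Dropout"
  bottom: "ip1"
  top: "ip1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
```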
Model 10 (increase the FC layers' output counts)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | ReLU |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | ReLU |
| ip1 | output: 400 |
| dropout | Dropout, ratio: 0.5 |
| ip2 | output: 150 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.446714
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.6903
- Test net output #1: loss = 0.990431
Model 11 (add another Dropout layer)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | ReLU |
| norm1 | LRN |
| conv2 | output: 16, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | ReLU |
| ip1 | output: 400 |
| dropout1 | Dropout, ratio: 0.5 |
| ip2 | output: 200 |
| dropout2 | Dropout, ratio: 0.5 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.586936
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.7013
- Test net output #1: loss = 0.92605

With the second Dropout layer, accuracy reaches 70.13% and the test loss falls further. Finally, we increase the convolutional layers' output counts.
Model 12 (adjust the convolutional layers' outputs)

| Layer | Parameters |
| --- | --- |
| conv1 | output: 48, kernel: 5, stride: 1, pad: 2 |
| pool1 | pool: MAX, kernel: 3, stride: 2 |
| relu1 | ReLU |
| norm1 | LRN |
| conv2 | output: 32, kernel: 5, stride: 1, pad: 2 |
| pool2 | pool: MAX, kernel: 3, stride: 2 |
| relu2 | ReLU |
| ip1 | output: 400 |
| dropout1 | Dropout, ratio: 0.5 |
| ip2 | output: 200 |
| dropout2 | Dropout, ratio: 0.5 |
| ip3 | output: 10 |
Experimental results:
- Iteration 60000, loss = 0.273988
- Iteration 60000, Testing net (#0)
- Test net output #0: accuracy = 0.7088
- Test net output #1: loss = 1.1117
Summary
Across the twelve models, test accuracy rose from 52.87% (Model 1) to 70.88% (Model 12). The largest gains came from adding a second convolution-plus-pooling stage, replacing the Sigmoid activations with ReLU, and adding Dropout after the FC layers. Note that Model 12's training loss (0.273988) is again far below its test loss (1.1117), so some overfitting remains.