Introduction to the Classic Network LeNet-5 and a Code Test (Caffe, MNIST, C++)
LeNet-5 contains 7 layers, as shown in the figure below; the input layer is not counted. The input image size is 32*32*1, and the network is trained and evaluated on grayscale images. The paper is "Gradient-Based Learning Applied to Document Recognition"; the original can be downloaded from http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf .
Layer 1 is a convolutional layer with 6 filters of size 5*5, stride 1, padding 0. The output is 28*28*6 (6 feature maps); trainable parameters: (5*5*1)*6+6 = 156 (weights + biases);
Layer 2 is an average pooling layer with a 2*2 filter, stride 2, padding 0. The output is 14*14*6 (6 feature maps); trainable parameters: 1*6+6 = 12 (coefficients + biases);
Layer 3 is a convolutional layer with 16 filters of size 5*5, stride 1, padding 0. The output is 10*10*16 (16 feature maps); trainable parameters (following the partial connection scheme given in the paper): (5*5*3+1)*6 + (5*5*4+1)*6 + (5*5*4+1)*3 + (5*5*6+1)*1 = 1516 (weights + biases);
Layer 4 is again an average pooling layer with a 2*2 filter, stride 2, padding 0. The output is 5*5*16 (16 feature maps); trainable parameters: 1*16+16 = 32 (coefficients + biases);
Layer 5 is a convolutional layer with 120 filters of size 5*5, stride 1. The output is 1*1*120 (120 feature maps); trainable parameters: (5*5*16)*120+120 = 48120 (weights + biases);
Layer 6 is a fully connected layer with 84 neurons; trainable parameters: 120*84+84 = 10164 (weights + biases). This layer uses the tanh activation function.
Layer 7 produces the final prediction y', which takes one of 10 values corresponding to the digits 0-9. In modern versions a softmax function outputs the 10 class scores, i.e. layer 7 is a softmax layer.
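The parameter counts above are simple arithmetic; as a minimal C++ sketch (not from the original post), they can be re-derived as follows, assuming the paper's partial C3 connection table:
#include <cstdio>

// Re-derive the per-layer trainable parameter counts listed above
// (weights + biases), using the paper's partial C3 connection table.
int main()
{
	int c1 = (5 * 5 * 1) * 6 + 6;                        // conv C1: 156
	int s2 = 1 * 6 + 6;                                  // pool S2: 12 (one coefficient + one bias per map)
	int c3 = (5 * 5 * 3 + 1) * 6 + (5 * 5 * 4 + 1) * 6 +
	         (5 * 5 * 4 + 1) * 3 + (5 * 5 * 6 + 1) * 1;  // conv C3: 1516
	int s4 = 1 * 16 + 16;                                // pool S4: 32
	int c5 = (5 * 5 * 16) * 120 + 120;                   // conv C5: 48120
	int f6 = 120 * 84 + 84;                              // fc F6: 10164
	printf("C1=%d S2=%d C3=%d S4=%d C5=%d F6=%d total=%d\n",
		c1, s2, c3, s4, c5, f6, c1 + s2 + c3 + s4 + c5 + f6); // total: 60000
	return 0;
}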
Why the input image size is 32*32 although the training set images are 28*28: so that potential distinctive features such as stroke end-points or corners can appear in the center of the receptive field of the highest-level feature detectors.
Why each feature map of layer 2 is not connected to every feature map of layer 3: (1) the incomplete connection scheme keeps the number of connections within reasonable bounds; (2) more importantly, it forces a break of symmetry in the network: because different feature maps receive different inputs, they are forced to extract different features. (The main reason is to break the symmetry in the network and keep the number of connections within reasonable bounds.)
The 5 in LeNet-5 refers to its 5 hidden layers: convolution, pooling, convolution, pooling, convolution.
Different filters extract different features, such as edges, lines and corners.
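As an illustration of that point (not from the original post), a single hand-crafted 3*3 Sobel kernel acts like one fixed "filter" that responds to vertical edges; learned convolution filters play the same role, but their weights come from training. A short OpenCV sketch, assuming a hypothetical input image digit.png:
#include <opencv2/opencv.hpp>

// Illustrative only: convolve an image with a fixed vertical-edge (Sobel)
// kernel to produce one "feature map", the hand-crafted analogue of a
// learned convolution filter.
int main()
{
	cv::Mat src = cv::imread("digit.png", 0); // hypothetical input image, loaded as grayscale
	if (src.empty()) return -1;

	cv::Mat kernel = (cv::Mat_<float>(3, 3) << -1, 0, 1,
	                                           -2, 0, 2,
	                                           -1, 0, 1); // vertical-edge kernel
	cv::Mat edges, edges8u;
	cv::filter2D(src, edges, CV_32F, kernel); // convolution: one feature map
	cv::convertScaleAbs(edges, edges8u);      // back to 8-bit for saving
	cv::imwrite("edges.png", edges8u);
	return 0;
}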
For a basic introduction to convolutional neural networks, see these earlier blog posts:
https://blog.csdn.net/fengbingchun/article/details/50529500
https://blog.csdn.net/fengbingchun/article/details/80262495
https://blog.csdn.net/fengbingchun/article/details/68065338
https://blog.csdn.net/fengbingchun/article/details/69001433
The following test code for the LeNet-5 network is modeled on the test code that ships with Caffe. It differs from the paper in these ways (the resulting feature-map shapes are worked out in the sketch right after this list):
(1). The paper requires a 32*32 input image; here it is 28*28;
(2). The first convolutional layer in the paper outputs 6 feature maps; here it outputs 20;
(3). The pooling layers in the paper take the average; here they take the maximum;
(4). The third layer (a convolutional layer) outputs 16 feature maps in the paper; here it outputs 50, and here every feature map of layer 2 is connected to every feature map of layer 3;
(5). The fifth layer in the paper is a convolutional layer; here it is a fully connected layer with 500 output neurons and a ReLU activation;
(6). The seventh layer in the paper is an RBF (Euclidean Radial Basis Function) layer; here a Softmax is used.
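As referenced above, here is a minimal sketch (not from the original post) of the shape arithmetic for this Caffe variant, using the standard formula out = (in - kernel + 2*padding) / stride + 1; all kernel sizes and map counts below come from the prototxt files shown later:
#include <cstdio>

// Feature-map shape arithmetic for the Caffe variant described above.
int conv_out(int in, int kernel, int stride = 1, int padding = 0)
{
	return (in - kernel + 2 * padding) / stride + 1;
}

int main()
{
	int s = 28;            // input is 28*28, not the paper's 32*32
	s = conv_out(s, 5);    // conv1, 20 maps: 24*24*20
	s = conv_out(s, 2, 2); // pool1 (max):    12*12*20
	s = conv_out(s, 5);    // conv2, 50 maps:  8*8*50
	s = conv_out(s, 2, 2); // pool2 (max):     4*4*50
	printf("flattened input to ip1: %d\n", s * s * 50); // 800 -> ip1 (500) -> ip2 (10)
	return 0;
}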
The test code (lenet-5.cpp) is as follows:
#include "funset.hpp"
#include "common.hpp"int lenet_5_mnist_train()
{
#ifdef CPU_ONLYcaffe::Caffe::set_mode(caffe::Caffe::CPU);
#elsecaffe::Caffe::set_mode(caffe::Caffe::GPU);
#endif#ifdef _MSC_VERconst std::string filename{ "E:/GitCode/Caffe_Test/test_data/Net/lenet-5_mnist_windows_solver.prototxt" };
#elseconst std::string filename{ "test_data/Net/lenet-5_mnist_linux_solver.prototxt" };
#endifcaffe::SolverParameter solver_param;if (!caffe::ReadProtoFromTextFile(filename.c_str(), &solver_param)) {fprintf(stderr, "parse solver.prototxt fail\n");return -1;}mnist_convert(); // convert MNIST to LMDBcaffe::SGDSolver<float> solver(solver_param);solver.Solve();fprintf(stdout, "train finish\n");return 0;
}int lenet_5_mnist_test()
{
#ifdef CPU_ONLYcaffe::Caffe::set_mode(caffe::Caffe::CPU);
#elsecaffe::Caffe::set_mode(caffe::Caffe::GPU);
#endif#ifdef _MSC_VERconst std::string param_file{ "E:/GitCode/Caffe_Test/test_data/Net/lenet-5_mnist_windows_test.prototxt" };const std::string trained_filename{ "E:/GitCode/Caffe_Test/test_data/Net/lenet-5_mnist_iter_10000.caffemodel" };const std::string image_path{ "E:/GitCode/Caffe_Test/test_data/images/handwritten_digits/" };
#elseconst std::string param_file{ "test_data/Net/lenet-5_mnist_linux_test.prototxt" };const std::string trained_filename{ "test_data/Net/lenet-5_mnist_iter_10000.caffemodel" };const std::string image_path{ "test_data/images/handwritten_digits/" };
#endifcaffe::Net<float> caffe_net(param_file, caffe::TEST);caffe_net.CopyTrainedLayersFrom(trained_filename);const boost::shared_ptr<caffe::Blob<float> > blob_data_layer = caffe_net.blob_by_name("data");int image_channel_data_layer = blob_data_layer->channels();int image_height_data_layer = blob_data_layer->height();int image_width_data_layer = blob_data_layer->width();const std::vector<caffe::Blob<float>*> output_blobs = caffe_net.output_blobs();int require_blob_index{ -1 };const int digit_category_num{ 10 };for (int i = 0; i < output_blobs.size(); ++i) {if (output_blobs[i]->count() == digit_category_num)require_blob_index = i;}if (require_blob_index == -1) {fprintf(stderr, "ouput blob don't match\n");return -1;}std::vector<int> target{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };std::vector<int> result;for (auto num : target) {std::string str = std::to_string(num);str += ".png";str = image_path + str;cv::Mat mat = cv::imread(str.c_str(), 1);if (!mat.data) {fprintf(stderr, "load image error: %s\n", str.c_str());return -1;}if (image_channel_data_layer == 1)cv::cvtColor(mat, mat, CV_BGR2GRAY);else if (image_channel_data_layer == 4)cv::cvtColor(mat, mat, CV_BGR2BGRA);cv::resize(mat, mat, cv::Size(image_width_data_layer, image_height_data_layer));cv::bitwise_not(mat, mat);boost::shared_ptr<caffe::MemoryDataLayer<float> > memory_data_layer =boost::static_pointer_cast<caffe::MemoryDataLayer<float>>(caffe_net.layer_by_name("data"));mat.convertTo(mat, CV_32FC1, 0.00390625);float dummy_label[1] {0};memory_data_layer->Reset((float*)(mat.data), dummy_label, 1);float loss{ 0.0 };const std::vector<caffe::Blob<float>*>& results = caffe_net.ForwardPrefilled(&loss);const float* output = results[require_blob_index]->cpu_data();float tmp{ -1 };int pos{ -1 };for (int j = 0; j < 10; j++) {//fprintf(stdout, "Probability to be Number %d is: %.3f\n", j, output[j]);if (tmp < output[j]) {pos = j;tmp = output[j];}}result.push_back(pos);}for (auto i = 0; i < 10; i++)fprintf(stdout, "actual digit is: %d, result digit is: %d\n", target[i], result[i]);fprintf(stdout, "predict finish\n");return 0;
}
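For completeness, a hypothetical driver (not part of the original post; funset.hpp is assumed to declare both routines) would train first, producing lenet-5_mnist_iter_10000.caffemodel, and then predict:
#include "funset.hpp"

// Hypothetical main(), assuming funset.hpp declares both test routines.
int main()
{
	if (lenet_5_mnist_train() != 0) return -1; // writes the .caffemodel snapshot
	if (lenet_5_mnist_test() != 0) return -1;  // loads it and classifies 0.png - 9.png
	return 0;
}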
The content of solver.prototxt is as follows:
# solver.prototxt is a configuration file that tells Caffe how to train the network
# every field name used in this file must exist in message SolverParameter of caffe.proto, otherwise parsing will fail
net: "test_data/Net/lenet-5_mnist_linux_train.prototxt" # file name of the training network
test_iter: 100 # test_iter * test batch_size = total number of test images
test_interval: 500 # run the test network once every test_interval training iterations
base_lr: 0.01 # base learning rate
lr_policy: "inv" # learning rate policy: return base_lr * (1 + gamma * iter) ^ (- power)
momentum: 0.9 # momentum
weight_decay: 0.0005 # weight decay
gamma: 0.0001 # parameter of the learning rate policy
power: 0.75 # parameter of the learning rate policy
display: 100 # print information such as the loss value every display training iterations
max_iter: 10000 # maximum number of training iterations
snapshot: 5000 # save an intermediate snapshot every snapshot training iterations
snapshot_prefix: "test_data/Net/lenet-5_mnist" # prefix of the snapshot file paths
solver_type: SGD # stochastic gradient descent
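The effect of the "inv" policy is easy to check numerically; a minimal sketch (not from the original post) evaluating base_lr * (1 + gamma * iter) ^ (- power) with this solver's values:
#include <cmath>
#include <cstdio>

// Evaluate the "inv" learning rate policy with this solver's values:
// lr(iter) = base_lr * (1 + gamma * iter) ^ (-power)
int main()
{
	const double base_lr = 0.01, gamma = 0.0001, power = 0.75;
	for (int iter : { 0, 500, 5000, 10000 })
		printf("iter %5d: lr = %f\n", iter, base_lr * std::pow(1.0 + gamma * iter, -power));
	return 0;
}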
The content of the prototxt file used for training is as follows:
name: "LeNet-5"layer {name: "mnist"type: "Data"top: "data"top: "label"include {phase: TRAIN}transform_param {scale: 0.00390625}data_param {source: "test_data/MNIST/train"batch_size: 64backend: LMDB}
}
layer {name: "mnist"type: "Data"top: "data"top: "label"include {phase: TEST}transform_param {scale: 0.00390625}data_param {source: "test_data/MNIST/test"batch_size: 100backend: LMDB}
}
layer {name: "conv1"type: "Convolution"bottom: "data"top: "conv1"param {lr_mult: 1}param {lr_mult: 2}convolution_param {num_output: 20kernel_size: 5stride: 1weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "pool1"type: "Pooling"bottom: "conv1"top: "pool1"pooling_param {pool: MAXkernel_size: 2stride: 2}
}
layer {name: "conv2"type: "Convolution"bottom: "pool1"top: "conv2"param {lr_mult: 1}param {lr_mult: 2}convolution_param {num_output: 50kernel_size: 5stride: 1weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "pool2"type: "Pooling"bottom: "conv2"top: "pool2"pooling_param {pool: MAXkernel_size: 2stride: 2}
}
layer {name: "ip1"type: "InnerProduct"bottom: "pool2"top: "ip1"param {lr_mult: 1}param {lr_mult: 2}inner_product_param {num_output: 500weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "relu1"type: "ReLU"bottom: "ip1"top: "ip1"
}
layer {name: "ip2"type: "InnerProduct"bottom: "ip1"top: "ip2"param {lr_mult: 1}param {lr_mult: 2}inner_product_param {num_output: 10weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "accuracy"type: "Accuracy"bottom: "ip2"bottom: "label"top: "accuracy"include {phase: TEST}
}
layer {name: "loss"type: "SoftmaxWithLoss"bottom: "ip2"bottom: "label"top: "loss"
}
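Unlike the paper's network (about 60,000 parameters, as computed earlier), this variant's parameter count follows directly from the prototxt above, since every input map connects to every output map; a small sketch (not from the original post):
#include <cstdio>

// Trainable parameter count (weights + biases) of the Caffe variant above.
int main()
{
	int conv1 = (5 * 5 * 1) * 20 + 20;  // 24*24*20 output -> 520
	int conv2 = (5 * 5 * 20) * 50 + 50; // 8*8*50 output -> 25050
	int ip1 = (4 * 4 * 50) * 500 + 500; // pool2 flattened to 800 -> 400500
	int ip2 = 500 * 10 + 10;            // 10 digit classes -> 5010
	printf("total trainable parameters: %d\n", conv1 + conv2 + ip1 + ip2); // 431080
	return 0;
}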
The visualization of train.prototxt is as follows:
The content of test.prototxt, used at test time, is as follows:
name: "LeNet-5"layer {name: "data"type: "MemoryData"top: "data" #top: "label"memory_data_param {batch_size: 1channels: 1height: 28width: 28}transform_param {scale: 0.00390625}
}
layer {name: "conv1"type: "Convolution"bottom: "data"top: "conv1"param {lr_mult: 1}param {lr_mult: 2}convolution_param {num_output: 20kernel_size: 5stride: 1weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "pool1"type: "Pooling"bottom: "conv1"top: "pool1"pooling_param {pool: MAXkernel_size: 2stride: 2}
}
layer {name: "conv2"type: "Convolution"bottom: "pool1"top: "conv2"param {lr_mult: 1}param {lr_mult: 2}convolution_param {num_output: 50kernel_size: 5stride: 1weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "pool2"type: "Pooling"bottom: "conv2"top: "pool2"pooling_param {pool: MAXkernel_size: 2stride: 2}
}
layer {name: "ip1"type: "InnerProduct"bottom: "pool2"top: "ip1"param {lr_mult: 1}param {lr_mult: 2}inner_product_param {num_output: 500weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "relu1"type: "ReLU"bottom: "ip1"top: "ip1"
}
layer {name: "ip2"type: "InnerProduct"bottom: "ip1"top: "ip2"param {lr_mult: 1}param {lr_mult: 2}inner_product_param {num_output: 10weight_filler {type: "xavier"}bias_filler {type: "constant"}}
}
layer {name: "prob"type: "Softmax"bottom: "ip2"top: "prob"
}
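As a quick sanity check (a sketch, not from the original post), Caffe's Net API can dump the layers and blob shapes of this test net; the prototxt path below matches the Linux branch used earlier:
#include <cstdio>
#include "caffe/caffe.hpp"

// Sketch: print the layer names and blob shapes of the test net above.
int main()
{
	caffe::Net<float> net("test_data/Net/lenet-5_mnist_linux_test.prototxt", caffe::TEST);
	for (const auto& name : net.layer_names())
		fprintf(stdout, "layer: %s\n", name.c_str());
	for (const auto& name : net.blob_names())
		fprintf(stdout, "blob: %s, shape: %s\n", name.c_str(),
			net.blob_by_name(name)->shape_string().c_str());
	return 0;
}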
The visualization of test.prototxt is as follows:
The output of the training code is as follows:
The output of the test code is as follows:
GitHub: https://github.com/fengbingchun/Caffe_Test