DL / SPP-Net: A Detailed Illustrated Guide to the SPP-Net Algorithm (Introduction and Paper Overview, Architecture Details, Example Applications)
Contents
The paper behind the SPP-Net algorithm
0. Experimental results
1. Highlights of SPP-Net
Design idea of the SPP-Net algorithm
Key steps of SPP-Net
1. ROI pooling layer
2. Convolutional features correspond positionally to the original image
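Related articles
DL / SPP-Net: A Detailed Illustrated Guide to the SPP-Net Algorithm (Introduction and Paper Overview, Architecture Details, Example Applications)
DL / SPP-Net: Architecture Details of the SPP-Net Algorithm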
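The paper behind the SPP-Net algorithm
The first author of SPP-Net is again Kaiming He. The original paper is "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition". The method targets both classification and detection; in the ILSVRC 2014 competition on the ImageNet dataset it ranked second in detection and third in classification.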
Abstract
Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224×224) input image. This requirement is "artificial" and may reduce the recognition accuracy for the images or sub-images of an arbitrary size/scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate the above requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size/scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning.
The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102× faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007.
In ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvement made for this competition.
Conclusion
SPP is a flexible solution for handling different scales, sizes, and aspect ratios. These issues are important in visual recognition, but received little consideration in the context of deep networks. We have suggested a solution to train a deep network with a spatial pyramid pooling layer. The resulting SPP-net shows outstanding accuracy in classification/detection tasks and greatly accelerates DNN-based detection. Our studies also show that many time-proven techniques/insights in computer vision can still play important roles in deep-networks-based recognition.
Reference paper
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun.
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. ECCV 2014
https://arxiv.org/abs/1406.4729
0. Experimental results
1. VOC2007
2. ILSVRC 2014 Classification
3. ILSVRC 2014 Detection
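1. Highlights of SPP-Net
Before SPP-Net, CNNs generally required input images of a fixed size, for example 224*224 (ImageNet models), 32*32 (LeNet), or 96*96. To handle images of arbitrary size, the image first had to be cropped or warped, which loses content or distorts the image to some degree and limits recognition accuracy. Moreover, from a physiological point of view, when the human eye sees an image the brain first treats it as a whole and does not crop or warp it. It is therefore more plausible that the brain gathers shallow information first and only recognizes these arbitrarily shaped objects at a deeper level.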
- Classification: improves all CNN architectures
- Detection: 24~64x faster than R-CNN
- ILSVRC 2014: #2 in detection, #3 in classification.
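To make the "fixed-length representation regardless of image size" idea concrete, here is a minimal NumPy sketch of a spatial pyramid pooling step. It is an illustration under assumptions, not the paper's implementation: the function name `spatial_pyramid_pool`, the pyramid levels (4, 2, 1), and the floor/ceil bin boundaries are choices made for this example.

```python
import math
import numpy as np

def spatial_pyramid_pool(feature_map, levels=(4, 2, 1)):
    """Max-pool a (C, H, W) feature map into a fixed-length vector.

    Level n splits the map into an n x n grid of bins and keeps the
    per-channel maximum of each bin, so the output length is
    C * sum(n * n for n in levels), whatever H and W are.
    """
    channels, height, width = feature_map.shape
    pooled = []
    for n in levels:
        for i in range(n):
            # floor/ceil bin edges: every bin holds at least one cell
            top = (i * height) // n
            bottom = math.ceil((i + 1) * height / n)
            for j in range(n):
                left = (j * width) // n
                right = math.ceil((j + 1) * width / n)
                bin_cells = feature_map[:, top:bottom, left:right]
                pooled.append(bin_cells.max(axis=(1, 2)))
    return np.concatenate(pooled)

# Two feature maps of different spatial size give vectors of the same length.
rng = np.random.default_rng(0)
square = rng.standard_normal((256, 13, 13))
wide = rng.standard_normal((256, 10, 17))
print(spatial_pyramid_pool(square).shape)  # (5376,) = 256 * (16 + 4 + 1)
print(spatial_pyramid_pool(wide).shape)    # (5376,)
```

Because the number of bins is fixed and only their size adapts to the input, the fully connected layers that follow always see the same input dimension, so no crop or warp is needed.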
Design idea of the SPP-Net algorithm
Key steps of SPP-Net
1. ROI pooling layer
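As a rough sketch of region pooling on a shared feature map: the convolutional features are computed once for the whole image, and each candidate region is then pooled from that single map into a fixed-size output. The helper name `roi_max_pool`, the 2 x 2 grid, and the example regions below are hypothetical and chosen only for illustration. SPP-Net itself pools each region with several pyramid levels rather than a single grid.

```python
import math
import numpy as np

def roi_max_pool(feature_map, roi, grid=(2, 2)):
    """Max-pool one region of a (C, H, W) feature map to shape (C, gh, gw).

    roi = (x0, y0, x1, y1) is given in feature-map cells, inclusive.
    """
    x0, y0, x1, y1 = roi
    window = feature_map[:, y0:y1 + 1, x0:x1 + 1]
    channels, height, width = window.shape
    gh, gw = grid
    out = np.empty((channels, gh, gw), dtype=feature_map.dtype)
    for i in range(gh):
        top, bottom = (i * height) // gh, math.ceil((i + 1) * height / gh)
        for j in range(gw):
            left, right = (j * width) // gw, math.ceil((j + 1) * width / gw)
            out[:, i, j] = window[:, top:bottom, left:right].max(axis=(1, 2))
    return out

# Stand-in for the conv features of one forward pass over the full image.
feature_map = np.arange(1 * 6 * 8, dtype=float).reshape(1, 6, 8)

# Hypothetical proposals, already expressed in feature-map coordinates.
rois = [(0, 0, 3, 2), (2, 1, 7, 5)]
pooled = [roi_max_pool(feature_map, roi) for roi in rois]
print(pooled[0][0])  # [[ 9. 11.] [17. 19.]]
print(pooled[1][0])  # [[28. 31.] [44. 47.]]
```

Both regions have different sizes, yet each yields a 2 x 2 output per channel, and the expensive convolution is never repeated per region.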
2. Convolutional features correspond positionally to the original image
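This correspondence is what lets a region proposal drawn on the image be projected onto the feature map. The sketch below follows the rule described, to my understanding, in the appendix of the SPP-Net paper: with S the product of all strides before the feature map, the left/top boundary maps to floor(x / S) + 1 and the right/bottom boundary to ceil(x / S) - 1. The function name `image_box_to_feature_box` and the default stride of 16 are assumptions for this example, and the rule itself presumes each layer pads by about half its filter size.

```python
def image_box_to_feature_box(box, stride=16):
    """Project a box from image pixels onto feature-map cell indices.

    box = (x0, y0, x1, y1) in image pixels; stride is the product of
    the strides of all layers before the feature map (assumed 16 here).
    """
    x0, y0, x1, y1 = box
    left = x0 // stride + 1          # floor(x0 / S) + 1
    top = y0 // stride + 1           # floor(y0 / S) + 1
    right = -(-x1 // stride) - 1     # ceil(x1 / S) - 1
    bottom = -(-y1 // stride) - 1    # ceil(y1 / S) - 1
    return left, top, right, bottom

print(image_box_to_feature_box((64, 32, 320, 240)))   # (5, 3, 19, 14)
print(image_box_to_feature_box((100, 50, 300, 200)))  # (7, 4, 18, 12)
```

The rounding shrinks the box slightly inward, so a very small proposal can collapse to an empty window (right < left); a practical implementation would need to guard against that case.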