DL - SegNet: A Detailed Illustrated Guide to the SegNet Image/Semantic Segmentation Algorithm: Introduction (Paper Overview), Architecture Details, and Example Applications
導(dǎo)讀
基于CNN的神經(jīng)網(wǎng)絡(luò)SegNet算法可進(jìn)行高精度地識別行駛環(huán)境。
Table of Contents
Introduction to the SegNet Image Segmentation Algorithm (Paper Overview)
0. Experimental Results
1. Key Ideas of the SegNet Algorithm
Architecture of the SegNet Image Segmentation Algorithm in Detail
Example Applications of the SegNet Image Segmentation Algorithm
Related Articles
DL - SegNet: A Detailed Illustrated Guide to the SegNet Image Segmentation Algorithm: Introduction (Paper Overview), Architecture Details, and Example Applications
DL - SegNet: Architecture of the SegNet Image Segmentation Algorithm in Detail
Introduction to the SegNet Image Segmentation Algorithm (Paper Overview)
To be updated…
Abstract
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification. The novelty of SegNet lies in the manner in which the decoder upsamples its lower resolution input feature map(s). Specifically, the decoder uses pooling indices computed in the max-pooling step of the corresponding encoder to perform non-linear upsampling. This eliminates the need for learning to upsample. The upsampled maps are sparse and are then convolved with trainable filters to produce dense feature maps. We compare our proposed architecture with the widely adopted FCN [2] and also with the well known DeepLab-LargeFOV [3], DeconvNet [4] architectures. This comparison reveals the memory versus accuracy trade-off involved in achieving good segmentation performance. SegNet was primarily motivated by scene understanding applications. Hence, it is designed to be efficient both in terms of memory and computational time during inference. It is also significantly smaller in the number of trainable parameters than other competing architectures and can be trained end-to-end using stochastic gradient descent. We also performed a controlled benchmark of SegNet and other architectures on both road scenes and SUN RGB-D indoor scene segmentation tasks. These quantitative assessments show that SegNet provides good performance with competitive inference time and most efficient inference memory-wise as compared to other architectures. We also provide a Caffe implementation of SegNet and a web demo at
http://mi.eng.cam.ac.uk/projects/segnet/.
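The index-based upsampling described in the abstract is easy to demonstrate in isolation. Below is a minimal sketch of the mechanism in PyTorch (not the authors' Caffe release; all names here are illustrative): max-pooling records the argmax locations, the decoder reuses them to scatter values back into a sparse map, and a trainable convolution then densifies it.

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

x = torch.rand(1, 1, 4, 4)               # toy 4x4 single-channel feature map
pooled, indices = pool(x)                # 2x2 maxima plus their argmax positions
sparse = unpool(pooled, indices)         # back to 4x4; non-max entries are zero

densify = nn.Conv2d(1, 1, kernel_size=3, padding=1)  # trainable decoder filter
dense = densify(sparse)                  # sparse map convolved into a dense one
print(sparse.squeeze())                  # mostly zeros: the upsampled map is sparse
```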
Conclusion
We presented SegNet, a deep convolutional network architecture for semantic segmentation. The main motivation behind SegNet was the need to design an efficient architecture for road and indoor scene understanding which is efficient both in terms of memory and computational time. We analysed SegNet and compared it with other important variants to reveal the practical trade-offs involved in designing architectures for segmentation, particularly training time, memory versus accuracy. Those architectures which store the encoder network feature maps in full perform best but consume more memory during inference time. SegNet on the other hand is more efficient since it only stores the max-pooling indices of the feature maps and uses them in its decoder network to achieve good performance. On large and well known datasets SegNet performs competitively, achieving high scores for road scene understanding. End-to-end learning of deep segmentation architectures is a harder challenge and we hope to see more attention paid to this important problem.
Paper
Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla. "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 12, Dec. 2017.
arXiv: https://arxiv.org/abs/1511.00561
PDF: https://arxiv.org/pdf/1511.00561.pdf
0、實(shí)驗(yàn)結(jié)果
1. Qualitative comparison: results on CamVid day and dusk test samples
Results on CamVid day and dusk test samples: several test images are shown, covering both daytime and dusk. The compared algorithms are SegNet, FCN, FCN (learned deconvolution), and DeconvNet; among them, only SegNet produces comparatively good segmentations.
2. Quantitative comparison: SegNet versus traditional methods on the CamVid 11-class road scene segmentation problem
Quantitative comparisons of SegNet with traditional methods on the CamVid 11 road class segmentation problem show that SegNet outperforms all the other methods, including those using depth, video, and/or CRFs, on the majority of classes. SegNet's per-class IU scores are consistently high, and its mean IU reaches 60.1%.
1、SegNet算法的關(guān)鍵思路
1. An illustration of the SegNet architecture. There are no fully connected layers and hence it is only convolutional. A decoder upsamples its input using the transferred pool indices from its encoder to produce a sparse feature map(s). It then performs convolution with a trainable filter bank to densify the feature map. The final decoder output feature maps are fed to a soft-max classifier for pixel-wise classification.
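To make the caption above concrete, here is a hypothetical one-stage reduction of that encoder-decoder pattern in PyTorch; the real SegNet mirrors all 13 VGG16 convolutional layers, so every layer count and channel width below is an illustrative simplification:

```python
import torch
import torch.nn as nn

class MiniSegNet(nn.Module):
    """One-stage sketch: conv encoder, max-pooling that returns indices,
    index-based unpooling, conv decoder, and a 1x1 conv producing the
    per-pixel class scores that a soft-max classifier would consume."""
    def __init__(self, in_ch=3, mid_ch=64, num_classes=11):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, 3, padding=1),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True))
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.dec = nn.Sequential(
            nn.Conv2d(mid_ch, mid_ch, 3, padding=1),
            nn.BatchNorm2d(mid_ch), nn.ReLU(inplace=True))
        self.classify = nn.Conv2d(mid_ch, num_classes, 1)  # no fully connected layers

    def forward(self, x):
        f = self.enc(x)
        p, idx = self.pool(f)                          # only the indices need storing
        u = self.unpool(p, idx, output_size=f.size())  # upsample without learning
        return self.classify(self.dec(u))              # logits for pixel-wise soft-max

logits = MiniSegNet()(torch.rand(1, 3, 64, 64))        # -> (1, 11, 64, 64)
```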
2. An illustration of SegNet and FCN [2] decoders. a, b, c, d correspond to values in a feature map. SegNet uses the max pooling indices to upsample (without learning) the feature map(s) and convolves with a trainable decoder filter bank. FCN upsamples by learning to deconvolve the input feature map and adds the corresponding encoder feature map to produce the decoder output. This feature map is the output of the max-pooling layer (includes sub-sampling) in the corresponding encoder. Note that there are no trainable decoder filters in FCN.
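The two decoding schemes in this caption can be contrasted in a few lines. This is a sketch under simplifying assumptions (FCN's skip path actually applies a dimensionality-reducing 1x1 convolution before the addition, omitted here; tensors and channel counts are made up):

```python
import torch
import torch.nn as nn

c = 64
x = torch.rand(1, c, 8, 8)       # low-resolution decoder input
enc = torch.rand(1, c, 16, 16)   # corresponding encoder feature map

# FCN-style decoding: upsampling is learned (deconvolution), then the
# stored encoder feature map is added in.
deconv = nn.ConvTranspose2d(c, c, kernel_size=2, stride=2)
fcn_out = deconv(x) + enc

# SegNet-style decoding: only the max-pooling indices are kept; upsampling
# itself has no parameters, and a trainable conv densifies the sparse result.
pool = nn.MaxPool2d(2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(2, stride=2)
_, idx = pool(enc)                       # the indices the encoder would have stored
segnet_out = nn.Conv2d(c, c, 3, padding=1)(unpool(x, idx))
```

The memory trade-off falls out directly: the FCN path must keep the full encoder feature map (`enc`) around for the addition, while the SegNet path only needs the small integer index tensor (`idx`).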
Architecture of the SegNet Image Segmentation Algorithm in Detail
To be updated…
Example Applications of the SegNet Image Segmentation Algorithm
To be updated…