SoftPool Algorithm Explained
Refining activation downsampling with SoftPool (paper link | code link)
Contents
- 1. Motivation
- 2. Overview of SoftPool
- 3. SoftPool in Detail
- 3.1 Pooling Variants
- 3.2 SoftPool Computation
- 4. SoftPool Code Implementation
- 5. SoftPool Results and Analysis
- 5.1 Qualitative Results and Analysis
- 5.2 Quantitative Results and Analysis
- 6. Summary and Analysis
- References
- Notes
1. Motivation
Pooling layers appear in virtually every computer vision task. Ever since deep learning took off in 2012, pooling layers have shipped alongside convolutional layers. Yet while convolutional layers attract endless discussion and countless improved variants, very few researchers or engineers pay attention to the pooling layer itself.
Today, the two pooling layers in common use are max pooling and average pooling: the former outputs the maximum value within a local region, the latter outputs the mean of that region. Pooling serves two main purposes: (1) it keeps the dominant features while reducing computation and feature redundancy, which helps prevent overfitting; (2) it provides a degree of invariance to transformations such as translation, scale, and rotation.
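To make the two standard operators concrete, here is a minimal PyTorch sketch (purely illustrative, not taken from the paper) that applies 2×2 max pooling and average pooling to a small feature map:

```python
import torch
import torch.nn.functional as F

# A toy 1x1x4x4 feature map, used only to illustrate the two operators.
x = torch.arange(1., 17.).reshape(1, 1, 4, 4)

# Max pooling keeps the largest activation in each 2x2 window.
print(F.max_pool2d(x, kernel_size=2))  # [[[[ 6.,  8.], [14., 16.]]]]

# Average pooling keeps the mean of each 2x2 window.
print(F.avg_pool2d(x, kernel_size=2))  # [[[[ 3.5,  5.5], [11.5, 13.5]]]]
```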
A large body of experimental evidence shows that both operations discard much of the information in the feature map during pooling, which hurts the performance of the whole network. SoftPool was proposed to reduce this information loss as much as possible.
2. Overview of SoftPool
SoftPool is a pooling variant that preserves the function of a pooling layer while minimizing the information lost in the process. As the figure below illustrates, P1, P2, P3, and P4 denote a 2×2 region of the input. The formula below first converts P1 through P4 into the blue weight region; the 2×2 green region is then multiplied element-wise with the 2×2 blue region and the products are summed to produce the final output.
$$\frac{e^{p_i}}{\sum_{j=1}^{4} e^{p_j}}$$
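As a concrete illustration of the formula, the following sketch (with made-up values for P1 through P4) computes the softmax weights and the resulting SoftPool output for a single 2×2 region:

```python
import torch

# Hypothetical activations P1..P4 of one 2x2 region.
p = torch.tensor([1.0, 2.0, 3.0, 4.0])

# Blue region in the figure: w_i = exp(p_i) / sum_j exp(p_j)
w = torch.softmax(p, dim=0)

# Final output: element-wise product of weights and activations, then summed.
out = (w * p).sum()
print(w)    # tensor([0.0321, 0.0871, 0.2369, 0.6439])
print(out)  # tensor(3.4927) -- between the mean (2.5) and the max (4.0)
```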
3. SoftPool in Detail
3.1 Pooling Variants
The figure above shows a number of pooling variants: Average Pooling, Max Pooling, Power Average Pooling, Stochastic Pooling, S3 Pooling, Local Importance Pooling, and SoftPool. A few observations: (1) most of these operators are variants of either max pooling or average pooling; (2) S3 Pooling is closest in spirit to max pooling, while most of the others are variants of average pooling; (3) Local Importance Pooling and SoftPool follow a similar idea: both compute a weight map for the input region and then accumulate the weighted values.
3.2 SoftPool Computation
The figure above illustrates the forward and backward passes of SoftPool; the 6×6 grid represents the activation map a.
The forward pass has two steps: (1) compute the weights w for the candidate 3×3 region; (2) multiply the weights w with the activations a and sum the products to obtain $\tilde{a}$.
The backward pass has two steps: (1) take the gradient $\nabla\tilde{a}$ of the pooled output $\tilde{a}$; (2) multiply $\nabla\tilde{a}$ by the weights w to obtain the input gradient $\nabla a$.
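The same forward pass can be written in a few lines of plain PyTorch, and autograd then reproduces exactly the backward behavior described above, redistributing the output gradient through the weights w. The following is a minimal reference sketch of my own (not the official CUDA kernel):

```python
import torch
import torch.nn.functional as F

def softpool2d_reference(a, kernel_size=2, stride=2):
    """Naive SoftPool: an exponentially weighted average over each pooling window."""
    e_a = torch.exp(a)
    # avg_pool2d computes window means, so the window-size factor cancels in the ratio:
    # output = sum(a * e^a) / sum(e^a) over each window.
    num = F.avg_pool2d(a * e_a, kernel_size, stride=stride)
    den = F.avg_pool2d(e_a, kernel_size, stride=stride)
    return num / den

a = torch.randn(1, 3, 6, 6, requires_grad=True)   # 6x6 activation map, as in the figure
a_tilde = softpool2d_reference(a)
a_tilde.sum().backward()                          # backward pass handled by autograd
print(a_tilde.shape, a.grad.shape)                # torch.Size([1, 3, 3, 3]) torch.Size([1, 3, 6, 6])
```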
4. SoftPool Code Implementation
The soft_pool1d implementation is shown below:
```python
import torch
import torch.nn.functional as F
from torch.nn.modules.utils import _single, _pair, _triple
# CUDA_SOFTPOOL1d / CUDA_SOFTPOOL2d / CUDA_SOFTPOOL3d are autograd Functions provided by
# the compiled CUDA extension in the official SoftPool repository.

'''
--- S T A R T  O F  F U N C T I O N  S O F T _ P O O L 1 D ---
    [About]
        Function for downsampling based on the exponential proportion rate of pixels (soft pooling).
        If the tensor is in CUDA the custom operation is used. Alternatively, the function uses
        standard (mostly) in-place PyTorch operations for speed and reduced memory consumption.
        It is also possible to use non-inplace operations in order to improve stability.
    [Args]
        - x: PyTorch Tensor, could be in either cpu or CUDA. If in CUDA the homonym extension is used.
        - kernel_size: Integer or Tuple, for the kernel size to be used for downsampling. If an `Integer`
              is used, a `Tuple` is created for the rest of the dimensions. Defaults to 2.
        - stride: Integer or Tuple, for the steps taken between kernels (i.e. strides). If `None` the
              strides become equal to the `kernel_size` tuple. Defaults to `None`.
        - force_inplace: Bool, determines if in-place operations are to be used regardless of the CUDA
              custom op. Mostly useful for time monitoring. Defaults to `False`.
    [Returns]
        - PyTorch Tensor, subsampled based on the specified `kernel_size` and `stride`
'''
def soft_pool1d(x, kernel_size=2, stride=None, force_inplace=False):
    if x.is_cuda and not force_inplace:
        x = CUDA_SOFTPOOL1d.apply(x, kernel_size, stride)
        # Replace `NaN's if found
        if torch.isnan(x).any():
            return torch.nan_to_num(x)
        return x
    kernel_size = _single(kernel_size)
    if stride is None:
        stride = kernel_size
    else:
        stride = _single(stride)
    # Get input sizes
    _, c, d = x.size()
    # Create per-element exponential value sum : Tensor [b x c x d]
    e_x = torch.exp(x)
    # Apply mask to input and pool and calculate the exponential sum
    # Tensor: [b x c x d] -> [b x c x d']
    return F.avg_pool1d(x.mul(e_x), kernel_size, stride=stride).mul_(sum(kernel_size)).div_(
        F.avg_pool1d(e_x, kernel_size, stride=stride).mul_(sum(kernel_size)))
```

The soft_pool2d implementation is shown below:
```python
'''
--- S T A R T  O F  F U N C T I O N  S O F T _ P O O L 2 D ---
    [About]
        Function for downsampling based on the exponential proportion rate of pixels (soft pooling).
        If the tensor is in CUDA the custom operation is used. Alternatively, the function uses
        standard (mostly) in-place PyTorch operations for speed and reduced memory consumption.
        It is also possible to use non-inplace operations in order to improve stability.
    [Args]
        - x: PyTorch Tensor, could be in either cpu or CUDA. If in CUDA the homonym extension is used.
        - kernel_size: Integer or Tuple, for the kernel size to be used for downsampling. If an `Integer`
              is used, a `Tuple` is created for the rest of the dimensions. Defaults to 2.
        - stride: Integer or Tuple, for the steps taken between kernels (i.e. strides). If `None` the
              strides become equal to the `kernel_size` tuple. Defaults to `None`.
        - force_inplace: Bool, determines if in-place operations are to be used regardless of the CUDA
              custom op. Mostly useful for time monitoring. Defaults to `False`.
    [Returns]
        - PyTorch Tensor, subsampled based on the specified `kernel_size` and `stride`
'''
def soft_pool2d(x, kernel_size=2, stride=None, force_inplace=False):
    if x.is_cuda and not force_inplace:
        x = CUDA_SOFTPOOL2d.apply(x, kernel_size, stride)
        # Replace `NaN's if found
        if torch.isnan(x).any():
            return torch.nan_to_num(x)
        return x
    kernel_size = _pair(kernel_size)
    if stride is None:
        stride = kernel_size
    else:
        stride = _pair(stride)
    # Get input sizes
    _, c, h, w = x.size()
    # Create per-element exponential value sum : Tensor [b x c x h x w]
    e_x = torch.exp(x)
    # Apply mask to input and pool and calculate the exponential sum
    # Tensor: [b x c x h x w] -> [b x c x h' x w']
    return F.avg_pool2d(x.mul(e_x), kernel_size, stride=stride).mul_(sum(kernel_size)).div_(
        F.avg_pool2d(e_x, kernel_size, stride=stride).mul_(sum(kernel_size)))
```
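A quick sanity check of soft_pool2d on a random CPU tensor (hypothetical usage, assuming the imports introduced above):

```python
x = torch.randn(1, 3, 8, 8)                  # CPU tensor, so the pure-PyTorch branch is taken
y = soft_pool2d(x, kernel_size=2, stride=2)
print(y.shape)                               # torch.Size([1, 3, 4, 4])
```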
The soft_pool3d implementation is shown below:

```python
'''
--- S T A R T  O F  F U N C T I O N  S O F T _ P O O L 3 D ---
    [About]
        Function for downsampling based on the exponential proportion rate of pixels (soft pooling).
        If the tensor is in CUDA the custom operation is used. Alternatively, the function uses
        standard (mostly) in-place PyTorch operations for speed and reduced memory consumption.
        It is also possible to use non-inplace operations in order to improve stability.
    [Args]
        - x: PyTorch Tensor, could be in either cpu or CUDA. If in CUDA the homonym extension is used.
        - kernel_size: Integer or Tuple, for the kernel size to be used for downsampling. If an `Integer`
              is used, a `Tuple` is created for the rest of the dimensions. Defaults to 2.
        - stride: Integer or Tuple, for the steps taken between kernels (i.e. strides). If `None` the
              strides become equal to the `kernel_size` tuple. Defaults to `None`.
        - force_inplace: Bool, determines if in-place operations are to be used regardless of the CUDA
              custom op. Mostly useful for time monitoring. Defaults to `False`.
    [Returns]
        - PyTorch Tensor, subsampled based on the specified `kernel_size` and `stride`
'''
def soft_pool3d(x, kernel_size=2, stride=None, force_inplace=False):
    if x.is_cuda and not force_inplace:
        x = CUDA_SOFTPOOL3d.apply(x, kernel_size, stride)
        # Replace `NaN's if found
        if torch.isnan(x).any():
            return torch.nan_to_num(x)
        return x
    kernel_size = _triple(kernel_size)
    if stride is None:
        stride = kernel_size
    else:
        stride = _triple(stride)
    # Get input sizes
    _, c, d, h, w = x.size()
    # Create per-element exponential value sum : Tensor [b x c x d x h x w]
    e_x = torch.exp(x)
    # Apply mask to input and pool and calculate the exponential sum
    # Tensor: [b x c x d x h x w] -> [b x c x d' x h' x w']
    return F.avg_pool3d(x.mul(e_x), kernel_size, stride=stride).mul_(sum(kernel_size)).div_(
        F.avg_pool3d(e_x, kernel_size, stride=stride).mul_(sum(kernel_size)))
```

5. SoftPool Results and Analysis
5.1 Qualitative Results and Analysis
The figure above shows SoftPool applied to several test images. For an objective comparison, the author also shows the outputs of max pooling and average pooling (see the original figure for details). Two preliminary observations: (1) compared with the original image, SoftPool retains the most detail, average pooling comes next, and max pooling loses the most information; (2) in terms of computational cost, SoftPool is the most expensive, average pooling is next, and max pooling is the cheapest.
5.2 Quantitative Results and Analysis
The table above reports the forward and backward running times of the pooling operations for five different kernel sizes. Preliminary observations: (1) on CPU, average pooling is the fastest, SoftPool comes next, and max pooling is the slowest; (2) the same ordering holds on CUDA; (3) in terms of memory, average pooling uses the least, SoftPool comes next, and max pooling uses the most.
The table above reports classification accuracy on several classification models after the original pooling layers are replaced with SoftPool. Preliminary observations: (1) SoftPool clearly improves both top-1 and top-5 accuracy over the original pooling operations across the different models; (2) for the ResNet architectures, GFLOPs grow correspondingly as the number of parameters increases.
6. Summary and Analysis
SoftPool is a pooling variant that preserves the function of a pooling layer while minimizing the information lost during downsampling. Extensive experiments show that it outperforms the standard average pooling and max pooling operations.
As network design grows ever harder, and methods such as NAS can barely deliver further large performance gains, optimizing the basic network layers is a reliable and effective way to break through this bottleneck and improve accuracy. Once such an improvement is proposed, it can be extended to many different computer vision tasks.
References
[1] Original paper: Refining activation downsampling with SoftPool
Notes
[1] This is my original blog post. If you are interested and would like to repost it, please contact me (QQ email: 1575262785@qq.com); I will reply as soon as possible. Thank you for your attention.
[2] My ability is limited, so this post may contain mistakes; suggestions for improvement are very welcome.
[3] If anything in this post is unclear, feel free to contact me; I will reply promptly and am happy to discuss.
[4] In my spare time I take on undergraduate thesis work and other projects, including image processing (data mining, machine learning, deep learning, etc.), MATLAB simulation, and Python algorithms and simulation. If needed, add QQ 1575262785 and mention "project".