A code walkthrough of the official PyTorch implementation of ResNet
Contents
- A code walkthrough of the official PyTorch implementation of ResNet
- Preface
- Overview
- Structure comparison: the 34-layer "plain" network vs. the "residual" network
- Architecture design of the different ResNet variants
- A detailed look at the ResNet code
Preface
PyTorch officially provides reference implementations of today's common classic networks. Looking closely at these implementations, the official code is fairly lean: it concentrates on the plainest form of each architecture and avoids extra tricks. Techniques that sit outside the core structure, such as grouped convolution and dilated convolution, are effectively switched off (the number of groups defaults to 1 and the dilation factor defaults to 1), which removes a lot of unnecessary distraction when the goal is simply to understand the network structure.
Overview
Structure comparison: the 34-layer "plain" network vs. the "residual" network
We first take a 34-layer network and compare its structure before and after the residual connections are added.
Architecture design of the different ResNet variants
ResNet comes in many structural variants, but the overall framework is the same: a skip connection is added on top of the original "straight-through" (plain) structure, and what changes between variants is the depth and the parameter count; a shorthand sketch of the skip-connection idea follows this paragraph.
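The sketch below is illustrative only (it is not the official code, which comes later): a residual block learns a correction F(x) and adds the unmodified input back onto it.

import torch
import torch.nn as nn

class TinyResidualBlock(nn.Module):
    """Minimal residual block: out = relu(F(x) + x), with F = conv-bn-relu-conv-bn."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + x)   # the "+ x" is the skip connection

x = torch.randn(1, 64, 56, 56)
print(TinyResidualBlock(64)(x).shape)        # torch.Size([1, 64, 56, 56])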
A detailed look at the ResNet code
This blog series focuses on the details of the code and skips the theoretical discussion of the architecture's strengths and weaknesses; below we walk through what ResNet concretely does, step by step.
import torch
from torch import Tensor
import torch.nn as nn
from .utils import load_state_dict_from_url
from typing import Type, Any, Callable, Union, List, Optional

# names of all available network models
__all__ = ['ResNet', 'resnet18', 'resnet34', 'resnet50', 'resnet101', 'resnet152',
           'resnext50_32x4d', 'resnext101_32x8d', 'wide_resnet50_2', 'wide_resnet101_2']

# download URLs of the pretrained weights; indexing the model_urls dict with a key
# returns the corresponding value, i.e. the download address
model_urls = {
    'resnet18': 'https://download.pytorch.org/models/resnet18-5c106cde.pth',
    'resnet34': 'https://download.pytorch.org/models/resnet34-333f7ec4.pth',
    'resnet50': 'https://download.pytorch.org/models/resnet50-19c8e357.pth',
    'resnet101': 'https://download.pytorch.org/models/resnet101-5d3b4d8f.pth',
    'resnet152': 'https://download.pytorch.org/models/resnet152-b121ed2d.pth',
    'resnext50_32x4d': 'https://download.pytorch.org/models/resnext50_32x4d-7cdf4587.pth',
    'resnext101_32x8d': 'https://download.pytorch.org/models/resnext101_32x8d-8ba56ff5.pth',
    'wide_resnet50_2': 'https://download.pytorch.org/models/wide_resnet50_2-95faca4d.pth',
    'wide_resnet101_2': 'https://download.pytorch.org/models/wide_resnet101_2-32ee1156.pth',
}

# thin wrappers around nn.Conv2d; the first argument is the number of input channels,
# the second the number of output channels. The wrapper is mostly a convenience: one could
# call nn.Conv2d directly with in/out channels, kernel size, stride, padding, groups,
# dilation and bias instead.
def conv3x3(in_planes: int, out_planes: int, stride: int = 1, groups: int = 1,
            dilation: int = 1) -> nn.Conv2d:
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                     padding=dilation, groups=groups, bias=False, dilation=dilation)


def conv1x1(in_planes: int, out_planes: int, stride: int = 1) -> nn.Conv2d:
    return nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride, bias=False)


# the basic residual block, built from two stacked 3x3 convolutions
class BasicBlock(nn.Module):
    expansion: int = 1

    def __init__(
        self,
        inplanes: int,
        planes: int,
        stride: int = 1,
        downsample: Optional[nn.Module] = None,
        # grouped convolution; groups = 1 means no grouping
        groups: int = 1,
        base_width: int = 64,
        # dilation factor of the dilated convolution; 1 means an ordinary convolution
        dilation: int = 1,
        norm_layer: Optional[Callable[..., nn.Module]] = None
    ) -> None:
        # call the parent constructor; by convention the class body starts with
        # super(class_name, self).__init__()
        super(BasicBlock, self).__init__()
        # batch normalization layer
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        # sanity checks
        if groups != 1 or base_width != 64:
            raise ValueError('BasicBlock only supports groups=1 and base_width=64')
        if dilation > 1:
            raise NotImplementedError("Dilation > 1 not supported in BasicBlock")
        # the plain conv -> bn -> relu pipeline
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x: Tensor) -> Tensor:
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        # if the main branch changes the spatial size or channel count, the shortcut is
        # projected through `downsample` so that both tensors end up with the same shape
        # and can be added together
        if self.downsample is not None:
            identity = self.downsample(x)

        # add the shortcut back in
        out += identity
        out = self.relu(out)

        return out
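A quick sanity check of BasicBlock on a dummy tensor (an illustrative sketch, not part of the official file; it reuses the imports and class above): with stride 1 and no downsample module, the block preserves the input shape, which is exactly what lets the addition work.

block = BasicBlock(inplanes=64, planes=64)
y = block(torch.randn(1, 64, 56, 56))
print(y.shape)   # torch.Size([1, 64, 56, 56])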
# The "bottleneck" residual block. Compared with the basic block, the two 3x3 convolutions
# are replaced by a 1x1, 3x3, 1x1 sequence: the first 1x1 conv compresses the channel
# dimension and the last 1x1 conv restores it.
# All convolutions use bias=False because every convolution is followed by a batch norm;
# BN re-normalizes the activations, which cancels the effect of a bias term, so no bias is needed.
class Bottleneck(nn.Module):
    expansion: int = 4

    def __init__(
        self,
        inplanes: int,
        planes: int,
        stride: int = 1,
        downsample: Optional[nn.Module] = None,
        groups: int = 1,
        base_width: int = 64,
        dilation: int = 1,
        norm_layer: Optional[Callable[..., nn.Module]] = None
    ) -> None:
        super(Bottleneck, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        width = int(planes * (base_width / 64.)) * groups
        self.conv1 = conv1x1(inplanes, width)
        self.bn1 = norm_layer(width)
        self.conv2 = conv3x3(width, width, stride, groups, dilation)
        self.bn2 = norm_layer(width)
        self.conv3 = conv1x1(width, planes * self.expansion)
        self.bn3 = norm_layer(planes * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x: Tensor) -> Tensor:
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        # as in BasicBlock: when the main branch changes the shape, the shortcut is
        # projected by `downsample` so that the two tensors can be added
        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out


class ResNet(nn.Module):
    def __init__(
        self,
        block: Type[Union[BasicBlock, Bottleneck]],
        layers: List[int],
        num_classes: int = 1000,
        zero_init_residual: bool = False,
        groups: int = 1,
        width_per_group: int = 64,
        replace_stride_with_dilation: Optional[List[bool]] = None,
        norm_layer: Optional[Callable[..., nn.Module]] = None
    ) -> None:
        super(ResNet, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        self._norm_layer = norm_layer

        self.inplanes = 64
        self.dilation = 1
        if replace_stride_with_dilation is None:
            replace_stride_with_dilation = [False, False, False]
        if len(replace_stride_with_dilation) != 3:
            raise ValueError("replace_stride_with_dilation should be None "
                             "or a 3-element tuple, got {}".format(replace_stride_with_dilation))
        self.groups = groups
        self.base_width = width_per_group
        self.conv1 = nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = norm_layer(self.inplanes)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.layer1 = self._make_layer(block, 64, layers[0])
        self.layer2 = self._make_layer(block, 128, layers[1], stride=2,
                                       dilate=replace_stride_with_dilation[0])
        self.layer3 = self._make_layer(block, 256, layers[2], stride=2,
                                       dilate=replace_stride_with_dilation[1])
        self.layer4 = self._make_layer(block, 512, layers[3], stride=2,
                                       dilate=replace_stride_with_dilation[2])
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512 * block.expansion, num_classes)

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)

        # optionally zero-initialise the last BN in each residual branch so that
        # every block starts out as an identity mapping
        if zero_init_residual:
            for m in self.modules():
                if isinstance(m, Bottleneck):
                    nn.init.constant_(m.bn3.weight, 0)
                elif isinstance(m, BasicBlock):
                    nn.init.constant_(m.bn2.weight, 0)

    def _make_layer(self, block: Type[Union[BasicBlock, Bottleneck]], planes: int,
                    blocks: int, stride: int = 1, dilate: bool = False) -> nn.Sequential:
        norm_layer = self._norm_layer
        downsample = None
        previous_dilation = self.dilation
        if dilate:
            self.dilation *= stride
            stride = 1
        if stride != 1 or self.inplanes != planes * block.expansion:
            # downsample is just a 1x1 convolution followed by BN; its job is to make the
            # channel count (and spatial size) of the shortcut match the residual branch
            downsample = nn.Sequential(
                conv1x1(self.inplanes, planes * block.expansion, stride),
                norm_layer(planes * block.expansion),
            )

        layers = []
        layers.append(block(self.inplanes, planes, stride, downsample, self.groups,
                            self.base_width, previous_dilation, norm_layer))
        self.inplanes = planes * block.expansion
        for _ in range(1, blocks):
            layers.append(block(self.inplanes, planes, groups=self.groups,
                                base_width=self.base_width, dilation=self.dilation,
                                norm_layer=norm_layer))

        # packing the blocks into an nn.Sequential means the forward pass does not have to
        # spell out every conv/relu/maxpool call; the whole stage becomes one callable.
        # the single * in *layers unpacks the list into separate positional arguments
        return nn.Sequential(*layers)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)

        x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)

        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)
# given BasicBlock or Bottleneck plus the layers list, build and return a ResNet model
def _resnet(arch: str, block: Type[Union[BasicBlock, Bottleneck]], layers: List[int],
            pretrained: bool, progress: bool, **kwargs: Any) -> ResNet:
    model = ResNet(block, layers, **kwargs)
    if pretrained:
        state_dict = load_state_dict_from_url(model_urls[arch], progress=progress)
        model.load_state_dict(state_dict)
    return model


# the predefined ResNet variants
def resnet18(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    return _resnet('resnet18', BasicBlock, [2, 2, 2, 2], pretrained, progress, **kwargs)


def resnet34(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-34 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet34', BasicBlock, [3, 4, 6, 3], pretrained, progress, **kwargs)


def resnet50(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-50 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet50', Bottleneck, [3, 4, 6, 3], pretrained, progress, **kwargs)


def resnet101(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-101 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet101', Bottleneck, [3, 4, 23, 3], pretrained, progress, **kwargs)


def resnet152(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNet-152 model from
    `"Deep Residual Learning for Image Recognition" <https://arxiv.org/pdf/1512.03385.pdf>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    return _resnet('resnet152', Bottleneck, [3, 8, 36, 3], pretrained, progress, **kwargs)


def resnext50_32x4d(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNeXt-50 32x4d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/pdf/1611.05431.pdf>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 4
    return _resnet('resnext50_32x4d', Bottleneck, [3, 4, 6, 3], pretrained, progress, **kwargs)


def resnext101_32x8d(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""ResNeXt-101 32x8d model from
    `"Aggregated Residual Transformation for Deep Neural Networks" <https://arxiv.org/pdf/1611.05431.pdf>`_.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['groups'] = 32
    kwargs['width_per_group'] = 8
    return _resnet('resnext101_32x8d', Bottleneck, [3, 4, 23, 3], pretrained, progress, **kwargs)
def wide_resnet50_2(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""Wide ResNet-50-2 model from
    `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_.

    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _resnet('wide_resnet50_2', Bottleneck, [3, 4, 6, 3], pretrained, progress, **kwargs)


def wide_resnet101_2(pretrained: bool = False, progress: bool = True, **kwargs: Any) -> ResNet:
    r"""Wide ResNet-101-2 model from
    `"Wide Residual Networks" <https://arxiv.org/pdf/1605.07146.pdf>`_.

    The model is the same as ResNet except for the bottleneck number of channels
    which is twice larger in every block. The number of channels in outer 1x1
    convolutions is the same, e.g. last block in ResNet-50 has 2048-512-2048
    channels, and in Wide ResNet-50-2 has 2048-1024-2048.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
        progress (bool): If True, displays a progress bar of the download to stderr
    """
    kwargs['width_per_group'] = 64 * 2
    return _resnet('wide_resnet101_2', Bottleneck, [3, 4, 23, 3], pretrained, progress, **kwargs)

What exactly are the input and output tensor shapes of each layer in resnet18? Feeding a 3x224x224 tensor into the network, the per-layer output shapes are printed below.
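One way to produce this kind of per-layer summary is the third-party torchsummary package (an assumption on my part; the original post does not name the tool it used):

from torchvision.models import resnet18
from torchsummary import summary   # third-party package: pip install torchsummary

model = resnet18()
# prints the output shape and parameter count of every layer for a 3x224x224 input
summary(model, input_size=(3, 224, 224), device="cpu")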
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1         [-1, 64, 112, 112]           9,408
       BatchNorm2d-2         [-1, 64, 112, 112]             128
              ReLU-3         [-1, 64, 112, 112]               0
         MaxPool2d-4           [-1, 64, 56, 56]               0
            Conv2d-5           [-1, 64, 56, 56]          36,864
       BatchNorm2d-6           [-1, 64, 56, 56]             128
              ReLU-7           [-1, 64, 56, 56]               0
            Conv2d-8           [-1, 64, 56, 56]          36,864
       BatchNorm2d-9           [-1, 64, 56, 56]             128
             ReLU-10           [-1, 64, 56, 56]               0
       BasicBlock-11           [-1, 64, 56, 56]               0
           Conv2d-12           [-1, 64, 56, 56]          36,864
      BatchNorm2d-13           [-1, 64, 56, 56]             128
             ReLU-14           [-1, 64, 56, 56]               0
           Conv2d-15           [-1, 64, 56, 56]          36,864
      BatchNorm2d-16           [-1, 64, 56, 56]             128
             ReLU-17           [-1, 64, 56, 56]               0
       BasicBlock-18           [-1, 64, 56, 56]               0
           Conv2d-19          [-1, 128, 28, 28]          73,728
      BatchNorm2d-20          [-1, 128, 28, 28]             256
             ReLU-21          [-1, 128, 28, 28]               0
           Conv2d-22          [-1, 128, 28, 28]         147,456
      BatchNorm2d-23          [-1, 128, 28, 28]             256
           Conv2d-24          [-1, 128, 28, 28]           8,192
      BatchNorm2d-25          [-1, 128, 28, 28]             256
             ReLU-26          [-1, 128, 28, 28]               0
       BasicBlock-27          [-1, 128, 28, 28]               0
           Conv2d-28          [-1, 128, 28, 28]         147,456
      BatchNorm2d-29          [-1, 128, 28, 28]             256
             ReLU-30          [-1, 128, 28, 28]               0
           Conv2d-31          [-1, 128, 28, 28]         147,456
      BatchNorm2d-32          [-1, 128, 28, 28]             256
             ReLU-33          [-1, 128, 28, 28]               0
       BasicBlock-34          [-1, 128, 28, 28]               0
           Conv2d-35          [-1, 256, 14, 14]         294,912
      BatchNorm2d-36          [-1, 256, 14, 14]             512
             ReLU-37          [-1, 256, 14, 14]               0
           Conv2d-38          [-1, 256, 14, 14]         589,824
      BatchNorm2d-39          [-1, 256, 14, 14]             512
           Conv2d-40          [-1, 256, 14, 14]          32,768
      BatchNorm2d-41          [-1, 256, 14, 14]             512
             ReLU-42          [-1, 256, 14, 14]               0
       BasicBlock-43          [-1, 256, 14, 14]               0
           Conv2d-44          [-1, 256, 14, 14]         589,824
      BatchNorm2d-45          [-1, 256, 14, 14]             512
             ReLU-46          [-1, 256, 14, 14]               0
           Conv2d-47          [-1, 256, 14, 14]         589,824
      BatchNorm2d-48          [-1, 256, 14, 14]             512
             ReLU-49          [-1, 256, 14, 14]               0
       BasicBlock-50          [-1, 256, 14, 14]               0
           Conv2d-51            [-1, 512, 7, 7]       1,179,648
      BatchNorm2d-52            [-1, 512, 7, 7]           1,024
             ReLU-53            [-1, 512, 7, 7]               0
           Conv2d-54            [-1, 512, 7, 7]       2,359,296
      BatchNorm2d-55            [-1, 512, 7, 7]           1,024
           Conv2d-56            [-1, 512, 7, 7]         131,072
      BatchNorm2d-57            [-1, 512, 7, 7]           1,024
             ReLU-58            [-1, 512, 7, 7]               0
       BasicBlock-59            [-1, 512, 7, 7]               0
           Conv2d-60            [-1, 512, 7, 7]       2,359,296
      BatchNorm2d-61            [-1, 512, 7, 7]           1,024
             ReLU-62            [-1, 512, 7, 7]               0
           Conv2d-63            [-1, 512, 7, 7]       2,359,296
      BatchNorm2d-64            [-1, 512, 7, 7]           1,024
             ReLU-65            [-1, 512, 7, 7]               0
       BasicBlock-66            [-1, 512, 7, 7]               0
================================================================
Total params: 11,176,512
Trainable params: 11,176,512
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.57
Forward/backward pass size (MB): 62.78
Params size (MB): 42.64
Estimated Total Size (MB): 105.99
----------------------------------------------------------------

With the same 3x224x224 input, resnet50 would use the Bottleneck block instead. So what exactly is a BasicBlock and what is a Bottleneck?
As shown in the figure above, the left and right diagrams are BasicBlock and Bottleneck respectively. Comparing the two structures, Bottleneck adds a 1x1 convolution at the beginning and at the end of the block. A 1x1 convolution only remaps the channel dimension, so by choosing the number of 1x1 kernels the channel count can be adjusted conveniently.
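A quick illustration of this point (a minimal sketch, not part of the official file): a 1x1 convolution leaves the spatial size untouched and only changes the number of channels.

import torch
import torch.nn as nn

x = torch.randn(1, 256, 14, 14)                           # 256-channel 14x14 feature map
reduce = nn.Conv2d(256, 64, kernel_size=1, bias=False)    # 1x1 conv: compress channels
expand = nn.Conv2d(64, 256, kernel_size=1, bias=False)    # 1x1 conv: restore channels
print(reduce(x).shape)                                    # torch.Size([1, 64, 14, 14])
print(expand(reduce(x)).shape)                            # torch.Size([1, 256, 14, 14])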
The per-layer output shapes of resnet50 are:
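For resnet50 the same summary call as above applies; as a quick end-to-end check (an illustrative sketch), a forward pass on a 3x224x224 input ends in 1000 class logits, one per ImageNet class:

import torch
from torchvision.models import resnet50

model = resnet50()
logits = model(torch.randn(1, 3, 224, 224))
print(logits.shape)   # torch.Size([1, 1000])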
Summary
This post walked through the official torchvision ResNet code: the conv3x3/conv1x1 helpers, the BasicBlock and Bottleneck residual blocks, the ResNet class that stacks them stage by stage with _make_layer, and the factory functions that define the common variants (resnet18 through wide_resnet101_2).