Deep Learning Paper: ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation and Its PyTorch Implementation


PDF: https://arxiv.org/abs/1606.02147

1 Overview

ENet is a work from 2016 that achieves real-time semantic segmentation, even on embedded devices such as the NVIDIA TX1, while still maintaining good accuracy.

2 Network architecture

2-1 ENet initial block

PyTorch code:

import torch
import torch.nn as nn

class InitialBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(InitialBlock, self).__init__()
        # Main branch: 3x3 conv with stride 2 produces (out_channels - in_channels) maps
        self.conv = nn.Conv2d(in_channels, out_channels - in_channels,
                              kernel_size=3, stride=2, padding=1, bias=False)
        # Extension branch: max pooling keeps the original in_channels maps
        self.pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.bn = nn.BatchNorm2d(out_channels)
        self.relu = nn.PReLU()

    def forward(self, x):
        # Concatenate the two branches along channels, then BN + PReLU
        return self.relu(self.bn(torch.cat([self.conv(x), self.pool(x)], dim=1)))
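As a quick sanity check (the input size here is arbitrary), the two branches together yield out_channels feature maps at half the input resolution:

block = InitialBlock(in_channels=3, out_channels=16)
x = torch.randn(1, 3, 512, 512)
print(block(x).shape)  # torch.Size([1, 16, 256, 256])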

2-2 ENet bottleneck module

Downsampling bottleneck: the main branch consists of three convolutional layers:

first a 2×2 projection convolution with stride 2 that performs the downsampling; then a convolution of one of three kinds (a regular convolution, an asymmetric, i.e. factorized, convolution, or a dilated convolution); and finally a 1×1 convolution that expands the channels back.

Note that every convolutional layer is followed by Batch Norm and PReLU. The side branch consists of a max-pooling layer and a padding layer:

max pooling extracts contextual information, while zero padding fills the extra channels so that the side branch matches the main branch for the residual fusion that follows.

After the fusion, another PReLU is applied.

Non-downsampling (regular) bottleneck: the main branch again consists of three convolutional layers, while the side branch is simply the identity shortcut. The code below implements all three bottleneck variants (regular, downsampling, and upsampling); the small convolution helpers it relies on are sketched first.
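The bottleneck code references several helper blocks (Conv1x1BNReLU, Conv3x3BNReLU, Conv2x2BNReLU, Conv1x1BN, AsymmetricConv, TransposeConv3x3BNReLU) that the post does not define. Below is a minimal sketch of plausible definitions, with signatures inferred from the call sites; the original author's versions may differ in detail:

import torch
import torch.nn as nn

def Conv1x1BNReLU(in_ch, out_ch, is_relu=False):
    # 1x1 conv + BN + activation (PReLU by default, ReLU in the decoder)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True) if is_relu else nn.PReLU())

def Conv3x3BNReLU(in_ch, out_ch, stride=1, dilation=1, is_relu=False):
    # 3x3 conv (optionally dilated) + BN + activation; padding=dilation keeps size at stride 1
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride,
                  padding=dilation, dilation=dilation, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True) if is_relu else nn.PReLU())

def Conv2x2BNReLU(in_ch, out_ch, is_relu=False):
    # 2x2 strided projection used by the downsampling bottleneck
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=2, stride=2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True) if is_relu else nn.PReLU())

def Conv1x1BN(in_ch, out_ch):
    # 1x1 conv + BN without activation (shortcut projection in the upsampling bottleneck)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch))

def AsymmetricConv(channels, stride=1, is_relu=False):
    # Factorized 5x1 followed by 1x5 convolution (ENet uses kernel size 5 here)
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=(5, 1), stride=stride,
                  padding=(2, 0), bias=False),
        nn.Conv2d(channels, channels, kernel_size=(1, 5), stride=stride,
                  padding=(0, 2), bias=False),
        nn.BatchNorm2d(channels),
        nn.ReLU(inplace=True) if is_relu else nn.PReLU())

def TransposeConv3x3BNReLU(in_ch, out_ch, stride=2, is_relu=True):
    # 3x3 transposed conv that doubles the spatial resolution at stride 2
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=3, stride=stride,
                           padding=1, output_padding=stride - 1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True) if is_relu else nn.PReLU())

PyTorch code: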

class RegularBottleneck(nn.Module):
    # Non-downsampling bottleneck: 1x1 reduce -> 3x3 (regular/asymmetric/dilated) -> 1x1 expand
    def __init__(self, in_places, places, stride=1, expansion=4, dilation=1,
                 is_relu=False, asymmetric=False, p=0.01):
        super(RegularBottleneck, self).__init__()
        mid_channels = in_places // expansion
        self.bottleneck = nn.Sequential(
            Conv1x1BNReLU(in_places, mid_channels, False),
            AsymmetricConv(mid_channels, 1, is_relu) if asymmetric else
            Conv3x3BNReLU(mid_channels, mid_channels, 1, dilation, is_relu),
            Conv1x1BNReLU(mid_channels, places, is_relu),
            nn.Dropout2d(p=p)  # spatial dropout as the regularizer
        )
        self.relu = nn.ReLU(inplace=True) if is_relu else nn.PReLU()

    def forward(self, x):
        residual = x  # identity shortcut on the side branch
        out = self.bottleneck(x)
        out += residual
        out = self.relu(out)
        return out


class DownBottleneck(nn.Module):
    # Downsampling bottleneck: strided 2x2 projection on the main branch;
    # max pooling (with saved indices) plus zero channel padding on the side branch
    def __init__(self, in_places, places, stride=2, expansion=4, is_relu=False, p=0.01):
        super(DownBottleneck, self).__init__()
        mid_channels = in_places // expansion
        self.bottleneck = nn.Sequential(
            Conv2x2BNReLU(in_places, mid_channels, is_relu),
            Conv3x3BNReLU(mid_channels, mid_channels, 1, 1, is_relu),
            Conv1x1BNReLU(mid_channels, places, is_relu),
            nn.Dropout2d(p=p)
        )
        # return_indices=True so the decoder can unpool with the same indices
        self.downsample = nn.MaxPool2d(3, stride=stride, padding=1, return_indices=True)
        self.relu = nn.ReLU(inplace=True) if is_relu else nn.PReLU()

    def forward(self, x):
        out = self.bottleneck(x)
        residual, indices = self.downsample(x)
        n, ch, h, w = out.size()
        ch_res = residual.size()[1]
        # Pad the pooled side branch with zero channels to match the main branch
        padding = torch.zeros(n, ch - ch_res, h, w, device=out.device, dtype=out.dtype)
        residual = torch.cat((residual, padding), 1)
        out += residual
        out = self.relu(out)
        return out, indices


class UpBottleneck(nn.Module):
    # Upsampling bottleneck: transposed conv on the main branch;
    # 1x1 projection plus max unpooling (with encoder indices) on the side branch
    def __init__(self, in_places, places, stride=2, expansion=4, is_relu=True, p=0.01):
        super(UpBottleneck, self).__init__()
        mid_channels = in_places // expansion
        self.bottleneck = nn.Sequential(
            Conv1x1BNReLU(in_places, mid_channels, is_relu),
            TransposeConv3x3BNReLU(mid_channels, mid_channels, stride, is_relu),
            Conv1x1BNReLU(mid_channels, places, is_relu),
            nn.Dropout2d(p=p)
        )
        self.upsample_conv = Conv1x1BN(in_places, places)
        self.upsample_unpool = nn.MaxUnpool2d(kernel_size=2)
        self.relu = nn.ReLU(inplace=True) if is_relu else nn.PReLU()

    def forward(self, x, indices):
        out = self.bottleneck(x)
        residual = self.upsample_conv(x)
        residual = self.upsample_unpool(residual, indices)
        out += residual
        out = self.relu(out)
        return out
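A small round-trip check of the pooling-indices plumbing between the downsampling and upsampling bottlenecks (channel and spatial sizes chosen for illustration):

down = DownBottleneck(16, 64)
up = UpBottleneck(64, 16)
x = torch.randn(1, 16, 256, 256)
y, indices = down(x)   # y: [1, 64, 128, 128], indices saved from max pooling
z = up(y, indices)     # z: [1, 16, 256, 256], unpooled with the saved indices
print(y.shape, z.shape)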

2-3 ENet architecture

Stage 1: encoder stage. It contains 5 bottlenecks: the first bottleneck performs downsampling, followed by 4 repeated regular bottlenecks.

Stages 2-3: encoder stage. In stage 2, bottleneck 2.0 performs downsampling, and the bottlenecks after it interleave dilated convolutions and asymmetric (factorized) convolutions. Stage 3 is identical except that it has no downsampling bottleneck.

Stages 4-5: decoder stage. These are comparatively simple: each consists of one upsampling bottleneck followed by one or two regular bottlenecks. The architecture uses no bias terms in any projection, which reduces kernel calls and memory operations, and Batch Norm is used with every convolution. The encoder downsamples with max pooling (saving the pooling indices) combined with channel padding; the decoder upsamples with max unpooling (reusing those indices) combined with transposed convolution.

PyTorch code:

class ENet(nn.Module):
    def __init__(self, num_classes):
        super(ENet, self).__init__()
        self.initialBlock = InitialBlock(3, 16)
        # Stage 1: one downsampling bottleneck + 4 regular bottlenecks
        self.stage1_1 = DownBottleneck(16, 64, 2)
        self.stage1_2 = nn.Sequential(
            RegularBottleneck(64, 64, 1),
            RegularBottleneck(64, 64, 1),
            RegularBottleneck(64, 64, 1),
            RegularBottleneck(64, 64, 1),
        )
        # Stage 2: downsampling bottleneck + regular/dilated/asymmetric mix
        self.stage2_1 = DownBottleneck(64, 128, 2)
        self.stage2_2 = nn.Sequential(
            RegularBottleneck(128, 128, 1),
            RegularBottleneck(128, 128, 1, dilation=2),
            RegularBottleneck(128, 128, 1, asymmetric=True),
            RegularBottleneck(128, 128, 1, dilation=4),
            RegularBottleneck(128, 128, 1),
            RegularBottleneck(128, 128, 1, dilation=8),
            RegularBottleneck(128, 128, 1, asymmetric=True),
            RegularBottleneck(128, 128, 1, dilation=16),
        )
        # Stage 3: same as stage 2 but without the downsampling bottleneck
        self.stage3 = nn.Sequential(
            RegularBottleneck(128, 128, 1),
            RegularBottleneck(128, 128, 1, dilation=2),
            RegularBottleneck(128, 128, 1, asymmetric=True),
            RegularBottleneck(128, 128, 1, dilation=4),
            RegularBottleneck(128, 128, 1),
            RegularBottleneck(128, 128, 1, dilation=8),
            RegularBottleneck(128, 128, 1, asymmetric=True),
            RegularBottleneck(128, 128, 1, dilation=16),
        )
        # Stage 4: upsampling bottleneck + 2 regular bottlenecks (decoder uses ReLU)
        self.stage4_1 = UpBottleneck(128, 64, 2, is_relu=True)
        self.stage4_2 = nn.Sequential(
            RegularBottleneck(64, 64, 1, is_relu=True),
            RegularBottleneck(64, 64, 1, is_relu=True),
        )
        # Stage 5: upsampling bottleneck + 1 regular bottleneck
        self.stage5_1 = UpBottleneck(64, 16, 2, is_relu=True)
        self.stage5_2 = RegularBottleneck(16, 16, 1, is_relu=True)
        # Final transposed conv restores the full input resolution
        self.final_conv = nn.ConvTranspose2d(in_channels=16, out_channels=num_classes,
                                             kernel_size=3, stride=2, padding=1,
                                             output_padding=1, bias=False)

    def forward(self, x):
        x = self.initialBlock(x)
        x, indices1 = self.stage1_1(x)   # keep indices for stage 5 unpooling
        x = self.stage1_2(x)
        x, indices2 = self.stage2_1(x)   # keep indices for stage 4 unpooling
        x = self.stage2_2(x)
        x = self.stage3(x)
        x = self.stage4_1(x, indices2)
        x = self.stage4_2(x)
        x = self.stage5_1(x, indices1)
        x = self.stage5_2(x)
        out = self.final_conv(x)
        return out
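A quick forward-pass check of the full network (batch size, input resolution, and class count are arbitrary here; 19 matches the Cityscapes label set):

model = ENet(num_classes=19)
x = torch.randn(1, 3, 512, 512)
print(model(x).shape)  # torch.Size([1, 19, 512, 512])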

3 Results

According to the paper, ENet runs up to 18× faster, requires 75× fewer FLOPs, and has 79× fewer parameters than existing models such as SegNet, while providing similar or better accuracy, which is what makes it practical on embedded hardware like the TX1.
