Attention mechanism paper: Non-Local neural networks and its PyTorch implementation

Reader submission 976 2022-08-31


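Non-Local neural networks PDF:

The Non-Local Neural Network is similar in spirit to the Non-Local Means denoising filter. An ordinary filter slides a small kernel, typically 3×3, over the whole image, so each output only sees 3×3 local information. Non-Local Means instead draws on a much larger search region and takes a weighted average over it.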
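To make the Non-Local Means idea concrete, here is a minimal sketch on a 1D signal rather than an image. The function name `non_local_means_1d` and the parameters `patch_radius`, `search_radius`, and `h` are illustrative assumptions, not taken from the paper or from any library. Each sample is replaced by a weighted mean of the samples in its search window, with weights given by how similar the surrounding patches are.

```python
import math


def non_local_means_1d(signal, patch_radius=1, search_radius=5, h=0.5):
    """Weighted mean over a search window; weights come from patch similarity."""
    n = len(signal)

    def patch(center):
        # Clamp indices at the borders so every patch has the same length.
        return [signal[min(max(center + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weights, values = [], []
        lo, hi = max(0, i - search_radius), min(n, i + search_radius + 1)
        for j in range(lo, hi):
            # Squared distance between the patches around i and j.
            dist2 = sum((a - b) ** 2 for a, b in zip(p_i, patch(j)))
            weights.append(math.exp(-dist2 / (h * h)))
            values.append(signal[j])
        total = sum(weights)  # always > 0 because j == i contributes weight 1
        out.append(sum(w * v for w, v in zip(weights, values)) / total)
    return out


noisy = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 0.95]
denoised = non_local_means_1d(noisy)
print([round(v, 3) for v in denoised])
```

Samples on the low plateau mostly average with each other, and likewise on the high plateau, because patches across the step look dissimilar and receive small weights.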

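1 Overview

Non-local operations capture long-range dependencies directly by computing the interaction between any two positions, instead of being limited to neighboring points. This is comparable to using a convolution kernel as large as the feature map itself, so more information is retained. The non-local block can be used as a component, combined with other network architectures, and applied to other vision tasks. Non-local networks achieve strong results on video classification.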

2 Non-local operation

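The non-local operation can be written as

$$y_i = \frac{1}{\mathcal{C}(x)} \sum_{\forall j} f(x_i, x_j)\, g(x_j)$$

where $i$ is the index of an output position, $j$ enumerates all possible positions, $x$ is the input signal, $y$ is the output (the same size as $x$), and $\mathcal{C}(x)$ is a normalization factor.

The function $g$ is a linear transform:

$$g(x_j) = W_g x_j$$

The function $f$ computes the similarity between positions $i$ and $j$. The paper lists four concrete choices.

Gaussian:

$$f(x_i, x_j) = e^{x_i^T x_j}$$

Embedded Gaussian:

$$f(x_i, x_j) = e^{\theta(x_i)^T \phi(x_j)}$$

Dot product:

$$f(x_i, x_j) = \theta(x_i)^T \phi(x_j)$$

Concatenation:

$$f(x_i, x_j) = \mathrm{ReLU}\left(w_f^T [\theta(x_i), \phi(x_j)]\right)$$

In summary, the four variants differ only in the choice of $f$ and of the normalization factor: the two Gaussian variants use $\mathcal{C}(x) = \sum_{\forall j} f(x_i, x_j)$, which makes the weighting a softmax over $j$, while the dot-product and concatenation variants use $\mathcal{C}(x) = N$, the number of positions in $x$.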
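The four similarity functions can be sketched in PyTorch as follows. This is a minimal illustration under assumptions: the features are flattened to an `[N, C]` matrix with one row per position, the embeddings are plain weight matrices (`w_theta`, `w_phi`, `w_g`, `w_f` are hypothetical names), and the helper names `pairwise_affinity` and `non_local` are not from the paper.

```python
import torch
import torch.nn.functional as F


def pairwise_affinity(x, w_theta, w_phi, mode, w_f=None):
    """Return f(x_i, x_j) / C(x) as an [N, N] matrix.

    x: [N, C] features, one row per position; w_theta, w_phi: [C, D].
    """
    n = x.shape[0]
    theta, phi = x @ w_theta, x @ w_phi  # [N, D] each
    if mode == "gaussian":
        # exp(x_i^T x_j) divided by its sum over j is a softmax over j.
        return F.softmax(x @ x.t(), dim=-1)
    if mode == "embedded_gaussian":
        return F.softmax(theta @ phi.t(), dim=-1)
    if mode == "dot_product":
        return (theta @ phi.t()) / n
    if mode == "concatenation":
        # w_f: [2D]. w_f^T [theta_i, phi_j] splits into two dot products.
        d = theta.shape[1]
        scores = (theta @ w_f[:d]).unsqueeze(1) + (phi @ w_f[d:]).unsqueeze(0)
        return F.relu(scores) / n
    raise ValueError(f"unknown mode: {mode}")


def non_local(x, w_theta, w_phi, w_g, mode, w_f=None):
    """y_i = (1 / C(x)) * sum_j f(x_i, x_j) g(x_j), with g(x_j) = W_g x_j."""
    return pairwise_affinity(x, w_theta, w_phi, mode, w_f) @ (x @ w_g)


torch.manual_seed(0)
N, C, D = 6, 8, 4
x = torch.randn(N, C)
w_theta, w_phi, w_g = torch.randn(C, D), torch.randn(C, D), torch.randn(C, D)
w_f = torch.randn(2 * D)
for mode in ["gaussian", "embedded_gaussian", "dot_product", "concatenation"]:
    y = non_local(x, w_theta, w_phi, w_g, mode, w_f)
    print(mode, tuple(y.shape))
```

In every variant the output keeps one row per position, and only the `[N, N]` affinity matrix changes.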

3 Non-local block

3-1 Abstract diagram

3-2 Detailed diagram

4 Ablations

```python
import torch
import torch.nn as nn


class NonLocalBlock(nn.Module):
    def __init__(self, channel):
        super(NonLocalBlock, self).__init__()
        self.inter_channel = channel // 2
        # 1x1 convolutions implementing the phi, theta and g embeddings
        self.conv_phi = nn.Conv2d(in_channels=channel, out_channels=self.inter_channel,
                                  kernel_size=1, stride=1, padding=0, bias=False)
        self.conv_theta = nn.Conv2d(in_channels=channel, out_channels=self.inter_channel,
                                    kernel_size=1, stride=1, padding=0, bias=False)
        self.conv_g = nn.Conv2d(in_channels=channel, out_channels=self.inter_channel,
                                kernel_size=1, stride=1, padding=0, bias=False)
        # Normalize over the last dimension (positions j), so that each row
        # of the attention matrix sums to 1
        self.softmax = nn.Softmax(dim=-1)
        # 1x1 convolution that restores the original channel count
        self.conv_mask = nn.Conv2d(in_channels=self.inter_channel, out_channels=channel,
                                   kernel_size=1, stride=1, padding=0, bias=False)

    def forward(self, x):
        # [N, C, H, W]
        b, c, h, w = x.size()
        # [N, C/2, H * W]; the embeddings have inter_channel channels, not c
        x_phi = self.conv_phi(x).view(b, self.inter_channel, -1)
        # [N, H * W, C/2]
        x_theta = self.conv_theta(x).view(b, self.inter_channel, -1).permute(0, 2, 1).contiguous()
        x_g = self.conv_g(x).view(b, self.inter_channel, -1).permute(0, 2, 1).contiguous()
        # [N, H * W, H * W]
        mul_theta_phi = torch.matmul(x_theta, x_phi)
        mul_theta_phi = self.softmax(mul_theta_phi)
        # [N, H * W, C/2]
        mul_theta_phi_g = torch.matmul(mul_theta_phi, x_g)
        # [N, C/2, H, W]
        mul_theta_phi_g = mul_theta_phi_g.permute(0, 2, 1).contiguous().view(b, self.inter_channel, h, w)
        # [N, C, H, W]
        mask = self.conv_mask(mul_theta_phi_g)
        # Residual connection
        out = mask + x
        return out


if __name__ == '__main__':
    model = NonLocalBlock(channel=16)
    print(model)

    x = torch.randn(1, 16, 64, 64)
    out = model(x)
    print(out.shape)
```
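One practical consequence of the `[N, H * W, H * W]` attention matrix in the code above is that its size grows quadratically with the number of positions. The following back-of-the-envelope sketch estimates it for the 64×64 demo input. The helper name `attention_matrix_cost` is hypothetical, and 4 bytes per element assumes float32 storage.

```python
def attention_matrix_cost(height, width, batch=1, bytes_per_element=4):
    """Size of the [batch, H*W, H*W] attention matrix (float32 assumed)."""
    positions = height * width
    elements = batch * positions * positions
    return positions, elements, elements * bytes_per_element


positions, elements, num_bytes = attention_matrix_cost(64, 64)
print(positions, elements, num_bytes / 2**20)  # 4096 16777216 64.0
```

So a single 64×64 feature map already needs about 64 MiB for the attention matrix alone, before counting what autograd keeps for the backward pass. Doubling the height and width multiplies this figure by 16.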
