The DeepLab v2 Abstract (Translation and Commentary) - 白红宇

Published: 2019-05-23

1. Translation of the original

In this work we address the task of semantic image segmentation with Deep Learning and make three main contributions that are experimentally shown to have substantial practical merit.

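In this work we focus on semantic image segmentation with deep learning and make three main contributions, each of which is shown by experiment to have substantial practical value.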

First, we highlight convolution with upsampled filters, or ‘atrous convolution’, as a powerful tool in dense prediction tasks.

First, we emphasize that convolution with upsampled filters, also called atrous convolution, is a powerful tool for dense prediction tasks.

Atrous convolution allows us to explicitly control the resolution at which feature responses are computed within Deep Convolutional Neural Networks.

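Atrous convolution lets us explicitly control the resolution at which feature responses are computed inside a deep convolutional neural network.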

It also allows us to effectively enlarge the field of view of filters to incorporate larger context without increasing the number of parameters or the amount of computation.

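Likewise, atrous convolution lets us effectively enlarge the filters' field of view so that they take in larger context, without increasing the number of parameters or the amount of computation. (I take "context" here to mean that a wider surrounding area is considered.)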
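To make the "larger field of view, same number of parameters" claim concrete, here is a minimal one-dimensional sketch in plain Python. It is my own illustration, not code from the paper: the function names `atrous_conv1d` and `effective_kernel_size` are made up for this example, and real networks apply the same idea to 2-D feature maps.

```python
def atrous_conv1d(signal, kernel, rate=1):
    """1-D atrous (dilated) convolution, 'valid' mode, kernel not flipped.

    y[i] = sum_k signal[i + rate * k] * kernel[k]
    """
    span = (len(kernel) - 1) * rate + 1  # input samples covered by one output
    out = []
    for i in range(len(signal) - span + 1):
        out.append(sum(signal[i + rate * k] * w for k, w in enumerate(kernel)))
    return out


def effective_kernel_size(kernel_size, rate):
    """Field of view of a kernel once rate-1 holes sit between its taps."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]  # always three weights, whatever the rate

dense = atrous_conv1d(signal, kernel, rate=1)
atrous = atrous_conv1d(signal, kernel, rate=2)

print(dense)   # [-2, -2, -2, -2, -2, -2, -2, -2]
print(atrous)  # [-4, -4, -4, -4, -4, -4]
print(effective_kernel_size(3, 1), effective_kernel_size(3, 2))  # 3 5
```

With rate 2 the same three weights compare samples four positions apart instead of two, and each output still costs three multiplications.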

Second, we propose atrous spatial pyramid pooling (ASPP) to robustly segment objects at multiple scales.

Second, we propose atrous spatial pyramid pooling (ASPP) so that objects are segmented robustly at multiple scales.

ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views,

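ASPP probes the incoming convolutional feature layer with filters that use several sampling rates, and therefore several effective fields of view,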

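thus capturing objects as well as image context at multiple scales.

Note that "object" here means a target to be recognized in the image, not an object in the programming sense.

so that both the target objects and the image context are captured at multiple scales.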
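The ASPP idea can be sketched in one dimension as several atrous branches run in parallel over the same input. This is only an illustration under my own assumptions: `atrous_conv1d_same` and `aspp_1d` are hypothetical names, the rates 1, 2 and 3 are toy values chosen to fit a nine-sample signal, and fusing the branches by summation is the choice made for this sketch.

```python
def atrous_conv1d_same(signal, kernel, rate):
    """Atrous convolution with zero padding so the output keeps the input length.

    Assumes an odd kernel length so the kernel can be centred on each sample.
    """
    half = (len(kernel) // 2) * rate
    padded = [0] * half + list(signal) + [0] * half
    return [
        sum(padded[i + rate * k] * w for k, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def aspp_1d(signal, kernels_by_rate):
    """Run one atrous branch per sampling rate, then sum the branches."""
    branches = [
        atrous_conv1d_same(signal, kernel, rate)
        for rate, kernel in kernels_by_rate.items()
    ]
    return [sum(values) for values in zip(*branches)]


impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
kernels = {1: [1, 1, 1], 2: [1, 1, 1], 3: [1, 1, 1]}  # toy rates and weights

fused = aspp_1d(impulse, kernels)
print(fused)  # [0, 1, 1, 1, 3, 1, 1, 1, 0]
```

Each branch sees the single impulse from a different distance, so the fused response spreads over a wider neighbourhood than any one branch with rate 1 would cover.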

Third, we improve the localization of object boundaries by combining methods from DCNNs and probabilistic graphical models. The commonly deployed combination of max-pooling and downsampling in DCNNs achieves invariance but has a toll on localization accuracy. We overcome this by combining the responses at the final DCNN layer with a fully connected Conditional Random Field (CRF), which is shown both qualitatively and quantitatively to improve localization performance. Our proposed “DeepLab” system sets the new state-of-art at the PASCAL VOC-2012 semantic image segmentation task, reaching 79.7% mIOU in the test set, and advances the results on three other datasets: PASCAL-Context, PASCAL-Person-Part, and Cityscapes. All of our code is made publicly available online.
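Since the abstract reports its result as mIOU (mean intersection over union), a small sketch of how that metric can be computed may help. This is a simplified version under my own assumptions: `mean_iou` is a hypothetical helper, labels are flat lists of class ids for a single image, and classes absent from both prediction and ground truth are skipped. Benchmark evaluation typically accumulates the counts over a whole dataset, which this sketch does not do.

```python
def mean_iou(pred, target, num_classes):
    """Average of per-class intersection-over-union for flat label lists."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes that never appear
            ious.append(inter / union)
    return sum(ious) / len(ious)


target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

# Per-class IoU here is 1/3, 2/3 and 1/2, so the mean is 0.5.
score = mean_iou(pred, target, num_classes=3)
print(round(score, 3))  # 0.5
```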

2. Commentary

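1. The paper highlights atrous convolution, that is, convolution with upsampled filters, as a very useful tool for dense prediction tasks. Atrous convolution gives explicit control over the resolution at which feature responses are computed in a deep neural network, and it enlarges the field of view without adding parameters or computation.

This is a little hard to grasp. Roughly, it says the technique works well for dense prediction tasks, and the reason is that it yields a larger field of view without introducing new parameters or extra computation.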

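2. It proposes atrous spatial pyramid pooling (ASPP), which makes multi-class segmentation more robust across multiple scales. ASPP extracts features from its input with different sampling rates and fields of view, so it captures objects and context information at several scales.

Roughly, we can understand this as using atrous convolutions to obtain information at different scales and then combining that information.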

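3. It uses a graphical model, a fully connected Conditional Random Field (CRF, also known as DenseCRF), to localize segmentation boundaries precisely.

In other words, a traditional probabilistic graphical model is applied at the end of the pipeline to handle the boundaries of the semantic segmentation effectively.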
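For reference, the energy that a fully connected CRF of this kind minimizes can be written as below. This is the form as I understand it from the DeepLab papers, so treat the notation as my own transcription: x is the label assignment, P(x_i) is the label probability the DCNN gives pixel i, p_i and I_i are the position and colour of pixel i, and the weights w_1, w_2 and bandwidths sigma are hyperparameters.

```latex
E(\mathbf{x}) = \sum_i \theta_i(x_i) + \sum_{i<j} \theta_{ij}(x_i, x_j),
\qquad
\theta_i(x_i) = -\log P(x_i)

\theta_{ij}(x_i, x_j) = \mu(x_i, x_j)
\left[
  w_1 \exp\!\left(
    -\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\alpha^2}
    -\frac{\lVert I_i - I_j \rVert^2}{2\sigma_\beta^2}
  \right)
  + w_2 \exp\!\left(
    -\frac{\lVert p_i - p_j \rVert^2}{2\sigma_\gamma^2}
  \right)
\right],
\qquad
\mu(x_i, x_j) =
\begin{cases}
  1 & x_i \neq x_j \\
  0 & x_i = x_j
\end{cases}
```

The first kernel penalizes giving different labels to pixels that are both close together and similar in colour, which is what pulls the segmentation boundary toward image edges; the second kernel only enforces smoothness between nearby pixels.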

Reposted from: http://qkywi.baihongyu.com/
