【ACM MM2021】Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

时间：2022-10-03 12:01:40浏览次数：54

标签：Interaction Network level Discrepant depth RGB interaction modality

【MM2021】Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

代码： https:// rmcong.github.io/proj_CDINet.html

1、研究动机

这是来自北京交通大学丛润民老师组的工作，解决的问题是 RGB-D 显著性物体检测。当前的RGB-D 分割工作，主要分为两个类别：(1) Unidirectional interaction mode shown in Figure 1(a), which uses the depth cues as auxiliary information to supplement the RGB branch. (2) Undifferentiated bidirectional interaction shown in Figure 1(b), which treats RGB and depth cues equally to achieve cross-modality interaction.

请添加图片描述

作者提出一个问题： since these two modalities have their own strong points, can we design a discrepant interaction mode for RGB-D SOD based on their roles to make full use of the advantages of both？

作者指出，和之前方法观点不同，we believe that the interaction of the two modalities information should be carried out in a separate and discrepant manner. The low-level RGB features can help the depth features to distinguish different object instances at the same depth level, while the high-level depth features can further enrich the RGB semantics and suppress background interference. Therefore, a perfect RGB-D SOD model should give full play to the advantages of each modality, and simultaneously utilize another modality to make up for itself to avoid causing interference. 框架如上图(c)所示。

2、 Cross-modality Discrepant Interaction Network (CDINet)

CDINet 框架如下图所示，encoder有五层，前两层用RGB信息增强 depth，后面三层用 depth 增强RGB。包括三个核心部分：RGB-induced detail enhancemnet （RDE），Depth-induced semantic enhancement （DSE），Dense decoding reconstruction（DDR）.

请添加图片描述

**1）RGB-induced detail enhancemnet. ** 在低级编码器特征中，深度特征包含更详细信息（例如边界和形状），可以提供比 RGB 特征更好的细节表示，这有利于初始特征学习。因此，在网络浅层设计RDE模块，通过低层中的 RGB 特征来增强 depth。该模块如下图所示，本质上是对 depth做注意力，不过在初始输入上将RGB直接与 depth 特征拼接。

2）Depth-induced semantic enhancement. 在网络的深层，网络要学习更多的语义信息，RGB里的语义信息比depth更加全面。作者设计了两种交互模式实现特征融合：attention level 和 feature level。attention level 比较容易理解，就是得到每一个位置的权重。feature level 使用了一些其它论文里的attention方法，叫做 cascaded attention，最后与其它分支结果相加。

3）Dense decoding reconstruction. 因为这个任务本质上是做逐像素的预测，encoder部分会获得高层语义信息，但容易丢失空间细节。作者设计了一种密集的结构，以更加全面的引入之前的信息。

3、实验分析

论文有趣的实验是，作者把 encoder 中的5层，都用 depth 增强 RGB，称为 NO.2 。然后，研究把两种模态双向交互，称为 NO.3。除个别指标外，性能不如作者所提出的 CDINet。同时，NO.3 的计算量、网张参数量均有显著增加，也证明了作者所提模型的有效性。

标签：Interaction,Network,level,Discrepant,depth,RGB,interaction,modality
From： https://www.cnblogs.com/gaopursuit/p/16750253.html

Axiomatic Attribution for Deep Networks解读
摘要这篇文章主要研究的是如何将模型的预测归因到模型的输入上这个问题。作者提出了两条归因方法应该满足的基本定理，敏感性（Sensitivity）和实现不变性（ImplementationIn......
Networkx的入门使用
创建一个图创建一个没有边edge和节点node的空图：importnetworkxasnxG=nx.Graph()定义：图形是节点和已识别的节点对的集合，即顶点和边（链接）的集合。在networkx中，节点......
A Unified View on Graph Neural Networks as Graph Signal Denoising
目录概符号说明FeaturetransformationandaggregationGCNGATPPNPGraphSignalDenoisingGCNPPNPGATUGNN代码MaY.,LiuX.,ZhaoT.,LiuY.,TangJ.andShahN.Au......
study the docker network of macvlan
Introduce:在Macvlan出现之前，我们只能为一块以太网卡添加多个IP地址，却不能添加多个MAC地址，因为MAC地址正是通过其全球唯一性来标识一块以太网卡的，即便你使用了创......
配置本地yum源，安装network-scripts
(1).新建配置文件/etc/yum.repos.d/dvd/repo (2).挂载iOS映像文件（保证/Media文件存在） (3).清理缓存并建立元数据缓存dnfcleanalldnfmakecache （4）.查......
Wallys//Industrial_Wireless_AP//network,card//QCN6024/QCN9074/QCN9024,WiFi,Card
QCN9074WiFiCardIPQ6010,802.11ax,2x22.4G&5G,SupportOpenWRT//QCN9074802.11ax4x4MU-MIMO6GHzwifi6EQCN9074 11ax 4x46G M.2 1.DR9074-6E(PN02......
Dynamic Network Embedding by Modeling Triadic Closure Process
目录概符号说明Motivation本文方法代码ZhouL.,YangY.,RenX.,WuF.andZhuangY.Dynamicnetworkembeddingbymodelingtriadicclosureprocess.InAAAIConf......
SCPNet: Spatial-Channel Parallelism Network for Joint Holistic and Partial Perso
摘要：完整图像的行人重识别在过去几年里获得了广泛研究并取得了长足的进步。然而，在现实场景中，行人往往会被物体或他人遮挡，造成partial的行人识别变得困难。本文提出了空间维......
BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Re
BBN:Bilateral-BranchNetworkwithCumulativeLearningforLong-TailedVisualRecognition论文详细解读简介在现实生活中的影像辨识的任务中大多的资料分布都不......
【2022】【Reinforcement learning in urban network traffic signal control: A syst
本篇综述主要介绍两个或多个路口路网的基于强化学习的交通信号灯控制，覆盖了1994年至2022年来自20个国家的160多篇文章。具体内容有：AreviewonReinforcementLearning......

【ACM MM2021】Cross-modality Discrepant Interaction Network for RGB-D Salient Object Detection

1、研究动机

2、 Cross-modality Discrepant Interaction Network (CDINet)

3、实验分析

相关文章

赞助商

阅读排行