CVPR Workshops 2017
code:
- https://github.com/limbee/NTIRE2017/tree/master
- https://github.com/sanghyun-son/EDSR-PyTorch
Contents
- 1 Background and Motivation
- 2 Related Work
- 3 Advantages / Contributions
- 4 Method
- 5 Experiments
- 6 Conclusion (own)
1 Background and Motivation
single image super-resolution (SISR) aims to reconstruct a high-resolution image $I^{SR}$ from a single low-resolution image $I^{LR}$
The relationship between $I^{LR}$ and $I^{SR}$ varies with the application scenario, e.g. bicubic downsampling, blur, decimation, or noise
Existing methods either have unstable network architecture designs, or treat super-resolution at different scale factors as independent problems
The authors design a single-scale SR model, the enhanced deep super-resolution network (EDSR), and a multi-scale deep super-resolution system (MDSR)
2 Related Work
learn mapping functions between $I^{LR}$ and $I^{HR}$ image pairs
learning methods from neighbor embedding to sparse coding
The first work to apply a DCNN to SR: 《Learning a deep convolutional network for image super-resolution》 (ECCV-2014, SRCNN)
3 Advantages / Contributions
Proposes the single-scale SR model EDSR (with an improved residual block) and the multi-scale SR model MDSR (a new architecture)
Our proposed single-scale and multi-scale models have achieved the top ranks in both the standard benchmark datasets and the DIV2K dataset.
4 Method
4.1 Residual blocks
applying ResNet architecture directly to low-level vision problems like super-resolution can be suboptimal.
Following SRResNet, the authors remove the ReLU after the skip-connection addition; building on that, they also remove the BN layers, for the reason below:
since BN layers normalize the features, they get rid of range flexibility from the network; it is better to remove them
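A minimal PyTorch sketch of this modified block (conv-ReLU-conv with an identity skip, no BN, and no activation after the addition); `ResBlock` and `n_feats` are illustrative names, not the official code:

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """EDSR-style residual block: two 3x3 convs with one ReLU in
    between, no BN, and no ReLU after the skip-connection addition."""
    def __init__(self, n_feats=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(n_feats, n_feats, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # the output is not re-activated
```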
4.2 Single-scale model
The upsampling tail is ×2, ×3, or ×4, depending on the target task
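The released PyTorch code implements this tail with sub-pixel convolution (PixelShuffle); a hedged sketch of that pattern, with `make_upsampler` as an illustrative name:

```python
import torch.nn as nn

def make_upsampler(scale, n_feats=64):
    """Sub-pixel (PixelShuffle) upsampling: one stage for x2/x3,
    two stacked x2 stages for x4."""
    layers = []
    if scale in (2, 3):
        layers += [nn.Conv2d(n_feats, n_feats * scale * scale, 3, padding=1),
                   nn.PixelShuffle(scale)]
    elif scale == 4:
        for _ in range(2):  # x4 built as two consecutive x2 stages
            layers += [nn.Conv2d(n_feats, n_feats * 4, 3, padding=1),
                       nn.PixelShuffle(2)]
    return nn.Sequential(*layers)
```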
A network of depth $B$ and width $F$ occupies roughly $O(BF)$ memory with $O(BF^2)$ parameters
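For intuition: a 3×3 convolution between $F$-channel feature maps has $9F^2$ weights, so a body of $B$ residual blocks (two such convs each) holds roughly $18BF^2$ parameters, while the activations that must be kept in memory scale as $BF$ times the spatial size. Widening is therefore the parameter-hungry direction.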
Increasing the width significantly raises model capacity, but Inception-v4 observed that once the number of filters (the width) exceeds about 1000, the network slowly dies off over training iterations (the layer before the average pooling starts outputting zeros). The authors ran into the same problem when widening their network. The fix is Scaling of the Residuals, from 【Inception-v4】《Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning》, with a factor of 0.1: each residual branch output is multiplied by 0.1 before being added back to the identity path.
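A sketch of residual scaling on top of the `ResBlock` above (the 0.1 factor is from the paper; the class name is illustrative):

```python
class ScaledResBlock(ResBlock):
    """Residual scaling: damp the residual branch by a small constant
    (0.1 in EDSR) before the addition, stabilizing training at large F."""
    def __init__(self, n_feats=256, res_scale=0.1):
        super().__init__(n_feats)
        self.res_scale = res_scale

    def forward(self, x):
        return x + self.body(x) * self.res_scale
```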
The authors use the trained ×2 model to initialize (pre-train) the ×4 model, which works better than training ×4 from scratch
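A hypothetical sketch of that warm start; the `EDSR` constructor and checkpoint path are illustrative, not from the repo. The ×2 weights cover the shared layers, and `strict=False` skips the scale-specific tail whose keys do not match:

```python
import torch

x2_state = torch.load('edsr_x2.pt')   # assumed checkpoint of the x2 model
x4_model = EDSR(scale=4)              # hypothetical constructor
# load everything that matches by name; the x4 upsampling tail keeps
# its fresh initialization because its keys differ
x4_model.load_state_dict(x2_state, strict=False)
```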
4.3 Multi-scale model
One open question is whether producing multi-scale outputs directly on top of the single-scale model still works well;
keeping one separate model per scale does feel somewhat redundant
During training, only the modules corresponding to the current scale are trained and the rest are frozen; e.g., when training ×2 SR, the scale-specific modules for ×3 and ×4 are frozen
construct the minibatch with a randomly selected scale among ×2, ×3 and ×4.
Only the modules that correspond to the selected scale are enabled and updated
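A hedged sketch of this scheme, reusing the `ResBlock` and `make_upsampler` sketches above; the branch layout is simplified (in the paper, each scale-specific pre-processing branch uses two 5×5 residual blocks):

```python
import random
import torch.nn as nn

class MDSR(nn.Module):
    """Sketch: shared head/body plus scale-specific pre-processing
    and upsampling branches, one per scale."""
    def __init__(self, scales=(2, 3, 4), n_feats=64, n_blocks=16):
        super().__init__()
        self.head = nn.Conv2d(3, n_feats, 3, padding=1)
        self.pre = nn.ModuleDict({str(s): ResBlock(n_feats) for s in scales})
        self.body = nn.Sequential(*[ResBlock(n_feats) for _ in range(n_blocks)])
        self.up = nn.ModuleDict({str(s): make_upsampler(s, n_feats) for s in scales})
        self.tail = nn.Conv2d(n_feats, 3, 3, padding=1)

    def forward(self, x, scale):
        x = self.pre[str(scale)](self.head(x))
        x = x + self.body(x)                 # global skip around the shared body
        return self.tail(self.up[str(scale)](x))

# one randomly chosen scale per minibatch; branches that are not used
# are absent from the graph, so only the shared parts and the selected
# branch receive gradient updates
scale = random.choice([2, 3, 4])
# sr = model(lr_batch, scale)
```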
The Baseline models are deliberately small, MDSR is medium-sized, and EDSR is designed to be large: per the paper, EDSR uses B=32 residual blocks with F=256 filters (about 43M parameters), while MDSR uses B=80 blocks with F=64 (about 8M).
A single MDSR network is certainly not small, but compared with three separate EDSR-sized single-scale models it saves a substantial number of parameters
5 Experiments
we use RGB input patches of size 48×48 from the LR images, together with the corresponding HR patches.
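A sketch of sampling one such aligned pair, assuming HxWxC numpy arrays and an integer scale factor; `random_patch_pair` is an illustrative name:

```python
import random

def random_patch_pair(lr, hr, scale, lr_size=48):
    """Randomly crop an aligned (LR, HR) training pair: a 48x48 patch
    from the LR image and the matching (48*scale)^2 HR patch."""
    h, w = lr.shape[:2]
    x = random.randrange(0, w - lr_size + 1)
    y = random.randrange(0, h - lr_size + 1)
    lr_patch = lr[y:y + lr_size, x:x + lr_size]
    hr_patch = hr[y * scale:(y + lr_size) * scale,
                  x * scale:(x + lr_size) * scale]
    return lr_patch, hr_patch
```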
5.1 Datasets and Metrics
Datasets
- DIV2K: 2K resolution
- Set5
- Set14
- B100
- Urban100
- NTIRE 2017 Super-Resolution Challenge
Metrics
- peak signal-to-noise ratio (PSNR)
- SSIM
5.2 Geometric Self-ensemble
This is essentially TTA (test-time augmentation)
The authors use flips and rotations to generate 7 augmented versions of the input, giving 8 inputs including the original; after inference, they apply the inverse transform to each output image and average the 8 results.
E.g., for the input rotated 90° clockwise, the SR output is rotated 90° counterclockwise to map it back
The self-ensemble result averages the inverse-transformed outputs, $I^{SR}_n = \frac{1}{8}\sum_{i=1}^{8} \tilde{I}^{SR}_{n,i}$, where $n$ is the index of the input image and $i$ is the index of the transformation
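A PyTorch sketch of the eight-way ensemble (the 8 variants are the flips and 90° rotations of the input, i.e. the symmetries of a square); the function name is illustrative:

```python
import torch

def geometric_self_ensemble(model, lr):
    """Run the model on all 8 flip/rotation variants of lr (N,C,H,W),
    undo each transform on the output, and average the results."""
    outputs = []
    for flip in (False, True):
        x = torch.flip(lr, dims=[3]) if flip else lr
        for k in range(4):                          # rotate by k * 90 degrees
            y = model(torch.rot90(x, k, dims=[2, 3]))
            y = torch.rot90(y, -k, dims=[2, 3])     # inverse rotation
            if flip:
                y = torch.flip(y, dims=[3])         # inverse flip
            outputs.append(y)
    return torch.stack(outputs).mean(dim=0)
```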
LR = low resolution
SR = super resolution
In the paper, a model evaluated with geometric self-ensemble is marked with a '+' after its name
5.3 Evaluation on DIV2K Dataset
The authors replace the L2 loss with the L1 loss; comparing the first and second result columns, L1 performs better (the swap is sketched below)
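In PyTorch the swap is a one-liner; a minimal sketch:

```python
import torch.nn as nn

criterion = nn.L1Loss()   # L1 instead of the conventional L2/MSE loss
# loss = criterion(model(lr_patch), hr_patch)
```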
Comparing accuracy while ignoring parameter count is not entirely fair, but the authors are upfront about the motivation: this work is initially proposed for the purpose of participating in the NTIRE2017 Super-Resolution Challenge, i.e. it is essentially a technical report
5.4 Benchmark Results
The benchmark comparisons indeed look solid
5.5 NTIRE2017 SR Challenge
6 Conclusion (own)
- applying ResNet architecture directly to low-level vision problems like super-resolution can be suboptimal.
- geometric self-ensemble is valid only for symmetric downsampling methods such as bicubic downsampling