NMS算法的大致过程可以看原文这段话:
-
First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant overlap (using a pre-defined threshold) with M are suppressed. This process is recursively applied on the remaining boxes.
-
那么传统的NMS算法存在什么问题呢?可以看Figure1。在Fiugre1中,检测算法本来应该输出两个框,但是传统的NMS算法可能会把score较低的绿框过滤掉(如果绿框和红框的IOU大于设定的阈值就会被过滤掉),导致只检测出一个object(一个马),显然这样object的recall就比较低了。
-
可以看出NMS算法是略显粗暴(hard),因为NMS直接将和得分最大的box的IOU大于某个阈值的box的得分置零,那么有没有soft一点的方法呢?这就是本文提出Soft NMS。那么Soft-NMS算法到底是什么样呢?简单讲就是:An algorithm which decays the detection scores of all other objects as a continuous function of their overlap with M. 换句话说就是用稍低一点的分数来代替原有的分数,而不是直接置零。另外由于Soft NMS可以很方便地引入到object detection算法中,不需要重新训练原有的模型,因此这是该算法的一大优点。
-
Figure2是Soft NMS算法的伪代码。首先是关于三个输入B、S、Nt,在FIgure2中已经介绍很清楚了。D集合用来放最终的box,在boxes集合B非空的前提下,搜索score集合S中数值最大的数,假设其下标为m,那么bm(也是M)就是对应的box。然后将M和D集合合并,并从B集合中去除M。再循环集合B中的每个box,这个时候就有差别了,如果是传统的NMS操作,那么当B中的box bi和M的IOU值大于阈值Nt,那么就从B和S中去除该box;如果是Soft NMS,则对于B中的box bi也是先计算其和M的IOU,然后该IOU值作为函数f()的输入,最后和box bi的score si相乘作为最后该box bi的score。就是这么简单!
————————————————
版权声明:本文为CSDN博主「AI之路」的原创文章,遵循CC 4.0 BY-SA版权协议,转载请附上原文出处链接及本声明。
原文链接:javascript:void(0)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Wed Jun 16 17:16:31 2021
@author: ledi
"""
# -*- coding:utf-8 -*-
# Author:Richard Fang
"""
This is a Python version used to implement the Soft NMS algorithm.
Original Paper:Improving Object Detection With One Line of Code
"""
import numpy as np
import tensorflow as tf
import tensorflow as K
import time
def py_cpu_softnms(dets, sc, Nt=0.3, sigma=0.5, thresh=0.001, method=2):
"""
py_cpu_softnms
:param dets: boexs 坐标矩阵 format [y1, x1, y2, x2]
:param sc: 每个 boxes 对应的分数
:param Nt: iou 交叠门限
:param sigma: 使用 gaussian 函数的方差
:param thresh: 最后的分数门限
:param method: 使用的方法
:return: 留下的 boxes 的 index
"""
# indexes concatenate boxes with the last column
N = dets.shape[0] # 5
indexes = np.array([np.arange(N)]) # array([[0, 1, 2, 3, 4, 5]])
'''
dets=array([[200., 200., 400., 400., 0.],
[220., 220., 420., 420., 1.],
[200., 240., 400., 440., 2.],
[240., 200., 440., 400., 3.],
[ 1., 1., 2., 2., 4.]])
'''
dets = np.concatenate((dets, indexes.T), axis=1) #
# the order of boxes coordinate is [y1,x1,y2,x2]
y1 = dets[:, 0]
x1 = dets[:, 1]
y2 = dets[:, 2]
x2 = dets[:, 3]
scores = sc # array([0.9, 0.5, 0. , 0. , 0. ], dtype=float32)
areas = (x2 - x1 + 1) * (y2 - y1 + 1)
for i in range(N):
# intermediate parameters for later parameters exchange
#当前变量
tBD = dets[i, :].copy()
tscore = scores[i].copy()
tarea = areas[i].copy()
pos = i + 1
#判断是不是最后一个box
if i != N-1:
#除去第一个box 的score 最大score 和其相应的index(位置)
maxscore = np.max(scores[pos:], axis=0)
maxpos = np.argmax(scores[pos:], axis=0)
else:
maxscore = scores[-1]
maxpos = 0
#比较当前的score 和余下的score的最大值
#如果当前的score 比余下的score的最大值小,就交换对应的(x1,y1,x2,y2),score和area
#总之只有一点就是把score 最大的那个box 放在第一个
if tscore < maxscore:
#将首位的box 换成,score最大的那个box
#交换坐标 (x1,y1,x2,y2)
dets[i, :] = dets[maxpos + i + 1, :]
dets[maxpos + i + 1, :] = tBD
tBD = dets[i, :]
#交换score
scores[i] = scores[maxpos + i + 1]
scores[maxpos + i + 1] = tscore
tscore = scores[i]
#交换面积
areas[i] = areas[maxpos + i + 1]
areas[maxpos + i + 1] = tarea
tarea = areas[i]
# IoU calculate
#分别比较当前的box的某个坐标与其他坐标的最大与最小值,
#为计算交并比做准备
xx1 = np.maximum(dets[i, 1], dets[pos:, 1])
yy1 = np.maximum(dets[i, 0], dets[pos:, 0])
xx2 = np.minimum(dets[i, 3], dets[pos:, 3])
yy2 = np.minimum(dets[i, 2], dets[pos:, 2])
w = np.maximum(0.0, xx2 - xx1 + 1)
h = np.maximum(0.0, yy2 - yy1 + 1)
inter = w * h
#交并比
ovr = inter / (areas[i] + areas[pos:] - inter)
# print('ovr=',ovr)
#传统的NMS算法与soft-NMS算法的比较差别就是weight 的计算方式不同,
#NMS算法比较粗暴,直接 kill 掉 交并比大于阈值的box
#而soft-NMS则将大于阈值的box降低一定的权重
# Three methods: 1.linear 2.gaussian 3.original NMS
if method == 1: # linear
weight = np.ones(ovr.shape)
weight[ovr > Nt] = weight[ovr > Nt] - ovr[ovr > Nt]
elif method == 2: # gaussian
weight = np.exp(-(ovr * ovr) / sigma)
else: # original NMS
weight = np.ones(ovr.shape)
weight[ovr > Nt] = 0
scores[pos:] = weight * scores[pos:]
print('weight=',weight)
# print('scores=',scores)
# select the boxes and keep the corresponding indexes
inds = dets[:, 4][scores > thresh]
keep = inds.astype(int)
return keep
def speed():
boxes =1000* np.random.rand(1000, 100, 4)
boxscores = np.random.rand(1000, 100)
start = time.time()
for i in range(1000):
py_cpu_softnms(boxes[i], boxscores[i], method=2)
end = time.time()
print("Average run time: %f ms" % (end-start))
def test():
# boxes and scores
boxes = np.array([[200, 200, 400, 400], [220, 220, 420, 420], [200, 240, 400, 440], [240, 200, 440, 400], [1, 1, 2, 2]], dtype=np.float32)
boxscores = np.array([0.9, 0.8, 0.7, 0.6, 0.5], dtype=np.float32)
# tf.image.non_max_suppression 中 boxes 是 [y1,x1,y2,x2] 排序的。
index = py_cpu_softnms(boxes, boxscores, method=2)
selected_boxes = boxes[index]
return selected_boxes
if __name__ == '__main__':
test()
# speed()