# Neural networks - IoU & NMS

### IoU

IoU (Intersection over Union), also called the overlap ratio, measures how much two bounding boxes overlap.

It is the area of the intersection of the two boxes divided by the area of their union. The code implementation is as follows:

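```python
import numpy as np

# one prediction, one ground truth; boxes are [xmin, ymin, xmax, ymax]
def IoU(pred_box, gt_box):
    # corners of the intersection rectangle
    ixmin = max(pred_box[0], gt_box[0])
    iymin = max(pred_box[1], gt_box[1])
    ixmax = min(pred_box[2], gt_box[2])
    iymax = min(pred_box[3], gt_box[3])
    # + 1 treats coordinates as inclusive pixel indices; clamp to 0 when the boxes do not overlap
    inter_w = np.maximum(ixmax - ixmin + 1., 0)
    inter_h = np.maximum(iymax - iymin + 1., 0)

    inters = inter_w * inter_h

    # union = area(pred) + area(gt) - intersection
    uni = ((pred_box[2] - pred_box[0] + 1.) * (pred_box[3] - pred_box[1] + 1.) +
           (gt_box[2] - gt_box[0] + 1.) * (gt_box[3] - gt_box[1] + 1.) - inters)

    ious = inters / uni

    return ious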

# multiple predictions (N x 4 array), one ground truth
def maxIoU(pred_box, gt_box):
    ixmin = np.maximum(pred_box[:, 0], gt_box[0])
    iymin = np.maximum(pred_box[:, 1], gt_box[1])
    ixmax = np.minimum(pred_box[:, 2], gt_box[2])
    iymax = np.minimum(pred_box[:, 3], gt_box[3])
    inters_w = np.maximum(ixmax - ixmin + 1., 0)  # element-wise maximum (broadcast against 0) clamps negative widths
    inters_h = np.maximum(iymax - iymin + 1., 0)  # element-wise maximum (broadcast against 0) clamps negative heights

    inters = inters_w * inters_h

    uni = ((pred_box[:, 2] - pred_box[:, 0] + 1.) * (pred_box[:, 3] - pred_box[:, 1] + 1.) +
           (gt_box[2] - gt_box[0] + 1.) * (gt_box[3] - gt_box[1] + 1.) - inters)

    ious = inters / uni
    iou = np.max(ious)        # best IoU over all predictions
    iou_id = np.argmax(ious)  # index of the prediction that achieves it

    return iou, iou_id

# multiple predictions (N x 4 array), multiple ground truths
def box_IoU(pred_box, gt_boxes):
    result = []
    for gt_box in gt_boxes:
        ixmin = np.maximum(pred_box[:, 0], gt_box[0])
        iymin = np.maximum(pred_box[:, 1], gt_box[1])
        ixmax = np.minimum(pred_box[:, 2], gt_box[2])
        iymax = np.minimum(pred_box[:, 3], gt_box[3])
        inters_w = np.maximum(ixmax - ixmin + 1., 0)  # element-wise maximum (broadcast against 0) clamps negative widths
        inters_h = np.maximum(iymax - iymin + 1., 0)  # element-wise maximum (broadcast against 0) clamps negative heights

        inters = inters_w * inters_h

        uni = ((pred_box[:, 2] - pred_box[:, 0] + 1.) * (pred_box[:, 3] - pred_box[:, 1] + 1.) +
               (gt_box[2] - gt_box[0] + 1.) * (gt_box[3] - gt_box[1] + 1.) - inters)

        ious = inters / uni
        iou = np.max(ious)
        iou_id = np.argmax(ious)

        # one [best IoU, index of best prediction] pair per ground-truth box
        result.append([iou, iou_id])
    return result
```
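
The per-ground-truth loop can also be written as a single broadcast that returns the full IoU matrix. The following is a minimal sketch rather than part of the original code: the name `pairwise_iou` is hypothetical, and it assumes the same `[xmin, ymin, xmax, ymax]` layout and inclusive-pixel `+ 1` convention as the functions above.

```python
import numpy as np

def pairwise_iou(pred_boxes, gt_boxes):
    """Return an (N, M) IoU matrix for N predictions and M ground truths (hypothetical helper)."""
    pred = np.asarray(pred_boxes, dtype=float)[:, None, :]  # shape (N, 1, 4)
    gt = np.asarray(gt_boxes, dtype=float)[None, :, :]      # shape (1, M, 4)

    # Intersection rectangle for every (prediction, ground truth) pair
    ixmin = np.maximum(pred[..., 0], gt[..., 0])
    iymin = np.maximum(pred[..., 1], gt[..., 1])
    ixmax = np.minimum(pred[..., 2], gt[..., 2])
    iymax = np.minimum(pred[..., 3], gt[..., 3])
    inter = (np.maximum(ixmax - ixmin + 1.0, 0.0) *
             np.maximum(iymax - iymin + 1.0, 0.0))

    area_pred = (pred[..., 2] - pred[..., 0] + 1.0) * (pred[..., 3] - pred[..., 1] + 1.0)
    area_gt = (gt[..., 2] - gt[..., 0] + 1.0) * (gt[..., 3] - gt[..., 1] + 1.0)
    return inter / (area_pred + area_gt - inter)

preds = np.array([[0, 0, 9, 9], [5, 5, 14, 14], [20, 20, 29, 29]])
gts = np.array([[0, 0, 9, 9], [5, 0, 14, 9]])
iou = pairwise_iou(preds, gts)          # shape (3, 2)
best_pred_per_gt = iou.argmax(axis=0)   # index of the best prediction for each ground truth
```

Here `iou[0, 0]` is 1 (identical boxes), `iou[0, 1]` is 50 / 150 = 1/3, and the third prediction has IoU 0 with both ground truths because it does not overlap them.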

Some loss functions related to IoU:

Introduction to object detection regression loss functions: SmoothL1/IoU/GIoU/DIoU/CIoU Loss - Zhihu

Collection | Summary of object detection regression loss functions
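
To illustrate how such losses build on IoU, here is a minimal sketch of GIoU, assuming its usual definition: GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the smallest box enclosing both A and B, with the loss taken as 1 - GIoU. The function name `giou` is hypothetical. Unlike the code above, this sketch treats coordinates as continuous (no `+ 1`) and assumes both boxes have positive area.

```python
def giou(box_a, box_b):
    """GIoU of two [xmin, ymin, xmax, ymax] boxes (hypothetical helper, continuous coordinates)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # Penalise the part of the enclosing box not covered by the union
    return iou - (c_area - union) / c_area

same = giou([0, 0, 2, 2], [0, 0, 2, 2])    # 1.0: identical boxes
apart = giou([0, 0, 1, 1], [2, 0, 3, 1])   # -1/3: IoU is 0, but GIoU still reflects the gap
giou_loss = 1.0 - apart
```

The second case shows the motivation: plain IoU is 0 for any pair of disjoint boxes, while GIoU still varies with how far apart they are.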

Problems arising from data annotation: the probability distribution of the bounding box, and the uncertainty of the bounding box predicted by the model.

Understanding the probability distribution of object detection bounding boxes - Zhihu

Modeling the bounding box as a Gaussian distribution:

Wuhan University proposes NWD: a new paradigm for small object detection that abandons IoU-based metrics for a large accuracy gain (reaching SOTA)
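
As a rough sketch of the idea, assuming the NWD formulation as it is usually stated: a box with center (cx, cy), width w and height h is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4). The squared 2-Wasserstein distance between two such Gaussians reduces to the squared Euclidean distance between the vectors [cx, cy, w/2, h/2], and NWD = exp(-sqrt(W²) / C). The function name `nwd` and the parameter name `c` are hypothetical; C is a dataset-dependent constant whose value is not given here.

```python
import math

def nwd(box_a, box_b, c):
    """Normalized Wasserstein distance between two [xmin, ymin, xmax, ymax] boxes.

    Sketch only: `c` is a dataset-dependent normalisation constant (assumed, not from the source).
    """
    def to_gaussian_vec(box):
        xmin, ymin, xmax, ymax = box
        # [cx, cy, w/2, h/2]: mean and per-axis standard deviation of the box Gaussian
        return ((xmin + xmax) / 2.0, (ymin + ymax) / 2.0,
                (xmax - xmin) / 2.0, (ymax - ymin) / 2.0)

    va, vb = to_gaussian_vec(box_a), to_gaussian_vec(box_b)
    w2_squared = sum((p - q) ** 2 for p, q in zip(va, vb))
    return math.exp(-math.sqrt(w2_squared) / c)

identical = nwd([0, 0, 4, 4], [0, 0, 4, 4], c=6.0)   # 1.0
disjoint = nwd([0, 0, 4, 4], [6, 0, 10, 4], c=6.0)   # exp(-1): centers are 6 apart, sizes match
```

The two disjoint boxes have an IoU of 0, but their NWD is still a smooth, non-zero similarity, which is the property that matters for very small objects.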

### NMS

NMS (non-maximum suppression). Many of the detections output by a detection model are redundant, i.e. repeated predictions of the same object, so NMS is used to suppress the duplicates. The procedure is as follows:

1. Each model output consists of a predicted bounding box (bbox pred), a predicted class (cls pred), and a classification score (cls score). Group all outputs by cls pred.
2. Within each class, sort the outputs by cls score and take the one with the highest score. Compute its IoU with every remaining output and discard all outputs whose IoU exceeds the threshold; they take no part in the following steps.
3. From the outputs that survive the filtering, take the one with the next-highest cls score and repeat step 2. Continue until every output of this class has been either kept or discarded.
4. Repeat steps 2 and 3 for every class to obtain the final detection result.

Code implementation of the filtering for a single class:

```python
import numpy as np

def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline."""
    x1 = dets[:, 0]      # pred bbox top-left x
    y1 = dets[:, 1]      # pred bbox top-left y
    x2 = dets[:, 2]      # pred bbox bottom-right x
    y2 = dets[:, 3]      # pred bbox bottom-right y
    scores = dets[:, 4]  # pred bbox cls score

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)  # pred bbox areas
    order = scores.argsort()[::-1]         # indices sorted by descending score (step 2)

    keep = []  # indices of the pred bboxes retained after NMS
    while order.size > 0:
        i = order[0]    # current top-1 score bbox
        keep.append(i)  # the top-1 bbox is always retained
        # intersection of the top-1 bbox with every remaining bbox in order
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)  # IoU with the top-1 bbox

        # Boxes with IoU > thresh duplicate the top-1 bbox and are suppressed;
        # inds holds the positions (within order[1:]) of the boxes with ovr <= thresh.
        inds = np.where(ovr <= thresh)[0]
        # ovr was computed on order[1:], so shift by 1 to index into order;
        # the survivors go through the next round (step 3).
        order = order[inds + 1]

    return keep  # final NMS result
```
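
The function above covers steps 2 and 3 for one class. Steps 1 and 4 (grouping by class and looping over the classes) could be sketched as follows. This is an illustrative wrapper, not the original author's code: `per_class_nms` and the separate `labels` array are assumptions, and a compact `nms` is repeated so that the example runs on its own.

```python
import numpy as np

def nms(dets, thresh):
    """Greedy NMS on an (N, 5) array of [x1, y1, x2, y2, score]; returns kept row indices."""
    x1, y1, x2, y2, scores = (dets[:, k] for k in range(5))
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(int(i))
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou <= thresh]  # drop boxes that overlap the top-1 box too much
    return keep

def per_class_nms(dets, labels, thresh):
    """Run NMS separately for each class (hypothetical wrapper); returns indices into dets."""
    keep = []
    for cls in np.unique(labels):              # step 1: group outputs by predicted class
        idx = np.flatnonzero(labels == cls)
        keep.extend(int(idx[k]) for k in nms(dets[idx], thresh))  # steps 2-3 within the class
    return sorted(keep)                        # step 4: union over all classes

dets = np.array([
    [0, 0, 9, 9, 0.9],      # class 0
    [1, 1, 10, 10, 0.8],    # class 0, heavily overlaps the first box
    [1, 1, 10, 10, 0.7],    # class 1, same location but a different class
    [50, 50, 59, 59, 0.6],  # class 0, far away
])
labels = np.array([0, 0, 1, 0])
kept = per_class_nms(dets, labels, thresh=0.5)  # [0, 2, 3]
```

Box 1 is suppressed by box 0 (IoU 81 / 119 ≈ 0.68), while box 2 survives at the same location because it belongs to another class.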

Variants of NMS:

NMS for object detection - improving precision - Zhihu

NMS can also play tricks... - Zhihu
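
The linked articles are not reproduced here. As one well-known example of a variant, Soft-NMS lowers the scores of overlapping boxes instead of removing them outright. The following is a minimal sketch of the Gaussian form, where each remaining score is multiplied by exp(-IoU² / sigma). The name `soft_nms` and the default values of `sigma` and `score_thresh` are illustrative assumptions.

```python
import numpy as np

def soft_nms(dets, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS sketch on an (N, 5) array of [x1, y1, x2, y2, score].

    Returns the kept indices in selection order and the decayed scores.
    """
    boxes = dets[:, :4].astype(float)
    scores = dets[:, 4].astype(float).copy()
    areas = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    remaining = np.arange(len(dets))
    keep = []
    while remaining.size > 0:
        best = remaining[np.argmax(scores[remaining])]  # highest current score
        keep.append(int(best))
        remaining = remaining[remaining != best]
        if remaining.size == 0:
            break
        # IoU between the selected box and every box still in play
        w = np.maximum(0.0, np.minimum(boxes[best, 2], boxes[remaining, 2])
                       - np.maximum(boxes[best, 0], boxes[remaining, 0]) + 1)
        h = np.maximum(0.0, np.minimum(boxes[best, 3], boxes[remaining, 3])
                       - np.maximum(boxes[best, 1], boxes[remaining, 1]) + 1)
        inter = w * h
        iou = inter / (areas[best] + areas[remaining] - inter)
        # Decay scores instead of hard suppression; larger overlap means stronger decay
        scores[remaining] *= np.exp(-(iou ** 2) / sigma)
        # Drop boxes whose score has decayed below the cut-off
        remaining = remaining[scores[remaining] >= score_thresh]
    return keep, scores

dets = np.array([
    [0, 0, 9, 9, 0.9],
    [1, 1, 10, 10, 0.8],    # overlaps the first box, so its score is decayed
    [50, 50, 59, 59, 0.6],  # no overlap, score unchanged
])
keep, new_scores = soft_nms(dets)
```

In this example the overlapping box is kept but re-ranked below the distant box, because its score drops from 0.8 to roughly 0.32 after one round of decay.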

Added by webspinner on Mon, 03 Jan 2022 07:33:21 +0200