This is one label txt file for YOLOv7:
1 0.500 0.201 1.000 0.091
2 0.500 0.402 1.000 0.150
3 0.500 0.604 1.000 0.093
0 0.500 0.804 1.000 0.217
The corresponding sample:
The width-to-height ratios (of the normalized values) are: 1/0.091=10.99, 1/0.150=6.67, 1/0.093=10.75, 1/0.217=4.61
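As a quick check, the ratios can be reproduced directly from the label lines. This is a minimal sketch: the helper name `aspect_ratios` is my own, and it assumes the usual YOLO label layout `class x_center y_center width height` with normalized values.

```python
def aspect_ratios(label_text):
    """Return (class_id, width / height) for each YOLO label line."""
    ratios = []
    for line in label_text.strip().splitlines():
        # Assumed layout: class x_center y_center width height (normalized)
        cls, _xc, _yc, w, h = line.split()
        ratios.append((int(cls), float(w) / float(h)))
    return ratios


labels = """
1 0.500 0.201 1.000 0.091
2 0.500 0.402 1.000 0.150
3 0.500 0.604 1.000 0.093
0 0.500 0.804 1.000 0.217
"""

for cls, ratio in aspect_ratios(labels):
    print(f"class {cls}: w/h = {ratio:.2f}")
```

Note that these are ratios of normalized values; for a non-square image the ratio in pixels also depends on the image's own width and height.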
The script for computing the anchors:
import utils.autoanchor as autoAC
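# Recompute the anchors for the dataset (raw string so the backslashes in the Windows path are not treated as escapes)
new_anchors = autoAC.kmean_anchors(r'D:\实验室\论文\论文-多信号参数估计\实验\YOLOv7\yolov7-main\zzc-multisignals-dataset-yolov7.yaml', 4, 416, 11, 1000, True)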
print(new_anchors)
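Here, 4 is the number of anchors to cluster (n), 416 is the image size (img_size), 11 is the threshold (thr) on the width/height ratio between a label box and an anchor, 1000 is the number of generations (gen) for which the genetic algorithm evolves the k-means result (k-means itself runs with iter=30), and True turns on verbose output.
Running the script at first raised an error: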
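The threshold argument is easiest to understand from the fitness metric it feeds into. The sketch below is my own NumPy paraphrase of that metric as I read it in utils/autoanchor.py (the original uses torch); the function names and the toy anchors are assumptions for illustration only.

```python
import numpy as np


def best_ratio_match(wh, anchors):
    """For each label, the best anchor's worst-side ratio, in (0, 1]."""
    r = wh[:, None, :] / anchors[None, :, :]          # (labels, anchors, 2)
    worst_side = np.minimum(r, 1.0 / r).min(axis=2)   # (labels, anchors)
    return worst_side.max(axis=1)                     # best anchor per label


def best_possible_recall(wh, anchors, thr):
    """Fraction of labels whose best anchor is within a factor thr on both sides."""
    return float((best_ratio_match(wh, anchors) > 1.0 / thr).mean())


# A few label sizes in pixels, taken from the wh array printed in this post
wh = np.array([[322.4, 23.079], [322.4, 38.049], [322.4, 23.703]])
anchors_wide = np.array([[322.0, 30.0]])    # hypothetical wide anchor
anchors_square = np.array([[30.0, 30.0]])   # hypothetical square anchor

print(best_possible_recall(wh, anchors_wide, thr=4.0))
print(best_possible_recall(wh, anchors_square, thr=4.0))
print(best_possible_recall(wh, anchors_square, thr=11.0))
```

With the wide anchor every label matches at thr=4.0; with the square anchor none do until the threshold is raised to about 11, because the widths differ by a factor of roughly 10.7.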
C:\Users\14115\.conda\envs\yolov7\python.exe "D:\实验室\论文\论文-多信号参数估计\实验\YOLOv7\yolov7-main\calculate anchors.py"
Scanning 'D:\english\yolov7\datasets_higher_cut\train.cache' images and labels... 400 found, 0 missing, 0 empty, 0 corrupted: 100%|██████████| 400/400 [00:00<?, ?it/s]
D:\实验室\论文\论文-多信号参数估计\实验\YOLOv7\yolov7-main\utils\autoanchor.py:125: RuntimeWarning: divide by zero encountered in divide
k, dist = kmeans(wh / s, n, iter=30) # points, mean distance
Traceback (most recent call last):
File "D:\实验室\论文\论文-多信号参数估计\实验\YOLOv7\yolov7-main\calculate anchors.py", line 4, in <module>
new_anchors = autoAC.kmean_anchors('D:\实验室\论文\论文-多信号参数估计\实验\YOLOv7\yolov7-main\zzc-multisignals-dataset-yolov7.yaml', 4, 416, 11, 1000, True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\实验室\论文\论文-多信号参数估计\实验\YOLOv7\yolov7-main\utils\autoanchor.py", line 125, in kmean_anchors
k, dist = kmeans(wh / s, n, iter=30) # points, mean distance
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\_lib\_util.py", line 440, in wrapper
return fun(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\cluster\vq.py", line 467, in kmeans
obs = _asarray(obs, xp=xp, check_finite=check_finite)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\_lib\_array_api.py", line 193, in _asarray
_check_finite(array, xp)
File "C:\Users\14115\.conda\envs\yolov7\Lib\site-packages\scipy\_lib\_array_api.py", line 109, in _check_finite
raise ValueError(msg)
ValueError: array must not contain infs or NaNs
autoanchor: Running kmeans for 4 anchors on 1600 points...
Process finished with exit code 1
The problem turned out to be the standard-deviation normalization (whitening) in kmean_anchors in yolov7-main/utils/autoanchor.py:
s = wh.std(0) # sigmas for whitening
k, dist = kmeans(wh / s, n, iter=30)
wh
array([[ 322.4, 23.079],
[ 322.4, 38.049],
[ 322.4, 23.703],
...,
[ 322.4, 26.198],
[ 322.4, 34.931],
[ 322.4, 25.574]])
wh.shape
(1600, 2)
s
array([ 0, 8.5888])
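As you can see, the standard deviation of one dimension is 0 (every box has the same width), so the usual normalization divides by zero and fails. The fix is to detect the zero entries and assign them a small value: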
s[s == 0] = 1e-8
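The effect of this guard can be reproduced outside of YOLOv7. The following is a minimal sketch with toy data (width fixed at 322.0, heights made up); `whiten_safe` is a hypothetical helper name, not a function from the repository.

```python
import numpy as np


def whiten_safe(wh, eps=1e-8):
    """Divide each column by its standard deviation, guarding zero sigmas."""
    s = wh.std(0)      # per-dimension sigma
    s[s == 0] = eps    # constant dimension: avoid division by zero
    return wh / s, s


# Toy data: every box has the same width, so the width sigma is exactly 0
wh = np.array([[322.0, 23.0], [322.0, 38.0], [322.0, 24.0], [322.0, 35.0]])

with np.errstate(divide="ignore", invalid="ignore"):
    naive = wh / wh.std(0)          # produces inf in the width column
print(np.isfinite(naive).all())     # this is what scipy's kmeans rejects

whitened, s = whiten_safe(wh)
print(np.isfinite(whitened).all())
print(np.allclose(whitened * s, wh))  # multiplying by s restores pixel units
```

As far as I can tell, kmean_anchors multiplies the cluster centres by s again after k-means, so the tiny placeholder cancels out for the constant dimension.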
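Result of the run with this fix in autoanchor.py:
This suggests that the following anchors suit my multi-signal time-frequency image data: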
[[ 322.6 26.134]
[ 323.99 32.985]
[ 322 40.793]
[ 322.72 47.953]]
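Alternatively, if the samples in the dataset all have similar aspect ratios, you can design the anchors yourself from an estimate of the samples' aspect ratio, scaling the default anchors proportionally.
Default anchors: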
# anchors
anchors:
- [12,16, 19,36, 40,28] # P3/8
- [36,75, 76,55, 72,146] # P4/16
- [142,110, 192,243, 459,401] # P5/32
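One possible way to do such a proportional adjustment is to keep each default anchor's area and only change its width-to-height ratio. This is a sketch under my own assumptions: the helper `reshape_anchor`, the area-preserving rule, and the target ratios are all hypothetical choices for illustration.

```python
import math


def reshape_anchor(w, h, target_ratio):
    """Return (w', h') with w'/h' == target_ratio and the same area as (w, h)."""
    area = w * h
    new_h = math.sqrt(area / target_ratio)
    return target_ratio * new_h, new_h


default_p3 = [(12, 16), (19, 36), (40, 28)]   # P3/8 row of the default anchors
target_ratios = [4.0, 7.0, 11.0]              # assumed spread of sample ratios

for (w, h), ratio in zip(default_p3, target_ratios):
    new_w, new_h = reshape_anchor(w, h, ratio)
    print(f"{w}x{h} -> {new_w:.0f}x{new_h:.0f} (w/h = {ratio:g})")
```

Preserving the area keeps each anchor on roughly the same detection scale; whether that suits boxes that span the full image width is something to verify on the data.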
My samples have aspect ratios of roughly 4:1 to 11:1, so I adjusted the anchor values by my own estimate:
# anchors
anchors:
- [20,10, 20,8, 20,4] # P3/8 640->80 416->52
- [80,40, 80,16, 80,8] # P4/16 640->40 416->26
- [300,100, 300,60, 300,30] # P5/32 640->20 416->13
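These hand-set anchors caused problems...
Next, restrict non-maximum suppression (NMS) to the vertical direction only. First, locate the NMS function:
However, a function found this way is not necessarily the one that actually runs; finding the NMS function with a breakpoint is more reliable:
Here is the NMS function that was found: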
def non_max_suppression(prediction, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False, multi_label=False,
                        labels=()):
    """Runs Non-Maximum Suppression (NMS) on inference results
    Returns:
         list of detections, on (n,6) tensor per image [xyxy, conf, cls]
    """
    nc = prediction.shape[2] - 5  # number of classes
    xc = prediction[..., 4] > conf_thres  # candidates
    # Settings
    min_wh, max_wh = 2, 4096  # (pixels) minimum and maximum box width and height
    max_det = 300  # maximum number of detections per image
    max_nms = 30000  # maximum number of boxes into torchvision.ops.nms()
    time_limit = 10.0  # seconds to quit after
    redundant = True  # require redundant detections
    multi_label &= nc > 1  # multiple labels per box (adds 0.5ms/img)
    merge = False  # use merge-NMS
    t = time.time()
    output = [torch.zeros((0, 6), device=prediction.device)] * prediction.shape[0]
    for xi, x in enumerate(prediction):  # image index, image inference
        # Apply constraints
        # x[((x[..., 2:4] < min_wh) | (x[..., 2:4] > max_wh)).any(1), 4] = 0  # width-height
        x = x[xc[xi]]  # confidence
        # Cat apriori labels if autolabelling
        if labels and len(labels[xi]):
            l = labels[xi]
            v = torch.zeros((len(l), nc + 5), device=x.device)
            v[:, :4] = l[:, 1:5]  # box
            v[:, 4] = 1.0  # conf
            v[range(len(l)), l[:, 0].long() + 5] = 1.0  # cls
            x = torch.cat((x, v), 0)
        # If none remain process next image
        if not x.shape[0]:
            continue
        # Compute conf
        if nc == 1:
            x[:, 5:] = x[:, 4:5]  # for models with one class, cls_loss is 0 and cls_conf is always 0.5,
                                  # so there is no need to multiply.
        else:
            x[:, 5:] *= x[:, 4:5]  # conf = obj_conf * cls_conf
        # Box (center x, center y, width, height) to (x1, y1, x2, y2)
        # At this point the LFM and SFM probabilities are already far higher than those of BPSK and Frank
        box = xywh2xyxy(x[:, :4])
        # Detections matrix nx6 (xyxy, conf, cls)
        if multi_label:
            i, j = (x[:, 5:] > conf_thres).nonzero(as_tuple=False).T
            x = torch.cat((box[i], x[i, j + 5, None], j[:, None].float()), 1)
        else:  # best class only
            conf, j = x[:, 5:].max(1, keepdim=True)
            x = torch.cat((box, conf, j.float()), 1)[conf.view(-1) > conf_thres]
        # Filter by class
        if classes is not None:
            x = x[(x[:, 5:6] == torch.tensor(classes, device=x.device)).any(1)]
        # Apply finite constraint
        # if not torch.isfinite(x).all():
        #     x = x[torch.isfinite(x).all(1)]
        # Check shape
        # At this point only the LFM and SFM classes remain
        n = x.shape[0]  # number of boxes
        if not n:  # no boxes
            continue
        elif n > max_nms:  # excess boxes
            x = x[x[:, 4].argsort(descending=True)[:max_nms]]  # sort by confidence
        # Batched NMS
        c = x[:, 5:6] * (0 if agnostic else max_wh)  # classes
        boxes, scores = x[:, :4] + c, x[:, 4]  # boxes (offset by class), scores
        i = torchvision.ops.nms(boxes, scores, iou_thres)  # NMS
        if i.shape[0] > max_det:  # limit detections
            i = i[:max_det]
        if merge and (1 < n < 3E3):  # Merge NMS (boxes merged using weighted mean)
            # update boxes as boxes(i,4) = weights(i,n) * boxes(n,4)
            iou = box_iou(boxes[i], boxes) > iou_thres  # iou matrix
            weights = iou * scores[None]  # box weights
            x[i, :4] = torch.mm(weights, x[:, :4]).float() / weights.sum(1, keepdim=True)  # merged boxes
            if redundant:
                i = i[iou.sum(1) > 1]  # require redundancy
        output[xi] = x[i]
        if (time.time() - t) > time_limit:
            print(f'WARNING: NMS time limit {time_limit}s exceeded')
            break  # time limit exceeded
    return output
One line in it is the key:
i = torchvision.ops.nms(boxes, scores, iou_thres) # NMS
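If we want NMS to act only in the vertical direction, it is enough to set x1 and x2 in boxes to the left and right edges of the image; the IoU computed this way ignores the horizontal direction.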
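The claim that full-width boxes reduce IoU to a purely vertical overlap can be checked with a small pure-Python sketch. The helper names are my own, and the image width of 450 is simply the value used for x2 in this post.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def stretch_to_full_width(box, img_w):
    """Replace x1/x2 with the image edges, keeping y1/y2."""
    return (0.0, box[1], float(img_w), box[3])


# Two boxes that overlap vertically but not horizontally
a = (10.0, 100.0, 200.0, 140.0)
b = (250.0, 110.0, 440.0, 150.0)

print(iou_xyxy(a, b))  # no horizontal overlap, so IoU is 0
sa, sb = stretch_to_full_width(a, 450), stretch_to_full_width(b, 450)
print(iou_xyxy(sa, sb))  # vertical overlap 30 over vertical union 50
```

After stretching, the common width cancels out of the ratio, so the IoU equals the one-dimensional overlap of the two y-intervals.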
Note that in the following snippet the statements that restrict NMS are added in the wrong place:
# Batched NMS
c = x[:, 5:6] * (0 if agnostic else max_wh) # classes
boxes, scores = x[:, :4] + c, x[:, 4] # boxes (offset by class), scores
boxes[:,0]=0
boxes[:, 2] = 450
i = torchvision.ops.nms(boxes, scores, iou_thres) # NMS; the values in boxes are not consistent with those in x
They must be added before the +c.
The +c offset is what lets NMS treat different classes separately.
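To illustrate what the class offset does, here is a small pure-Python sketch. The helper names are hypothetical; `MAX_WH` mirrors the `max_wh` constant in `non_max_suppression`, and `clamp_then_offset` follows the order recommended here (restrict x first, then add the offset).

```python
MAX_WH = 4096  # same value as max_wh in non_max_suppression


def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def offset_by_class(box, cls, max_wh=MAX_WH):
    """Shift all four coordinates by cls * max_wh, as the +c step does."""
    c = cls * max_wh
    return tuple(v + c for v in box)


def clamp_then_offset(box, cls, img_w):
    """Stretch the box to full width first, then apply the class offset."""
    return offset_by_class((0.0, box[1], float(img_w), box[3]), cls)


box_a = (10.0, 100.0, 200.0, 140.0)
box_b = (20.0, 105.0, 210.0, 145.0)

raw = iou_xyxy(box_a, box_b)                                              # heavy overlap
diff_cls = iou_xyxy(offset_by_class(box_a, 0), offset_by_class(box_b, 1))
same_cls = iou_xyxy(offset_by_class(box_a, 1), offset_by_class(box_b, 1))
print(raw, diff_cls, same_cls)

vertical_same = iou_xyxy(clamp_then_offset(box_a, 1, 450), clamp_then_offset(box_b, 1, 450))
vertical_diff = iou_xyxy(clamp_then_offset(box_a, 0, 450), clamp_then_offset(box_b, 1, 450))
print(vertical_same, vertical_diff)
```

Boxes of different classes end up far apart and never suppress each other, while boxes of the same class keep their original IoU.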
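Normal boxes:
Boxes when the NMS restriction is applied after the +c:
The final result is now excellent:
Another blog post of mine records the earlier experimental observations: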