[Deep Learning] Computer Vision (CV): Object Detection with SSD (Single Shot MultiBox Detector)

Published: 2025-02-15

🔹 SSD (Single Shot MultiBox Detector)

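1️⃣ What is SSD?

SSD (Single Shot MultiBox Detector) is a deep learning model for object detection, proposed by Wei Liu et al. in 2016.
It is a single-stage detector: one forward pass over the image predicts the classes and bounding boxes of multiple objects, which makes it faster than two-stage methods such as Faster R-CNN.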


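2️⃣ Core features of SSD

Single-stage detection: Faster R-CNN needs two steps (region proposal, then classification), whereas SSD detects objects in a single step.
Multi-scale feature detection: predictions are made on feature maps at several levels of the network, to cover both large and small objects.
Default boxes: a fixed set of prior boxes, similar to the anchor boxes used by Faster R-CNN and later YOLO versions, gives every prediction a reference shape and improves detection accuracy.
Lightweight computation: faster than Faster R-CNN and suitable for real-time detection.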


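3️⃣ SSD network architecture

SSD uses VGG16 or MobileNet as its backbone, then detects objects on feature maps of different scales.

📌 The SSD architecture has three parts:
1️⃣ Backbone: usually VGG16 or MobileNet, used to extract features.
2️⃣ Extra feature layers: additional convolutional layers that progressively shrink the feature map, so that detection runs at several scales.
3️⃣ Prediction layers: for every default box, classify the object and regress the box offsets.

📌 Typical SSD pipeline

Input image (300x300) ➝ VGG16 feature extraction ➝ extra convolutional layers ➝ multi-scale detection ➝ output classes and bounding boxes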
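To make the pipeline above more concrete, the following sketch tallies how many default boxes the six prediction layers produce. The layer names, grid sizes, and boxes-per-cell values are the SSD300 configuration from the original paper and are an assumption here; other backbones or input sizes use different numbers.

```python
# Grid size and default boxes per cell for each SSD300 prediction layer.
# Values are assumed from the original SSD paper's VGG16 configuration.
SSD300_LAYERS = [
    ("conv4_3", 38, 4),
    ("conv7", 19, 6),
    ("conv8_2", 10, 6),
    ("conv9_2", 5, 6),
    ("conv10_2", 3, 4),
    ("conv11_2", 1, 4),
]


def count_default_boxes(layers):
    """Return the total number of default boxes over all prediction layers."""
    return sum(size * size * per_cell for _, size, per_cell in layers)


for name, size, per_cell in SSD300_LAYERS:
    print(f"{name:9s} {size:2d}x{size:<2d} -> {size * size * per_cell} boxes")
print("total:", count_default_boxes(SSD300_LAYERS))
```

Under these assumptions the total is 8732 boxes per image, and the 38x38 conv4_3 map alone accounts for 5776 of them.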


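4️⃣ Core algorithms of SSD

📌 1️⃣ Multi-scale feature maps

  • SSD makes predictions at several scales, for example:
    • conv4_3 (highest resolution, small objects)
    • conv7 (medium objects)
    • conv8_2 to conv11_2 (progressively lower resolution, large objects)
  • This lets a single pass detect objects of different sizes and improves detection accuracy.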
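As a rough illustration of how object size is tied to the layer, the sketch below spaces one default-box scale per prediction layer linearly between a minimum and a maximum. The formula s_k = s_min + (s_max - s_min) / (m - 1) * (k - 1) and the values 0.2 and 0.9 come from the original paper; real implementations may tune individual layers, so treat the numbers as assumptions.

```python
LAYER_NAMES = ["conv4_3", "conv7", "conv8_2", "conv9_2", "conv10_2", "conv11_2"]


def layer_scales(num_layers=6, s_min=0.2, s_max=0.9):
    """Linearly spaced default-box scales, one per prediction layer.

    Each scale is a fraction of the input image side; the first (highest
    resolution) layer gets the smallest scale.
    """
    step = (s_max - s_min) / (num_layers - 1)
    return [round(s_min + step * k, 2) for k in range(num_layers)]


scales = layer_scales()
for name, scale in zip(LAYER_NAMES, scales):
    print(f"{name:9s} scale={scale:.2f} (~{scale * 300:.0f} px on a 300x300 input)")
```

The scale grows from the high-resolution conv4_3 map to the 1x1 conv11_2 map, so earlier layers handle small objects and later layers handle large ones.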

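📌 2️⃣ Default boxes

  • SSD places default boxes of several sizes and aspect ratios at every feature-map location.
  • For example, one location can carry boxes with ratios such as 1:1, 1:2, and 2:1, each at a given size.
  • At inference time, non-maximum suppression (NMS) removes overlapping predictions and keeps the best-scoring box for each object.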
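The two ideas in this list can be sketched in plain Python: how an aspect ratio turns a scale into a box shape, and how greedy NMS prunes overlapping boxes. This is a minimal illustration, not the code any particular library runs; the function names and the 0.45 IoU threshold are assumptions chosen for the example.

```python
import math


def default_box_shapes(scale, aspect_ratios=(1.0, 2.0, 0.5)):
    """Return (width, height) pairs, as fractions of the image, for one cell.

    Every box keeps the area scale**2; the aspect ratio is width / height.
    """
    return [(scale * math.sqrt(ar), scale / math.sqrt(ar)) for ar in aspect_ratios]


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


shapes = default_box_shapes(0.2)
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(shapes)
print(kept)
```

In this example the second box overlaps the first almost completely and is suppressed, while the distant third box survives. Detectors usually run NMS separately for each class, which the sketch leaves out.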

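📌 3️⃣ Loss function

SSD uses a multi-task loss:

L = \frac{1}{N} \left( L_{\text{conf}} + \alpha L_{\text{loc}} \right)

  • N is the number of default boxes matched to a ground-truth box, and α weights the localization term.
  • Localization loss (L_loc): Smooth L1 loss between the predicted boxes and the ground-truth boxes.
  • Confidence loss (L_conf): softmax cross-entropy over the classes.
  • Hard negative mining: only the highest-loss negative boxes are kept (at most about 3 negatives per positive in the original paper), which balances positives and negatives and stops the negatives from dominating training.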
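The loss terms above can be written out in a few lines of plain Python. This is a sketch under stated assumptions, not a training-ready implementation: it follows the weighting of the original paper (α scales the localization term and the sum is divided by the number of matched boxes), label 0 stands for background, and the function names are hypothetical.

```python
import math


def smooth_l1(pred, target):
    """Smooth L1 loss summed over the offsets of one box."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total


def cross_entropy(logits, label):
    """Softmax cross-entropy for one default box (numerically stable)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def ssd_loss(loc_preds, loc_targets, cls_logits, labels, alpha=1.0, neg_pos_ratio=3):
    """Multi-task loss with hard negative mining, divided by the positive count."""
    conf = [cross_entropy(z, y) for z, y in zip(cls_logits, labels)]
    pos = [i for i, y in enumerate(labels) if y > 0]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos:
        return 0.0  # no matched boxes: the loss is defined as zero
    # Hard negative mining: keep the negatives with the largest confidence loss.
    hard_neg = sorted(neg, key=lambda i: conf[i], reverse=True)[: neg_pos_ratio * len(pos)]
    l_loc = sum(smooth_l1(loc_preds[i], loc_targets[i]) for i in pos)
    l_conf = sum(conf[i] for i in pos + hard_neg)
    return (l_conf + alpha * l_loc) / len(pos)


# One positive box and five background boxes; only three negatives are counted.
labels = [1, 0, 0, 0, 0, 0]
cls_logits = [[0.0, 0.0] for _ in labels]
loc = [[0.0, 0.0, 0.0, 0.0] for _ in labels]
loss = ssd_loss(loc, loc, cls_logits, labels)
print(loss)
```

With uniform logits every box has a confidence loss of ln 2, so one positive plus three mined negatives give 4 ln 2; the two remaining negatives are ignored.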

5️⃣ SSD code examples

Running SSD inference with PyTorch

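import torch
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights

# Load SSD with pretrained weights (VGG16 backbone).
# The `weights=` argument replaces the deprecated `pretrained=True`.
model = ssd300_vgg16(weights=SSD300_VGG16_Weights.DEFAULT)
model.eval()  # switch to evaluation mode

# Build a test input: a random 300x300 RGB image with values in [0, 1]
image = torch.rand(1, 3, 300, 300)
# Run detection. Wrapping this call in torch.no_grad() would skip autograd
# tracking and drop the grad_fn entries from the printed tensors.
output = model(image)

# Print the detections: a list with one dict per image (boxes, scores, labels)
print(output)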

📌 Sample output

[{'boxes': tensor([[  4.3774,   0.0000, 296.1398, 296.1545],
        [  4.3993,   0.0000, 296.4670, 296.7289],
        [  7.9937,   2.4237, 294.5887, 296.1728],
        [ 69.2036,   1.6595, 224.8485,  89.6344],
        [ 26.9926,   6.7602, 121.4106, 144.2272],
        [ 92.4211,   0.0000, 229.8040, 208.3294],
        [  1.3626,  23.1578,  93.5442, 289.2806],
        [  4.3993,   0.0000, 296.4670, 296.7289],
        [ 76.6926,   5.4309, 149.0640, 156.7170],
        [ 10.3550,   6.2502, 197.3316, 181.1919],
        [106.9824,   4.5797, 182.2237, 157.4160],
        [132.2069,   8.6678, 219.1386, 144.4542],
        [ 79.0658,  30.8220, 213.8745, 120.1073],
        [142.0560,  44.9816, 300.0000, 261.8794],
        [ 43.9961,  60.6780, 113.4670, 221.8410],
        [  4.8406,   3.4960, 173.2399,  85.3488],
        [168.6355,   3.2111, 246.1957, 157.3428],
        [115.9878,  18.3186, 190.0469,  94.4698],
        [  7.9937,   2.4237, 294.5887, 296.1728],
        [  1.8401,   2.7173,  80.9552,  81.3943],
        [ 84.2810,  18.5298, 157.4316,  93.6491],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [163.0305,  19.0193, 237.7007,  94.0890],
        [140.2142,   0.0000, 290.2731,  92.2141],
        [  0.7699,  84.9703,  99.9977, 201.1923],
        [ 20.7645,   7.6457,  58.4122,  72.7975],
        [ 49.2734,  18.8153, 125.7265,  94.6529],
        [ 37.5175,   8.6355,  74.5134,  72.0357],
        [ 49.3139,  70.5232, 175.3022, 209.0820],
        [206.2103,  51.5114, 283.0901, 233.4067],
        [ 54.2698,   9.4817,  89.9449,  71.6526],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [ 68.4321,  34.4477, 140.9630, 108.9126],
        [117.4278,  83.9086, 187.3821, 157.0109],
        [ 83.9005,  68.8995, 211.3797, 152.5351],
        [  4.3724,   7.1513,  41.6993,  74.3836],
        [ 11.2119,  66.5362, 142.1151, 153.6605],
        [176.0526,   4.2302, 259.2906,  77.8738],
        [ 16.5305,  32.5297,  93.1180, 111.5275],
        [172.7833,  59.0675, 240.8657, 224.4151],
        [ 70.6702,  26.6661, 105.3486,  87.4191],
        [ 86.4020,  84.3180, 155.8031, 156.8705],
        [  3.9825,  39.6654, 164.5469, 116.8739],
        [102.0017,  99.2526, 171.5490, 173.6638],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [218.2317,  99.7507, 245.6188, 175.6490],
        [ 12.5375, 109.5155, 139.7209, 252.5660],
        [148.9126,  85.0168, 219.4572, 156.0667],
        [143.6943,   0.0000, 208.8078,  92.9759],
        [218.0947, 131.8932, 245.6517, 206.9023],
        [ 86.6296,  27.1547, 121.8965,  86.9736],
        [182.1451,  26.4790, 218.2385,  87.4392],
        [ 85.6868,  51.3363, 156.9124, 124.8844],
        [201.7919,  20.3948, 230.0636,  95.3969],
        [ 58.6690,  85.6201,  86.4400, 158.0279],
        [237.8976,  79.2133, 299.4106, 267.3134],
        [ 74.1663,  86.0775, 101.9228, 158.3806],
        [ 38.3397,  42.0953,  73.8427, 104.3351],
        [118.3740,  26.3185, 154.1633,  86.7665],
        [165.9606,  26.4949, 202.2519,  87.1502],
        [102.2428,  26.6385, 138.0445,  86.6755],
        [116.5572,  50.8512, 188.6578, 125.3926],
        [133.5854,  99.5528, 203.4103, 173.3626],
        [ 41.6178,  85.4544,  70.0365, 157.4289],
        [ 89.9596,  85.9436, 117.8041, 158.6940],
        [ 34.2198,  84.6780, 108.1625, 157.1806],
        [ 58.2119,  36.3685,  86.1674, 111.0400],
        [134.3326,  26.1657, 170.1573,  87.0323],
        [153.1971,  20.0751, 182.6458,  94.6176],
        [105.8610,  85.7882, 134.1690, 158.7567],
        [130.4643,  39.4638, 294.5816, 115.5758],
        [233.8322,  99.0858, 261.7257, 177.0587],
        [  0.0000,  46.8298,  41.8673, 248.9998],
        [218.0107,  20.5122, 245.8064,  95.4748],
        [233.4248, 131.2962, 261.9585, 207.5430],
        [ 22.0550,  41.2268,  57.4551, 104.8794],
        [121.7302,  85.4249, 150.3270, 158.4747],
        [  6.2010, 154.6621,  82.9141, 299.6869],
        [202.2049,  84.8620, 229.4331, 158.9373],
        [147.7037,  51.0245, 220.5701, 124.3972],
        [ 89.8940,  52.5901, 117.6983, 126.6781],
        [ 28.1485,   2.2970,  83.0982,  48.7126],
        [ 52.0133,  99.9635, 123.1805, 173.4247],
        [ 34.3082,  50.2485, 108.2714, 125.6584],
        [212.5328,  97.9328, 284.7024, 174.2638],
        [202.5374,   1.6573, 289.3676, 161.6270],
        [ 74.0387,  52.5599, 101.7447, 126.6867],
        [ 49.3417,  93.9316, 277.1458, 227.4363],
        [117.9579, 115.3029, 186.7402, 189.8200],
        [ 42.2042, 127.9100, 114.5919, 287.4723],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [  4.3993,   0.0000, 296.4670, 296.7289],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [ 92.4211,   0.0000, 229.8040, 208.3294],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [  9.4001,   4.1654, 292.2263, 292.6517],
        [  4.3774,   0.0000, 296.1398, 296.1545],
        [ 71.9093, 171.6278, 221.8463, 295.5115],
        [  7.9937,   2.4237, 294.5887, 296.1728],
        [  4.3774,   0.0000, 296.1398, 296.1545]], grad_fn=<StackBackward0>), 'scores': tensor([0.0638, 0.0606, 0.0548, 0.0468, 0.0463, 0.0453, 0.0450, 0.0424, 0.0402,
        0.0398, 0.0373, 0.0369, 0.0350, 0.0349, 0.0331, 0.0331, 0.0331, 0.0324,
        0.0323, 0.0315, 0.0314, 0.0308, 0.0295, 0.0286, 0.0282, 0.0276, 0.0271,
        0.0269, 0.0257, 0.0249, 0.0247, 0.0247, 0.0247, 0.0245, 0.0241, 0.0238,
        0.0237, 0.0235, 0.0234, 0.0232, 0.0227, 0.0226, 0.0225, 0.0223, 0.0222,
        0.0222, 0.0221, 0.0219, 0.0219, 0.0218, 0.0218, 0.0218, 0.0218, 0.0217,
        0.0216, 0.0214, 0.0213, 0.0213, 0.0213, 0.0212, 0.0211, 0.0210, 0.0209,
        0.0209, 0.0209, 0.0209, 0.0208, 0.0208, 0.0206, 0.0206, 0.0206, 0.0205,
        0.0203, 0.0202, 0.0202, 0.0202, 0.0202, 0.0202, 0.0200, 0.0199, 0.0195,
        0.0194, 0.0194, 0.0193, 0.0193, 0.0193, 0.0193, 0.0192, 0.0192, 0.0192,
        0.0192, 0.0173, 0.0163, 0.0130, 0.0119, 0.0119, 0.0112, 0.0111, 0.0111,
        0.0110], grad_fn=<IndexBackward0>), 'labels': tensor([61,  1, 28,  1,  1,  1,  1, 65,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
        38,  1,  1,  5,  1,  1,  1,  1,  1,  1,  1,  1,  1, 52,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1, 16,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
         1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  1,
        84,  9, 32, 52, 67, 41, 36,  5, 35, 19])}]
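Because the input is random noise, every score in the output above is low (the highest is 0.0638), so none of these boxes would count as a real detection. A minimal sketch of the usual confidence filter, using the first three detections copied from that output, is shown below; `filter_detections` is a hypothetical helper, not a torchvision function.

```python
def filter_detections(boxes, scores, labels, threshold=0.5):
    """Keep (box, score, label) triples whose score exceeds the threshold."""
    return [
        (box, score, label)
        for box, score, label in zip(boxes, scores, labels)
        if score > threshold
    ]


# First three detections from the sample output, as plain Python lists.
boxes = [
    [4.3774, 0.0000, 296.1398, 296.1545],
    [4.3993, 0.0000, 296.4670, 296.7289],
    [7.9937, 2.4237, 294.5887, 296.1728],
]
scores = [0.0638, 0.0606, 0.0548]
labels = [61, 1, 28]

confident = filter_detections(boxes, scores, labels)
loose = filter_detections(boxes, scores, labels, threshold=0.06)
print(len(confident), len(loose))
```

At the 0.5 threshold nothing survives, which is the expected result for a noise image; lowering the threshold to 0.06 lets the top two boxes through.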

Object detection on an image loaded with OpenCV

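import cv2
import torch
from torchvision.models.detection import ssd300_vgg16, SSD300_VGG16_Weights

# Use an absolute path
image_path = r"D:\Pictures\test.jpg"

# Read the image (OpenCV returns BGR, uint8, HxWxC) and fail early if it is missing
image = cv2.imread(image_path)
if image is None:
    raise FileNotFoundError(f"Could not read image: {image_path}")
image = cv2.resize(image, (300, 300))

# Convert to the RGB, float, [0, 1], CxHxW layout that torchvision models expect
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
image_tensor = torch.from_numpy(rgb).permute(2, 0, 1).float().div(255.0).unsqueeze(0)

# Load the model
model = ssd300_vgg16(weights=SSD300_VGG16_Weights.DEFAULT)
model.eval()

# Run prediction without tracking gradients
with torch.no_grad():
    output = model(image_tensor)

# Parse the detections
for box, score in zip(output[0]['boxes'], output[0]['scores']):
    if score > 0.5:  # confidence threshold
        x1, y1, x2, y2 = map(int, box.tolist())
        cv2.rectangle(image, (x1, y1), (x2, y2), (0, 255, 0), 2)

# Show the result
cv2.imshow("SSD Detection", image)
cv2.waitKey(0)
cv2.destroyAllWindows()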

 


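6️⃣ SSD vs. other object detection algorithms

| Model | Type | Speed (FPS) | Accuracy (mAP, %) | Strengths | Weaknesses |
| --- | --- | --- | --- | --- | --- |
| SSD | Single-stage | 45+ | 🎯 74.3 | Fast, multi-scale detection | Lower accuracy on small objects |
| YOLO | Single-stage | 60+ | 🎯 63.4 | Very fast | Weaker on fine details |
| Faster R-CNN | Two-stage | 5-10 | 🎯 76.4 | High accuracy | Slow |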

7️⃣ Applications of SSD

Autonomous driving 🚗
Face detection 😃
Video surveillance 📹
Industrial inspection 🏭
Smart security 🏢


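8️⃣ Directions for improving SSD

🚀 Improve the backbone (for example ResNet or MobileNet) to strengthen feature extraction.
🚀 Combine with Transformers (as in DETR) to model global context.
🚀 Improve small-object detection (for example with FPN or attention mechanisms).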


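📌 Summary

SSD is a single-stage object detector that is fast and well suited to real-time detection.
SSD uses multi-scale feature maps and default boxes to improve detection accuracy.
Compared with Faster R-CNN, SSD is faster but somewhat weaker on small objects.
It is widely used in autonomous driving, face detection, industrial inspection, and similar fields.

🎯 SSD pairs single-stage speed comparable to YOLO with accuracy close to Faster R-CNN, which makes it a strong choice for real-time object detection! 🚀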

