YOLOv11 Object Detection


This article does not cover downloading Anaconda or configuring a virtual environment; the Python version used here is 3.8.

1. Get the YOLOv11 source project files

Link: GitHub - ultralytics/ultralytics: Ultralytics YOLO11 🚀

Download and extract it directly.

2. Files you need to prepare yourself

In addition to the extracted repository, the following files need to be prepared by yourself (described below).

yolo11n.pt (available from the same link above)

my_train.py

import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/11/yolo11.yaml')
    model.load('yolo11n.pt')  # load pretrained weights
    model.train(data='E:/pytroch/YOLO/yolov11/ultralytics-main-250314/ultralytics-main/my_Data.yaml',
                imgsz=640,
                epochs=100,
                batch=32,
                workers=8,
                #close_mosaic=10,
                device='cpu',
                #device='0',
                optimizer='SGD', # using SGD
                #project='runs/train',
                #name='exp',
                amp=False,
                cache=False,  # set to True on a server to speed up training
                )

Use an absolute path for the data .yaml file.

my_predict图像推理.py (image inference; you can swap in your own trained model later)

from ultralytics import YOLO

# Load a pre-trained YOLO model (adjust model type as needed)
model = YOLO("yolo11n.pt")  # n, s, m, l, x versions available

# Perform object detection on an image
results = model.predict(source="1.jpg")  # Can also use video, directory, URL, etc.

# Display the results
results[0].show()  # Show the first image results
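
If you want the raw detections instead of just a rendered window, the returned Results object also exposes the predicted boxes. A minimal sketch reusing the same yolo11n.pt and 1.jpg as above (boxes.cls, boxes.conf, and boxes.xyxy are attributes of the Ultralytics Results API):

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
results = model.predict(source="1.jpg")

# Iterate over the detections of the first (and only) image
for box in results[0].boxes:
    cls_id = int(box.cls[0])                # class index
    conf = float(box.conf[0])               # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # box corners in pixels
    print(f"{model.names[cls_id]}: {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")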

my_predict实时推理.py (real-time inference; you can swap in your own trained model later)

from ultralytics import YOLO
import cv2
import numpy as np
import time

# Load the model
model = YOLO("yolo11n.pt")

# Open a video file
#video_path = "720p.mp4"  # replace with the path to your video
#cap = cv2.VideoCapture(video_path)
cap = cv2.VideoCapture(0)  # use an external USB camera

# Check that the video source opened successfully
if not cap.isOpened():
    print("Error: Cannot open video source.")
    exit()

# Get video information
fps = int(cap.get(cv2.CAP_PROP_FPS))
print(f"Video FPS: {fps}")

# Get the frame width and height
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Run inference and display the results in real time
while True:
    start_time = time.time()  # start time of this frame

    ret, frame = cap.read()
    if not ret:
        break  # end of video

    # Run inference on the current frame
    results = model(frame)

    for r in results:
        # Render the detection results
        im_array = r.plot()  # image with predictions drawn (BGR format)

        # Check for instance segmentation masks. These are only available with a
        # segmentation model (e.g. yolo11n-seg.pt); a detection-only model such as
        # yolo11n.pt always falls through to the else branch.
        if r.masks is not None:
            # Compute the fraction of the frame covered by detected objects
            total_pixels = frame_width * frame_height  # total number of pixels in the frame
            masks = r.masks.data.cpu().numpy()  # instance segmentation masks
            total_object_pixels = np.sum(masks)  # total number of object pixels
            total_percentage = (total_object_pixels / total_pixels) * 100  # as a percentage

            # Text drawn in the top-left corner of the frame
            text = f"Total percentage: {total_percentage:.2f}%"
        else:
            text = "No objects detected"  # message shown when no masks are available

        font = cv2.FONT_HERSHEY_SIMPLEX
        font_scale = 1.2
        color = (0, 0, 255)
        thickness = 2
        position = (20, 40)  # text position (x, y)
        im_array = cv2.putText(im_array, text, position, font, font_scale, color, thickness)

        # Show the inference result for the current frame
        cv2.imshow("Inference", im_array)

    # Compute the per-frame processing time and frame rate
    elapsed_time = time.time() - start_time
    print(f"Processing time per frame: {elapsed_time:.2f} seconds ({1/elapsed_time:.2f} FPS)")

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
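
If you also want to keep the annotated stream, each plotted frame can be written to a video file with OpenCV's VideoWriter. A minimal sketch built on the same loop as above (the output name annotated.mp4 and the mp4v codec are just example choices):

from ultralytics import YOLO
import cv2

model = YOLO("yolo11n.pt")
cap = cv2.VideoCapture(0)

# Configure the writer from the capture properties; fall back to 30 FPS if the camera reports 0
fps = cap.get(cv2.CAP_PROP_FPS) or 30
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter("annotated.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ret, frame = cap.read()
    if not ret:
        break
    annotated = model(frame)[0].plot()   # BGR frame with predictions drawn
    writer.write(annotated)
    cv2.imshow("Inference", annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
writer.release()
cv2.destroyAllWindows()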

my_Data.yaml

train: my_Dataset\train  # train images
val: my_Dataset\val  # val images
test: my_Dataset\test  # test images

nc: 4

# Classes
names: ['recycle','hazardous','foodWaste','other']

The paths above point to the dataset directory (see below).

The order of names must match the class indices used in the label files.
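
Each .txt label file uses the standard YOLO format: one line per object, containing the class index followed by the normalized center x, center y, width, and height of the box (all values between 0 and 1). For example, a label file for an image with one recycle object (class 0) and one other object (class 3) might look like this (the numbers are made up for illustration):

0 0.512 0.437 0.210 0.345
3 0.188 0.760 0.095 0.120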

my_Dataset holds the dataset (directory structure shown below)

my_Dataset
├── test
│   ├── images
│   │   ├── 1.jpg
│   │   ├── 2.jpg
│   │   └── ...
│   └── labels
│       ├── 1.txt
│       ├── 2.txt
│       └── ...
├── train
│   ├── images
│   │   ├── 3.jpg
│   │   ├── 4.jpg
│   │   └── ...
│   └── labels
│       ├── 3.txt
│       ├── 4.txt
│       └── ...
└── val
    ├── images
    │   ├── 5.jpg
    │   ├── 6.jpg
    │   └── ...
    └── labels
        ├── 5.txt
        ├── 6.txt
        └── ...

3. Model training

Once all the files above are in place, run my_train.py.

About the training parameter settings:

Translated from: Configuration - Ultralytics YOLO Docs

The training settings for YOLO models encompass the various hyperparameters and configurations used during training. These settings influence the model's performance, speed, and accuracy. Key training settings include batch size, learning rate, momentum, and weight decay. In addition, the choice of optimizer, loss function, and training dataset composition affects the training process. Careful tuning and experimentation with these settings are crucial for optimizing performance.

Each entry below lists the argument name, its type, its default value, and what it does:

model (str, None): Specifies the model file for training. Accepts a path to either a .pt pretrained model or a .yaml configuration file. Essential for defining the model structure or initializing weights.

data (str, None): Path to the dataset configuration file (e.g., coco8.yaml). This file contains dataset-specific parameters, including paths to training and validation data, class names, and number of classes.

epochs (int, 100): Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance.

time (float, None): Maximum training time in hours. If set, this overrides the epochs argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.

patience (int, 100): Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.

batch (int, 16): Batch size, with three modes: set as an integer (e.g., batch=16), auto mode for 60% GPU memory utilization (batch=-1), or auto mode with a specified utilization fraction (batch=0.70).

imgsz (int or list, 640): Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity.

save (bool, True): Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.

save_period (int, -1): Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.

cache (bool, False): Enables caching of dataset images in memory (True/ram), on disk (disk), or disables it (False). Improves training speed by reducing disk I/O at the cost of increased memory usage.

device (int or str or list, None): Specifies the computational device(s) for training: a single GPU (device=0), multiple GPUs (device=0,1), CPU (device=cpu), or MPS for Apple silicon (device=mps).

workers (int, 8): Number of worker threads for data loading (per RANK if multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.

project (str, None): Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.

name (str, None): Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.

exist_ok (bool, False): If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.

pretrained (bool, True): Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.

optimizer (str, 'auto'): Choice of optimizer for training. Options include SGD, Adam, AdamW, NAdam, RAdam, RMSProp, etc., or auto for automatic selection based on model configuration. Affects convergence speed and stability.

seed (int, 0): Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.

deterministic (bool, True): Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms.

single_cls (bool, False): Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.

classes (list[int], None): Specifies a list of class IDs to train on. Useful for filtering out and focusing only on certain classes during training.

rect (bool, False): Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.

multi_scale (bool, False): Enables multi-scale training by increasing/decreasing imgsz by up to a factor of 0.5 during training. Trains the model to be more accurate with multiple imgsz during inference.

cos_lr (bool, False): Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.

close_mosaic (int, 10): Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.

resume (bool, False): Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.

amp (bool, True): Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.

fraction (float, 1.0): Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited.

profile (bool, False): Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.

freeze (int or list, None): Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.

lr0 (float, 0.01): Initial learning rate (i.e., SGD=1E-2, Adam=1E-3). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated.

lrf (float, 0.01): Final learning rate as a fraction of the initial rate = (lr0 * lrf), used in conjunction with schedulers to adjust the learning rate over time.

momentum (float, 0.937): Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update.

weight_decay (float, 0.0005): L2 regularization term, penalizing large weights to prevent overfitting.

warmup_epochs (float, 3.0): Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on.

warmup_momentum (float, 0.8): Initial momentum for the warmup phase, gradually adjusting to the set momentum over the warmup period.

warmup_bias_lr (float, 0.1): Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs.

box (float, 7.5): Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates.

cls (float, 0.5): Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components.

dfl (float, 1.5): Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification.

pose (float, 12.0): Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints.

kobj (float, 2.0): Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy.

nbs (int, 64): Nominal batch size for normalization of loss.

overlap_mask (bool, True): Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during the merge.

mask_ratio (int, 4): Downsample ratio for segmentation masks, affecting the resolution of masks used during training.

dropout (float, 0.0): Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.

val (bool, True): Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset.

plots (bool, False): Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression.
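
As a quick illustration of how these settings plug into model.train(), here is a hedged variation of my_train.py that overrides a few of the arguments documented above (the specific values are examples, not recommendations):

from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/11/yolo11.yaml')
    model.load('yolo11n.pt')
    model.train(data='my_Data.yaml',   # use an absolute path, as noted earlier
                imgsz=640,
                epochs=100,
                patience=30,           # stop early after 30 epochs without improvement
                batch=-1,              # auto batch size targeting ~60% GPU memory
                optimizer='AdamW',
                lr0=0.001,             # initial learning rate
                cos_lr=True,           # cosine learning-rate schedule
                project='runs/train',
                name='exp_adamw',
                device=0,
                )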

4. Model inference

The training results are saved in the runs folder.

Replace the model in my_predict实时推理.py with your own trained model and run it.
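
By default, Ultralytics saves the best checkpoint inside the run directory, typically at a path like runs/detect/train/weights/best.pt (the exact folder depends on your project/name settings and on how many runs you have done). A minimal sketch of loading it for prediction:

from ultralytics import YOLO

# Adjust the path to the run folder created on your machine
model = YOLO("runs/detect/train/weights/best.pt")

results = model.predict(source="1.jpg")
results[0].show()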

