YOLOv11 Object Detection


This article does not cover downloading Anaconda or configuring a virtual environment; the Python version used here is 3.8.

1. Get the YOLOv11 source project files

Link: GitHub - ultralytics/ultralytics: Ultralytics YOLO11 🚀

Download and extract it directly.

2. Files you need to prepare yourself

In addition to the extracted repository, the following files need to be prepared by yourself (described below).

yolo11n.pt (available from the same link above)

my_train.py

import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/11/yolo11.yaml')
    model.load('yolo11n.pt')  # load pretrained weights
    model.train(data='E:/pytroch/YOLO/yolov11/ultralytics-main-250314/ultralytics-main/my_Data.yaml',
                imgsz=640,
                epochs=100,
                batch=32,
                workers=8,
                #close_mosaic=10,
                device='cpu',
                #device='0',
                optimizer='SGD', # using SGD
                #project='runs/train',
                #name='exp',
                amp=False,
                cache=False,  # set to True on a server to speed up training
                )

Use an absolute path for the data .yaml file.

my_predict图像推理.py (image inference; you can swap in your own trained model later)

from ultralytics import YOLO

# Load a pre-trained YOLO model (adjust model type as needed)
model = YOLO("yolo11n.pt")  # n, s, m, l, x versions available

# Perform object detection on an image
results = model.predict(source="1.jpg")  # Can also use video, directory, URL, etc.

# Display the results
results[0].show()  # Show the first image results
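
If you want the raw detections instead of just a rendered window, the returned Results object also exposes the predicted boxes. A minimal sketch reusing the same yolo11n.pt and 1.jpg as above (boxes.cls, boxes.conf, and boxes.xyxy are attributes of the Ultralytics Results API):

from ultralytics import YOLO

model = YOLO("yolo11n.pt")
results = model.predict(source="1.jpg")

# Iterate over the detections of the first (and only) image
for box in results[0].boxes:
    cls_id = int(box.cls[0])                # class index
    conf = float(box.conf[0])               # confidence score
    x1, y1, x2, y2 = box.xyxy[0].tolist()   # box corners in pixels
    print(f"{model.names[cls_id]}: {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")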

my_predict实时推理.py (real-time inference; you can swap in your own trained model later)

from ultralytics import YOLO
import cv2
import numpy as np
import time

# Load the model
model = YOLO("yolo11n.pt")

# Open a video file
#video_path = "720p.mp4"  # replace with the path to your video
#cap = cv2.VideoCapture(video_path)
cap = cv2.VideoCapture(0)  # use an external USB camera

# Check that the video source opened successfully
if not cap.isOpened():
    print("Error: Cannot open video source.")
    exit()

# Get video information
fps = int(cap.get(cv2.CAP_PROP_FPS))
print(f"Video FPS: {fps}")

# Get the frame width and height
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Run inference and display the results in real time
while True:
    start_time = time.time()  # start time of this frame

    ret, frame = cap.read()
    if not ret:
        break  # end of video

    # Run inference on the current frame
    results = model(frame)

    for r in results:
        # Render the detection results
        im_array = r.plot()  # image with predictions drawn (BGR format)

        # Check for instance segmentation masks. These are only available with a
        # segmentation model (e.g. yolo11n-seg.pt); a detection-only model such as
        # yolo11n.pt always falls through to the else branch.
        if r.masks is not None:
            # Compute the fraction of the frame covered by detected objects
            total_pixels = frame_width * frame_height  # total number of pixels in the frame
            masks = r.masks.data.cpu().numpy()  # instance segmentation masks
            total_object_pixels = np.sum(masks)  # total number of object pixels
            total_percentage = (total_object_pixels / total_pixels) * 100  # as a percentage

            # Text drawn in the top-left corner of the frame
            text = f"Total percentage: {total_percentage:.2f}%"
        else:
            text = "No objects detected"  # message shown when no masks are available

        font = cv2.FONT_HERSHEY_SIMPLEX
        font_scale = 1.2
        color = (0, 0, 255)
        thickness = 2
        position = (20, 40)  # text position (x, y)
        im_array = cv2.putText(im_array, text, position, font, font_scale, color, thickness)

        # Show the inference result for the current frame
        cv2.imshow("Inference", im_array)

    # Compute the per-frame processing time and frame rate
    elapsed_time = time.time() - start_time
    print(f"Processing time per frame: {elapsed_time:.2f} seconds ({1/elapsed_time:.2f} FPS)")

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
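
If you also want to keep the annotated stream, each plotted frame can be written to a video file with OpenCV's VideoWriter. A minimal sketch built on the same loop as above (the output name annotated.mp4 and the mp4v codec are just example choices):

from ultralytics import YOLO
import cv2

model = YOLO("yolo11n.pt")
cap = cv2.VideoCapture(0)

# Configure the writer from the capture properties; fall back to 30 FPS if the camera reports 0
fps = cap.get(cv2.CAP_PROP_FPS) or 30
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
writer = cv2.VideoWriter("annotated.mp4", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ret, frame = cap.read()
    if not ret:
        break
    annotated = model(frame)[0].plot()   # BGR frame with predictions drawn
    writer.write(annotated)
    cv2.imshow("Inference", annotated)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
writer.release()
cv2.destroyAllWindows()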

my_Data.yaml

train: my_Dataset\train  # train images
val: my_Dataset\val  # val images
test: my_Dataset\test  # test images

nc: 4

# Classes
names: ['recycle','hazardous','foodWaste','other']

The paths above point to the dataset directory (see below).

The order of names must match the class indices used in the label files.
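
Each .txt label file uses the standard YOLO format: one line per object, containing the class index followed by the normalized center x, center y, width, and height of the box (all values between 0 and 1). For example, a label file for an image with one recycle object (class 0) and one other object (class 3) might look like this (the numbers are made up for illustration):

0 0.512 0.437 0.210 0.345
3 0.188 0.760 0.095 0.120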

my_Dataset holds the dataset (directory structure shown below)

my_Dataset
├── test
│   ├── images
│   │   ├── 1.jpg
│   │   ├── 2.jpg
│   │   └── ...
│   └── labels
│       ├── 1.txt
│       ├── 2.txt
│       └── ...
├── train
│   ├── images
│   │   ├── 3.jpg
│   │   ├── 4.jpg
│   │   └── ...
│   └── labels
│       ├── 3.txt
│       ├── 4.txt
│       └── ...
└── val
    ├── images
    │   ├── 5.jpg
    │   ├── 6.jpg
    │   └── ...
    └── labels
        ├── 5.txt
        ├── 6.txt
        └── ...

3. Model training

Once all the files above are in place, run my_train.py.

About the training parameter settings:

Translated from: Configuration - Ultralytics YOLO Docs

The training settings for YOLO models encompass the various hyperparameters and configurations used during training. These settings influence the model's performance, speed, and accuracy. Key training settings include batch size, learning rate, momentum, and weight decay. In addition, the choice of optimizer, loss function, and training dataset composition affects the training process. Careful tuning and experimentation with these settings are crucial for optimizing performance.

Each entry below lists the argument name, its type, its default value, and what it does:

model (str, None): Specifies the model file for training. Accepts a path to either a .pt pretrained model or a .yaml configuration file. Essential for defining the model structure or initializing weights.

data (str, None): Path to the dataset configuration file (e.g., coco8.yaml). This file contains dataset-specific parameters, including paths to training and validation data, class names, and number of classes.

epochs (int, 100): Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance.

time (float, None): Maximum training time in hours. If set, this overrides the epochs argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.

patience (int, 100): Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.

batch (int, 16): Batch size, with three modes: set as an integer (e.g., batch=16), auto mode for 60% GPU memory utilization (batch=-1), or auto mode with a specified utilization fraction (batch=0.70).

imgsz (int or list, 640): Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity.

save (bool, True): Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.

save_period (int, -1): Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.

cache (bool, False): Enables caching of dataset images in memory (True/ram), on disk (disk), or disables it (False). Improves training speed by reducing disk I/O at the cost of increased memory usage.

device (int or str or list, None): Specifies the computational device(s) for training: a single GPU (device=0), multiple GPUs (device=0,1), CPU (device=cpu), or MPS for Apple silicon (device=mps).

workers (int, 8): Number of worker threads for data loading (per RANK if multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.

project (str, None): Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.

name (str, None): Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.

exist_ok (bool, False): If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.

pretrained (bool, True): Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.

optimizer (str, 'auto'): Choice of optimizer for training. Options include SGD, Adam, AdamW, NAdam, RAdam, RMSProp, etc., or auto for automatic selection based on model configuration. Affects convergence speed and stability.

seed (int, 0): Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.

deterministic (bool, True): Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms.

single_cls (bool, False): Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.

classes (list[int], None): Specifies a list of class IDs to train on. Useful for filtering out and focusing only on certain classes during training.

rect (bool, False): Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.

multi_scale (bool, False): Enables multi-scale training by increasing/decreasing imgsz by up to a factor of 0.5 during training. Trains the model to be more accurate with multiple imgsz during inference.

cos_lr (bool, False): Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.

close_mosaic (int, 10): Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.

resume (bool, False): Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.

amp (bool, True): Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.

fraction (float, 1.0): Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited.

profile (bool, False): Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.

freeze (int or list, None): Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.

lr0 (float, 0.01): Initial learning rate (i.e., SGD=1E-2, Adam=1E-3). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated.

lrf (float, 0.01): Final learning rate as a fraction of the initial rate = (lr0 * lrf), used in conjunction with schedulers to adjust the learning rate over time.

momentum (float, 0.937): Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update.

weight_decay (float, 0.0005): L2 regularization term, penalizing large weights to prevent overfitting.

warmup_epochs (float, 3.0): Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on.

warmup_momentum (float, 0.8): Initial momentum for the warmup phase, gradually adjusting to the set momentum over the warmup period.

warmup_bias_lr (float, 0.1): Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs.

box (float, 7.5): Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates.

cls (float, 0.5): Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components.

dfl (float, 1.5): Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification.

pose (float, 12.0): Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints.

kobj (float, 2.0): Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy.

nbs (int, 64): Nominal batch size for normalization of loss.

overlap_mask (bool, True): Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during the merge.

mask_ratio (int, 4): Downsample ratio for segmentation masks, affecting the resolution of masks used during training.

dropout (float, 0.0): Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.

val (bool, True): Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset.

plots (bool, False): Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression.
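
As a quick illustration of how these settings plug into model.train(), here is a hedged variation of my_train.py that overrides a few of the arguments documented above (the specific values are examples, not recommendations):

from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/11/yolo11.yaml')
    model.load('yolo11n.pt')
    model.train(data='my_Data.yaml',   # use an absolute path, as noted earlier
                imgsz=640,
                epochs=100,
                patience=30,           # stop early after 30 epochs without improvement
                batch=-1,              # auto batch size targeting ~60% GPU memory
                optimizer='AdamW',
                lr0=0.001,             # initial learning rate
                cos_lr=True,           # cosine learning-rate schedule
                project='runs/train',
                name='exp_adamw',
                device=0,
                )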

4. Model inference

The training results are saved in the runs folder.

Replace the model in my_predict实时推理.py with your own trained model and run it.
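
By default, Ultralytics saves the best checkpoint inside the run directory, typically at a path like runs/detect/train/weights/best.pt (the exact folder depends on your project/name settings and on how many runs you have done). A minimal sketch of loading it for prediction:

from ultralytics import YOLO

# Adjust the path to the run folder created on your machine
model = YOLO("runs/detect/train/weights/best.pt")

results = model.predict(source="1.jpg")
results[0].show()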

