This article does not cover downloading Anaconda or configuring a virtual environment; the Python version used here is 3.8.
1. Obtain the YOLOv11 source project
Link: GitHub - ultralytics/ultralytics: Ultralytics YOLO11 🚀
Download and extract it directly.
2. Files you need to prepare yourself
The file structure is as follows; the files marked with red boxes need to be added by yourself (see below).
yolo11n.pt (available from the same link above)
my_train.py
import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('ultralytics/cfg/models/11/yolo11.yaml')
    model.load('yolo11n.pt')  # load pretrained weights
    model.train(data='E:/pytroch/YOLO/yolov11/ultralytics-main-250314/ultralytics-main/my_Data.yaml',
                imgsz=640,
                epochs=100,
                batch=32,
                workers=8,
                # close_mosaic=10,
                device='cpu',
                # device='0',
                optimizer='SGD',  # use SGD
                # project='runs/train',
                # name='exp',
                amp=False,
                cache=False,  # can be set to True on a server for faster training
                )
The path passed to data= (the .yaml file) should be an absolute path.
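If you prefer not to hard-code the absolute path, one option is to resolve it at runtime. A minimal sketch, assuming my_Data.yaml sits next to the training script:

from pathlib import Path

# Resolve my_Data.yaml next to this script, then pass the absolute path to model.train(data=...)
data_yaml = str((Path(__file__).parent / 'my_Data.yaml').resolve())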
my_predict图像推理.py (image inference; the model can later be replaced with your own trained one)
from ultralytics import YOLO
# Load a pre-trained YOLO model (adjust model type as needed)
model = YOLO("yolo11n.pt") # n, s, m, l, x versions available
# Perform object detection on an image
results = model.predict(source="1.jpg") # Can also use video, directory, URL, etc.
# Display the results
results[0].show() # Show the first image results
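Besides displaying the image, the returned results also expose the raw detections. A minimal sketch of reading boxes, class names, and confidences from the first result via the Boxes attributes of the Ultralytics Results object:

# Inspect the detections of the first result
boxes = results[0].boxes
for xyxy, cls_id, conf in zip(boxes.xyxy, boxes.cls, boxes.conf):
    # xyxy: box corners in pixels, cls_id: class index, conf: confidence score
    print(results[0].names[int(cls_id)], f"{float(conf):.2f}", [round(v, 1) for v in xyxy.tolist()])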
my_predict实时推理.py (real-time inference; the model can later be replaced with your own trained one)
from ultralytics import YOLO
import cv2
import numpy as np
import time

# Load the model
model = YOLO("yolo11n.pt")

# Open a video file
# video_path = "720p.mp4"  # replace with your own video path
# cap = cv2.VideoCapture(video_path)
cap = cv2.VideoCapture(0)  # use an external USB camera

# Check whether the video source opened successfully
if not cap.isOpened():
    print("Error: Cannot open video file.")
    exit()

# Get video information
fps = int(cap.get(cv2.CAP_PROP_FPS))
print(f"Video FPS: {fps}")

# Get frame width and height
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Run inference in real time and display the results
while True:
    start_time = time.time()  # record when processing of this frame starts
    ret, frame = cap.read()
    if not ret:
        break  # end of video

    # Run the model on the current frame
    results = model(frame)
    for r in results:
        # Draw the detection results onto the image
        im_array = r.plot()  # annotated image (BGR format)
        # Check for instance segmentation masks
        # (r.masks is only populated by segmentation models such as yolo11n-seg.pt;
        #  with the detection weights yolo11n.pt it is always None)
        if r.masks is not None:
            # Compute the fraction of the frame covered by all masks
            total_pixels = frame_width * frame_height  # total number of pixels in the frame
            masks = r.masks.data.cpu().numpy()  # instance segmentation masks
            total_object_pixels = np.sum(masks)  # total number of object pixels
            total_percentage = (total_object_pixels / total_pixels) * 100  # as a percentage
            # Text to draw in the top-left corner of the image
            text = f"Total percentage: {total_percentage:.2f}%"
        else:
            text = "No objects detected"  # message when nothing is detected
        font = cv2.FONT_HERSHEY_SIMPLEX
        font_scale = 1.2
        color = (0, 0, 255)
        thickness = 2
        position = (20, 40)  # text position (x, y)
        im_array = cv2.putText(im_array, text, position, font, font_scale, color, thickness)
        # Display the inference result for the current frame
        cv2.imshow("Inference", im_array)

    # Compute per-frame processing time
    elapsed_time = time.time() - start_time
    print(f"Processing time per frame: {elapsed_time:.2f} seconds ({1/elapsed_time:.2f} FPS)")

    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources
cap.release()
cv2.destroyAllWindows()
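Note that the r.masks branch above only triggers with a segmentation checkpoint; the detection weights yolo11n.pt never return masks. A minimal sketch of the same coverage calculation on a single image, assuming the segmentation variant yolo11n-seg.pt:

from ultralytics import YOLO
import numpy as np

# Segmentation variant of the same model family; its results carry masks
seg_model = YOLO("yolo11n-seg.pt")
results = seg_model.predict(source="1.jpg")

r = results[0]
if r.masks is not None:
    masks = r.masks.data.cpu().numpy()  # shape: (num_objects, H, W), binary masks
    h, w = masks.shape[1], masks.shape[2]
    # Total mask pixels across all detected objects, relative to the mask image area
    coverage = 100.0 * masks.sum() / (h * w)
    print(f"Mask coverage: {coverage:.2f}%")
else:
    print("No masks returned")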
my_Data.yaml
train: my_Dataset\train  # train images
val: my_Dataset\val  # val images
test: my_Dataset\test  # test images
nc: 4
# Classes
names: ['recycle','hazardous','foodWaste','other']
The paths point to the dataset directories (see below).
The class order in names must match the class indices used in the label files.
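For reference, each label .txt file follows the standard YOLO format: one line per object, class_id x_center y_center width height, with coordinates normalized to [0, 1]. With the names list above, class 0 is recycle, class 1 is hazardous, and so on. An illustrative line (values made up) describing one 'recycle' object roughly centred in the image:

0 0.512 0.448 0.300 0.250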
my_Dataset holds the dataset (the directory structure is as follows):
my_Dataset
----> test
      ----> images
            1.jpg
            2.jpg
            ...
      ----> labels
            1.txt
            2.txt
            ...
----> train
      ----> images
            3.jpg
            4.jpg
            ...
      ----> labels
            3.txt
            4.txt
            ...
----> val
      ----> images
            5.jpg
            6.jpg
            ...
      ----> labels
            5.txt
            6.txt
            ...
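If your images and labels start out in a single flat folder, a small script can produce this layout. A sketch, assuming hypothetical source folders all_images/ and all_labels/ and an 80/10/10 split:

import random
import shutil
from pathlib import Path

random.seed(0)

src_images = Path("all_images")   # hypothetical flat folder of .jpg files
src_labels = Path("all_labels")   # hypothetical flat folder of matching .txt files
dst_root = Path("my_Dataset")

images = sorted(src_images.glob("*.jpg"))
random.shuffle(images)

n = len(images)
splits = {
    "train": images[: int(0.8 * n)],
    "val": images[int(0.8 * n): int(0.9 * n)],
    "test": images[int(0.9 * n):],
}

for split, files in splits.items():
    (dst_root / split / "images").mkdir(parents=True, exist_ok=True)
    (dst_root / split / "labels").mkdir(parents=True, exist_ok=True)
    for img in files:
        # Copy the image and its matching label file into the split folders
        shutil.copy(img, dst_root / split / "images" / img.name)
        label = src_labels / (img.stem + ".txt")
        if label.exists():
            shutil.copy(label, dst_root / split / "labels" / label.name)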
3. Model training
Run my_train.py once all of the files above are in place.
About the training parameter settings:
Translated from: Configuration - Ultralytics YOLO Docs
The training settings for YOLO models encompass the various hyperparameters and configurations used during training. These settings influence the model's performance, speed, and accuracy. Key training settings include batch size, learning rate, momentum, and weight decay. In addition, the choice of optimizer, loss function, and the composition of the training dataset also affect the training process. Careful tuning and experimentation with these settings are crucial for optimizing performance.
| Argument | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | `None` | Specifies the model file for training. Accepts a path to either a `.pt` pretrained model or a `.yaml` configuration file. Essential for defining the model structure or initializing weights. |
| `data` | `str` | `None` | Path to the dataset configuration file (e.g., `coco8.yaml`). This file contains dataset-specific parameters, including paths to training and validation data, class names, and the number of classes. |
| `epochs` | `int` | `100` | Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance. |
| `time` | `float` | `None` | Maximum training time in hours. If set, this overrides the `epochs` argument, allowing training to stop automatically after the specified duration. Useful for time-constrained training scenarios. |
| `patience` | `int` | `100` | Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus. |
| `batch` | `int` | `16` | Batch size, with three modes: set as an integer (e.g., `batch=16`), auto mode targeting 60% GPU memory utilization (`batch=-1`), or auto mode with a specified utilization fraction (`batch=0.70`). |
| `imgsz` | `int` or `list` | `640` | Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity. |
| `save` | `bool` | `True` | Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment. |
| `save_period` | `int` | `-1` | Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions. |
| `cache` | `bool` | `False` | Enables caching of dataset images in memory (`True`/`ram`), on disk (`disk`), or disables it (`False`). Improves training speed by reducing disk I/O at the cost of increased memory usage. |
| `device` | `int` or `str` or `list` | `None` | Specifies the computational device(s) for training: a single GPU (`device=0`), multiple GPUs (`device=0,1`), CPU (`device=cpu`), or MPS for Apple silicon (`device=mps`). |
| `workers` | `int` | `8` | Number of worker threads for data loading (per `RANK` if multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups. |
| `project` | `str` | `None` | Name of the project directory where training outputs are saved. Allows for organized storage of different experiments. |
| `name` | `str` | `None` | Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored. |
| `exist_ok` | `bool` | `False` | If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs. |
| `pretrained` | `bool` | `True` | Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance. |
| `optimizer` | `str` | `'auto'` | Choice of optimizer for training. Options include `SGD`, `Adam`, `AdamW`, `NAdam`, `RAdam`, `RMSProp`, etc., or `auto` for automatic selection based on model configuration. Affects convergence speed and stability. |
| `seed` | `int` | `0` | Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations. |
| `deterministic` | `bool` | `True` | Forces deterministic algorithm use, ensuring reproducibility but possibly affecting performance and speed due to the restriction on non-deterministic algorithms. |
| `single_cls` | `bool` | `False` | Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification. |
| `classes` | `list[int]` | `None` | Specifies a list of class IDs to train on. Useful for filtering out and focusing only on certain classes during training. |
| `rect` | `bool` | `False` | Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy. |
| `multi_scale` | `bool` | `False` | Enables multi-scale training by increasing/decreasing `imgsz` by up to a factor of 0.5 during training. Trains the model with multiple `imgsz` values to make it more accurate during inference. |
| `cos_lr` | `bool` | `False` | Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence. |
| `close_mosaic` | `int` | `10` | Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature. |
| `resume` | `bool` | `False` | Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly. |
| `amp` | `bool` | `True` | Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy. |
| `fraction` | `float` | `1.0` | Specifies the fraction of the dataset to use for training. Allows training on a subset of the full dataset, useful for experiments or when resources are limited. |
| `profile` | `bool` | `False` | Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment. |
| `freeze` | `int` or `list` | `None` | Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning. |
| `lr0` | `float` | `0.01` | Initial learning rate (i.e. `SGD=1E-2`, `Adam=1E-3`). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated. |
| `lrf` | `float` | `0.01` | Final learning rate as a fraction of the initial rate = (`lr0 * lrf`), used in conjunction with schedulers to adjust the learning rate over time. |
| `momentum` | `float` | `0.937` | Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update. |
| `weight_decay` | `float` | `0.0005` | L2 regularization term, penalizing large weights to prevent overfitting. |
| `warmup_epochs` | `float` | `3.0` | Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on. |
| `warmup_momentum` | `float` | `0.8` | Initial momentum for the warmup phase, gradually adjusting to the set momentum over the warmup period. |
| `warmup_bias_lr` | `float` | `0.1` | Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs. |
| `box` | `float` | `7.5` | Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates. |
| `cls` | `float` | `0.5` | Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components. |
| `dfl` | `float` | `1.5` | Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification. |
| `pose` | `float` | `12.0` | Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints. |
| `kobj` | `float` | `2.0` | Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy. |
| `nbs` | `int` | `64` | Nominal batch size for normalization of loss. |
| `overlap_mask` | `bool` | `True` | Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during merge. |
| `mask_ratio` | `int` | `4` | Downsample ratio for segmentation masks, affecting the resolution of masks used during training. |
| `dropout` | `float` | `0.0` | Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training. |
| `val` | `bool` | `True` | Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset. |
| `plots` | `bool` | `False` | Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression. |
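For example, a GPU run that uses several of the parameters above might look like the following. This is only a sketch; the paths and values are illustrative, not recommendations:

from ultralytics import YOLO

if __name__ == '__main__':
    model = YOLO('yolo11n.pt')
    model.train(
        data='my_Data.yaml',      # illustrative path to the dataset yaml
        epochs=100,
        patience=50,              # early-stop if val metrics stall for 50 epochs
        batch=-1,                 # auto batch size targeting ~60% GPU memory
        imgsz=640,
        device=0,                 # single GPU
        optimizer='auto',
        cos_lr=True,              # cosine learning-rate schedule
        close_mosaic=10,          # disable mosaic augmentation for the last 10 epochs
        project='runs/train',
        name='exp',
    )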
4. Model inference
The training results are saved in the runs folder.
Replace the model in my_predict实时推理.py with your own trained model and run it.
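If the default output directory was not overridden, the weights typically land under a path like runs/detect/train/weights/best.pt (the exact subfolder name depends on your run). A minimal sketch of swapping it in:

from ultralytics import YOLO

# Path is illustrative; check your own runs/ folder for the actual run name
model = YOLO("runs/detect/train/weights/best.pt")

results = model.predict(source="1.jpg")
results[0].show()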