YOLOv5 Setup, Training, and Inference on Huawei Ascend 910B


YOLOv5 Setup

Create a new virtual environment with conda and activate it

conda create -n openmmlab python=3.8 -y
conda activate openmmlab

Install PyTorch

CPU:

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 cpuonly -c pytorch

NVIDIA GPU with CUDA >= 11.6:

conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia

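Other PyTorch versions have not been tested.

Download mmyolo-0.5.0 and change into the mmyolo-0.5.0 directory

cd path  # replace path with the path to your own mmyolo-0.5.0 directory
# -------------- once inside the directory, run the commands below as they are --------------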
pip install chardet
pip install -U openmim
mim install -r requirements/mminstall.txt
# Install albumentations
pip install -r requirements/albu.txt
# Install MMYOLO in editable mode, so later edits to the code in this folder take effect in the environment without reinstalling
mim install -v -e .
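
To confirm that the installation steps above took effect, a quick import check can help. The sketch below is only an illustration: `missing_packages` is a hypothetical helper, and the import names in the example call (torch, torchvision, mmcv, mmdet, mmyolo) are assumptions about what the commands above install.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # Assumed top-level import names for the packages installed above.
    wanted = ["torch", "torchvision", "mmcv", "mmdet", "mmyolo"]
    print("missing:", missing_packages(wanted))
```

An empty list suggests that every package is importable from the active conda environment.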

YOLOv5 Training

Change into the yolov5 directory

cd [path_to_yolov5]

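Dataset format

Create a new folder named dataset under the yolov5 directory and place the dataset in it with the following layout:

road                # dataset name
├── images
│   ├── train
│   │   └── xx.jpg
│   └── val
│       └── xx.jpg
└── labels
    ├── train
    │   └── xx.txt
    └── val
        └── xx.txt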
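
The layout above pairs each image with a label file of the same stem. A minimal sketch for spotting images that have no label file follows; `images_without_labels` and the suffix list are hypothetical and not part of YOLOv5. An image without labels is not necessarily an error (YOLOv5 treats it as a background image), so the result is only a hint.

```python
from pathlib import Path

# Image suffixes considered by this sketch (an assumption, adjust as needed).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def images_without_labels(root, split):
    """List image names in images/<split> that lack labels/<split>/<stem>.txt."""
    root = Path(root)
    img_dir = root / "images" / split
    lbl_dir = root / "labels" / split
    missing = []
    for img in sorted(img_dir.iterdir()):
        has_label = (lbl_dir / (img.stem + ".txt")).exists()
        if img.suffix.lower() in IMAGE_SUFFIXES and not has_label:
            missing.append(img.name)
    return missing
```

For example, `images_without_labels("dataset/road", "train")` returns the training images that have no matching .txt file.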

Create road.yaml in the yolov5/data folder with the following content:

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: /opt/data/private/zyx/yolov5/dataset/road8802  # dataset root dir
train: images/train  # train images (relative to 'path')
val: images/val  # val images (relative to 'path')
test:  # test images (optional)

# Classes
nc: 5  # number of classes
names: ['VT', 'VC', 'RS', 'VR','LN']  # class names

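Where:

path: the dataset root directory
train: path of the training images, relative to path
val: path of the validation images, relative to path
nc: the number of classes; this dataset has five classes, so nc is 5.
names: the class names, in class-ID order.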
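
A common slip in this file is an nc value that does not match the length of names. The check below is a small sketch under the assumption that the YAML has already been loaded into a dict (for instance with yaml.safe_load); `check_dataset_cfg` is a hypothetical helper, not a YOLOv5 function.

```python
def check_dataset_cfg(cfg):
    """Return a list of problems found in a YOLOv5-style dataset config dict."""
    problems = []
    for key in ("path", "train", "val", "nc", "names"):
        if cfg.get(key) in (None, ""):
            problems.append(f"missing key: {key}")
    names = cfg.get("names") or []
    nc = cfg.get("nc")
    if nc is not None and nc != len(names):
        problems.append(f"nc is {nc} but names has {len(names)} entries")
    return problems


if __name__ == "__main__":
    cfg = {
        "path": "/opt/data/private/zyx/yolov5/dataset/road8802",
        "train": "images/train",
        "val": "images/val",
        "nc": 5,
        "names": ["VT", "VC", "RS", "VR", "LN"],
    }
    print(check_dataset_cfg(cfg))
```

An empty list means the basic keys are present and nc agrees with names.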

Training

python train.py --weights yolov5s.pt --data data/road.yaml --epochs 200 --workers 1 --batch-size 64

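Inference

After training finishes, copy the model (best.pt) from the weights folder of the training run (runs/train/exp/weights by default in recent YOLOv5 versions) into the yolov5 folder. The command below runs detection with best.onnx, so run the export step in the next section first, or pass --weights best.pt to test the PyTorch weights directly.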

python detect.py --weights best.onnx --source ../yolov5/dataset/road7190/images/val --data data/road.yaml

pt->onnx

python export.py --weights best.pt --data data/road.yaml --include onnx

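When the export finishes, best.onnx appears in the same directory as best.pt

Converting to an OM file with ATC

Create a Python 3.7.5 virtual environment and install the Python dependencies

conda create -n v3onnx python=3.7.5
conda activate v3onnx

pip3 install attrs numpy decorator sympy cffi pyyaml pathlib2 psutil protobuf scipy requests absl-py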

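Download the two CANN packages

Download two CANN packages from the Ascend community website: the toolkit package and the nnrt package

Be sure to download the versions that match the CANN version used for inference

This is my image: model_convert_cann7.0_aarch64_910b_py310:v6.0

The image name shows that CANN is version 7.0

Download page: Ascend Community, Resource Download Center, community edition downloads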

Ascend-cann-nnrt_7.0.1_linux-x86_64.run

Ascend-cann-toolkit_7.0.1_linux-x86_64.run

Install the packages

Copy the downloaded packages into the virtual environment and start the installation:

# Make the installers executable
chmod +x Ascend-cann-nnrt_7.0.1_linux-x86_64.run
chmod +x Ascend-cann-toolkit_7.0.1_linux-x86_64.run

# Check the integrity of the installers
./Ascend-cann-nnrt_7.0.1_linux-x86_64.run --check
./Ascend-cann-toolkit_7.0.1_linux-x86_64.run --check

# Install
./Ascend-cann-nnrt_7.0.1_linux-x86_64.run --install
./Ascend-cann-toolkit_7.0.1_linux-x86_64.run --install

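Configure the ATC runtime environment

Installing the CANN packages is not enough on its own: until the environment variables are set, the atc command cannot be used, so the environment has to be configured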

vi ~/.bashrc
 
export ASCEND_TOOLKIT_HOME=/usr/local/Ascend/ascend-toolkit/latest
export LD_LIBRARY_PATH=${ASCEND_TOOLKIT_HOME}/lib64:${ASCEND_TOOLKIT_HOME}/lib64/plugin/opskernel:${ASCEND_TOOLKIT_HOME}/lib64/plugin/nnengine:/usr/local/Ascend/ascend-toolkit/latest/x86_64-linux/devlib/:$LD_LIBRARY_PATH
export PYTHONPATH=${ASCEND_TOOLKIT_HOME}/python/site-packages:${ASCEND_TOOLKIT_HOME}/opp/op_impl/built-in/ai_core/tbe:$PYTHONPATH
export PATH=${ASCEND_TOOLKIT_HOME}/bin:${ASCEND_TOOLKIT_HOME}/compiler/ccec_compiler/bin:$PATH
export ASCEND_AICPU_PATH=${ASCEND_TOOLKIT_HOME}
export ASCEND_OPP_PATH=${ASCEND_TOOLKIT_HOME}/opp
export TOOLCHAIN_HOME=${ASCEND_TOOLKIT_HOME}/toolkit
export ASCEND_HOME_PATH=${ASCEND_TOOLKIT_HOME}
 
source ~/.bashrc

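After adding the export lines to ~/.bashrc in vi, press Esc and type :wq to save and exit; source ~/.bashrc then loads them into the current shell

The atc command is then available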
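
If atc still cannot be found, it may help to check whether the variables actually reached the current shell. The following is a minimal sketch: `unset_vars` is a hypothetical helper, and the three variables it checks are simply a subset of the exports listed above.

```python
import os
import shutil

# A subset of the variables exported in ~/.bashrc above.
REQUIRED_VARS = ("ASCEND_TOOLKIT_HOME", "ASCEND_OPP_PATH", "ASCEND_HOME_PATH")


def unset_vars(env=None):
    """Return the required variables that are missing or empty in `env`."""
    env = os.environ if env is None else env
    return [v for v in REQUIRED_VARS if not env.get(v)]


if __name__ == "__main__":
    print("unset variables:", unset_vars())
    # shutil.which returns None when atc is not on PATH.
    print("atc found at:", shutil.which("atc"))
```

Run it in the same shell where source ~/.bashrc was executed; a new terminal picks up the variables automatically.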

Using the ATC command

atc --model=best.onnx --framework=5 --output=best --input_format=NCHW --soc_version=Ascend910B2

The value of --soc_version must be taken from the output of npu-smi info. It has to match exactly, including the trailing digit
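
To avoid typing the chip name by hand, the command can be assembled in a small script. This is a sketch only: `build_atc_command` is a hypothetical helper, and it assumes that npu-smi info reports a chip name such as 910B2 which is then prefixed with Ascend, consistent with the Ascend910B2 value used above. Only the flags already shown in the command are used.

```python
def build_atc_command(onnx_path, output_name, chip_name):
    """Build the atc argument list for an ONNX model (framework=5)."""
    # Assumption: the chip name from npu-smi info lacks the "Ascend" prefix.
    if chip_name.startswith("Ascend"):
        soc_version = chip_name
    else:
        soc_version = "Ascend" + chip_name
    return [
        "atc",
        f"--model={onnx_path}",
        "--framework=5",
        f"--output={output_name}",
        "--input_format=NCHW",
        f"--soc_version={soc_version}",
    ]


if __name__ == "__main__":
    print(" ".join(build_atc_command("best.onnx", "best", "910B2")))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where atc is installed.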

This generates best.om, which can then be used for inference on the Ascend 910B

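Inference on the 910B

Writing the code

Create a new folder named om_infer:

In it, create detect.py, det_utils.py, and labels.txt; copy the converted om model into the folder; the try folder holds the original images to run inference on

Fill in labels.txt

One class per line, in the same order as names in road.yaml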
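
Because the class ID is just the line index, labels.txt can be generated from the same list used for names in road.yaml. The sketch below uses two hypothetical helpers, `write_labels` and `read_labels`; the second mirrors how a line-indexed label file is typically read back.

```python
from pathlib import Path


def write_labels(names, path="labels.txt"):
    """Write one class name per line, in class-ID order."""
    Path(path).write_text("\n".join(names) + "\n", encoding="utf-8")


def read_labels(path="labels.txt"):
    """Read the file back into a {class_id: name} dict, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    names = [line.strip() for line in lines if line.strip()]
    return dict(enumerate(names))


if __name__ == "__main__":
    write_labels(["VT", "VC", "RS", "VR", "LN"])
    print(read_labels())
```

Keeping a single source for the class list avoids a mismatch between training IDs and the names drawn at inference time.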

Write detect.py (detection) and det_utils.py (post-processing)

det_utils.py

import cv2
import numpy as np



def letterbox(img, new_shape=(640, 640), color=(114, 114, 114), auto=False, scaleFill=False, scaleup=True):
    # Resize image to a 32-pixel-multiple rectangle https://github.com/ultralytics/yolov3/issues/232
    shape = img.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better test mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, 64), np.mod(dh, 64)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    img = cv2.copyMakeBorder(img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return img, ratio, (dw, dh)


def xyxy2xywh(x):
    # Convert nx4 boxes from [x1, y1, x2, y2] to [x, y, w, h] where xy1=top-left, xy2=bottom-right
    y = np.copy(x)
    #y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = (x[:, 0] + x[:, 2]) / 2  # x center
    y[:, 1] = (x[:, 1] + x[:, 3]) / 2  # y center
    y[:, 2] = x[:, 2] - x[:, 0]  # width
    y[:, 3] = x[:, 3] - x[:, 1]  # height
    return y




def xywh2xyxy(x):
    # Convert nx4 boxes from [x, y, w, h] to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    y = np.copy(x)
    #y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = x[:, 0] - x[:, 2] / 2  # top left x
    y[:, 1] = x[:, 1] - x[:, 3] / 2  # top left y
    y[:, 2] = x[:, 0] + x[:, 2] / 2  # bottom right x
    y[:, 3] = x[:, 1] + x[:, 3] / 2  # bottom right y
    return y


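def numpy_nms(predictions, conf_thres=0.25, iou_thres=0.45):
    """Filter raw predictions by confidence, then apply class-agnostic NMS."""
    # Drop the batch dimension if present
    if predictions.ndim == 3:
        predictions = predictions.squeeze(0)

    # Discard low-confidence predictions
    # (column 4 is the objectness score; official YOLOv5 additionally multiplies it by the class score)
    mask = predictions[:, 4] >= conf_thres
    predictions = predictions[mask]

    if predictions.shape[0] == 0:
        return np.empty((0, 6))

    # Convert boxes from [x, y, w, h] to [x1, y1, x2, y2]
    boxes = xywh2xyxy(predictions[:, :4])

    # Class ID = index of the highest class score
    class_scores = predictions[:, 5:]
    class_ids = np.argmax(class_scores, axis=1).astype(int)

    # Assemble the result rows [x1, y1, x2, y2, conf, class_id]
    # (the array is float, so class_id is stored as a float and cast back to int by the caller)
    detections = np.concatenate([
        boxes,
        predictions[:, 4:5],  # confidence
        class_ids[:, None]
    ], axis=1)

    return _nms_boxes(detections, iou_thres)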




def _nms_boxes(detections, iou_threshold):
    """Core greedy NMS: keep the best-scoring box and drop boxes that overlap it by more than iou_threshold."""
    if detections.size == 0:
        return np.empty((0, 6))

    x1 = detections[:, 0]
    y1 = detections[:, 1]
    x2 = detections[:, 2]
    y2 = detections[:, 3]
    scores = detections[:, 4]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= iou_threshold)[0]
        order = order[inds + 1]

    return detections[keep]

def scale_coords(img1_shape, coords, img0_shape, ratio_pad=None):
    # Rescale coords (xyxy) from img1_shape to img0_shape
    if ratio_pad is None:  # calculate from img0_shape
        gain = min(img1_shape[0] / img0_shape[0], img1_shape[1] / img0_shape[1])  # gain  = old / new
        pad = (img1_shape[1] - img0_shape[1] * gain) / 2, (img1_shape[0] - img0_shape[0] * gain) / 2  # wh padding
    else:
        gain = ratio_pad[0][0]
        pad = ratio_pad[1]

    coords[:, [0, 2]] -= pad[0]  # x padding
    coords[:, [1, 3]] -= pad[1]  # y padding
    coords[:, :4] /= gain
    clip_coords(coords, img0_shape)

    return coords

def clip_coords(boxes, shape):
    # Clip bounding xyxy bounding boxes to image shape (height, width)
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, shape[1])
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, shape[0])
    return boxes


def nms(box_out, conf_thres=0.4, iou_thres=0.5):
    # Thin wrapper around the numpy implementation, so no torch dependency is needed
    return numpy_nms(box_out, conf_thres=conf_thres, iou_thres=iou_thres)
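
To make the suppression step in _nms_boxes easier to follow, here is a pure-Python toy version with three boxes. It is a sketch for illustration, not a replacement for the numpy code: `iou` and `greedy_nms` are hypothetical names, and they use the same +1 pixel convention for widths and heights.

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2, ...] boxes, using the +1 pixel convention."""
    w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + 1)
    h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + 1)
    inter = w * h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def greedy_nms(dets, iou_threshold):
    """dets: rows of [x1, y1, x2, y2, score, class_id]; returns kept rows, best score first."""
    remaining = sorted(dets, key=lambda d: d[4], reverse=True)
    keep = []
    while remaining:
        best = remaining.pop(0)
        keep.append(best)
        # Keep only boxes whose overlap with the chosen box is small enough.
        remaining = [d for d in remaining if iou(best, d) <= iou_threshold]
    return keep


if __name__ == "__main__":
    dets = [
        [0, 0, 99, 99, 0.9, 0],        # kept: highest score
        [10, 10, 109, 109, 0.8, 0],    # IoU with the first box is 8100/11900, about 0.68
        [200, 200, 299, 299, 0.7, 1],  # kept: no overlap with the first box
    ]
    print(greedy_nms(dets, 0.5))
```

With a threshold of 0.5 the second box is suppressed and the other two survive. Note that, as in the numpy code, suppression ignores the class ID.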

detect.py


#python detect.py --input ./test --img_output ./res_images --json_output ./res_json
import json
import cv2
import numpy as np
import glob
import os
from det_utils import letterbox, nms, scale_coords
from ais_bench.infer.interface import InferSession
from time import time
import argparse

model_path = "./best.om"  # model file in om format
label_path = './labels.txt'  # class labels, one per line

def calculate_iou(box1, boxes):
    """Compute the IoU between a single box (shape 1x4) and several boxes (shape nx4)."""
    # Intersection rectangle
    x1 = np.maximum(box1[0, 0], boxes[:, 0])
    y1 = np.maximum(box1[0, 1], boxes[:, 1])
    x2 = np.minimum(box1[0, 2], boxes[:, 2])
    y2 = np.minimum(box1[0, 3], boxes[:, 3])

    intersection = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)

    # Areas
    area_box1 = (box1[0, 2] - box1[0, 0]) * (box1[0, 3] - box1[0, 1])
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])

    union = area_box1 + area_boxes - intersection
    return intersection / union

def preprocess_image(image, cfg, bgr2rgb=True):  # image preprocessing
    # Input images vary in size, so letterbox them to the model input shape
    img, scale_ratio, pad_size = letterbox(image, new_shape=cfg['input_shape'])
    if bgr2rgb:
        img = img[:, :, ::-1]
    img = img.transpose(2, 0, 1)  # HWC to CHW
    img = np.ascontiguousarray(img, dtype=np.float32)  # contiguous float32 array, as expected by the inference call
    return img, scale_ratio, pad_size


def draw_bbox(bbox, img0, wt, names):
    """Draw predicted boxes, one color per class."""
    # BGR colors for the five classes (adjust as needed)
    color_palette = [
        (0, 255, 0),  # green - class 0
        (255, 0, 0),  # blue - class 1
        (0, 0, 255),  # red - class 2
        (0, 255, 255),  # yellow - class 3
        (255, 0, 255)  # magenta - class 4
    ]
    # Use the same font settings for measuring and drawing so the background fits the text
    font, font_scale, font_thickness = cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1

    for idx, class_id in enumerate(bbox[:, 5]):
        # Make sure the class ID is an integer
        class_id = int(round(float(class_id)))
        if class_id >= len(color_palette) or class_id < 0:
            print(f"Warning: invalid class ID {class_id}, using the default color")
            color = (255, 255, 255)  # white as the default
        else:
            color = color_palette[class_id]  # pick the color by class

        if float(bbox[idx][4]) < 0.05:
            continue

        # Draw the bounding box
        x1, y1 = int(bbox[idx][0]), int(bbox[idx][1])
        x2, y2 = int(bbox[idx][2]), int(bbox[idx][3])
        img0 = cv2.rectangle(img0, (x1, y1), (x2, y2), color, wt)

        # Class label: white text on a background filled with the class color
        label = f"{names.get(class_id, 'unknown')} {bbox[idx][4]:.2f}"
        (tw, th), _ = cv2.getTextSize(label, font, font_scale, font_thickness)

        # Label background, sized from the measured text
        img0 = cv2.rectangle(img0, (x1, y1 - th - 8), (x1 + tw, y1), color, -1)
        # Label text
        img0 = cv2.putText(img0, label, (x1, y1 - 5),
                           font, font_scale, (255, 255, 255), font_thickness)

    return img0

def get_labels_from_txt(path):
    """Read class labels from a txt file into a {class_id: name} dict."""
    labels_dict = dict()
    with open(path) as f:
        for cat_id, label in enumerate(f.readlines()):
            labels_dict[cat_id] = label.strip()
    return labels_dict


def detect_img(model, detect_path, img_output_dir, json_output_dir, conf_thres=0.4, iou_thres=0.5):
    raw_img = cv2.imread(detect_path)  # load the original image
    if raw_img is None:  # cv2.imread returns None when the file cannot be read
        print(f"Warning: could not read {detect_path}, skipping")
        return
    img_height, img_width = raw_img.shape[:2]  # original image size
    labels = get_labels_from_txt(label_path)
    # Preprocessing
    cfg = {
        'conf_thres': conf_thres,  # confidence threshold; the lower it is, the more boxes are kept
        'iou_thres': iou_thres,  # IoU threshold for NMS; boxes overlapping a kept box by more than this are suppressed
        'input_shape': [640, 640],  # model input size
    }
    img, scale_ratio, pad_size = preprocess_image(raw_img, cfg)
    img = img / 255.0  # training scaled pixel values from 0~255 to 0~1, so inference must do the same

    # Inference
    t1 = time()
    output = model.infer([img])[0]
    # NMS implemented in numpy
    boxout = nms(output, conf_thres=cfg["conf_thres"], iou_thres=cfg["iou_thres"])
    if len(boxout) > 0:
        pred_all = boxout[0] if isinstance(boxout, list) else boxout
        # Map box coordinates back to the original image (modifies pred_all in place)
        scale_coords(cfg['input_shape'], pred_all[:, :4], raw_img.shape, ratio_pad=(scale_ratio, pad_size))
    else:
        pred_all = np.empty((0, 6))

    t2 = time()
    print("detect time: %fs" % (t2 - t1))
    # Prepare the JSON structure
    json_data = {
        "file_name": os.path.basename(detect_path),
        "detections": [],
        "image_size": {
            "width": img_width,
            "height": img_height
        }
    }
    # Draw the results
    if pred_all.size > 0:
        draw_bbox(pred_all, raw_img, 2, labels)
        for det in pred_all:
            x_min, y_min, x_max, y_max = map(int, det[:4])
            confidence = float(det[4])
            class_id = int(det[5])

            json_data["detections"].append({
                "disease_class": labels.get(class_id, "unknown"),
                "confidence": round(confidence, 4),
                "bbox": {
                    "x_min": x_min,
                    "y_min": y_min,
                    "x_max": x_max,
                    "y_max": y_max
                }
            })

    # Save the JSON to its own directory
    # (written for every image, with an empty detections list when nothing is found)
    os.makedirs(json_output_dir, exist_ok=True)
    json_filename = os.path.basename(detect_path).rsplit('.', 1)[0] + ".json"
    json_path = os.path.join(json_output_dir, json_filename)
    with open(json_path, 'w') as f:
        json.dump(json_data, f, indent=2)

    # Save the image to the image directory
    os.makedirs(img_output_dir, exist_ok=True)
    img_filename = "res_" + os.path.basename(detect_path)
    img_output_path = os.path.join(img_output_dir, img_filename)
    cv2.imwrite(img_output_path, raw_img, [int(cv2.IMWRITE_JPEG_QUALITY), 95])


def batch_detect(model, input_dir, img_output_dir, json_output_dir, conf_thres=0.4, iou_thres=0.5):
    """Run inference on every image in input_dir, passing the thresholds through."""
    os.makedirs(json_output_dir, exist_ok=True)  # create the JSON directory if needed
    os.makedirs(img_output_dir, exist_ok=True)  # create the image directory if needed
    # Supported image formats
    image_extensions = ['*.jpg', '*.jpeg', '*.png', '*.bmp', '*.tiff']
    image_paths = []
    for ext in image_extensions:
        image_paths.extend(sorted(glob.glob(os.path.join(input_dir, ext))))

    print(f"Found {len(image_paths)} pictures")
    for img_path in image_paths:
        detect_img(
            model=model,
            detect_path=img_path,
            img_output_dir=img_output_dir,
            json_output_dir=json_output_dir,
            conf_thres=conf_thres,
            iou_thres=iou_thres
        )


if __name__ == "__main__":
    # Create the argument parser
    parser = argparse.ArgumentParser(description='Batch inference script for object detection')
    parser.add_argument('--input', type=str, required=True, help='input image directory; jpg/jpeg/png/bmp/tiff files are picked up')
    parser.add_argument('--img_output', type=str, required=True, help='output directory for annotated images')
    parser.add_argument('--json_output', type=str, required=True, help='output directory for JSON results')
    parser.add_argument('--conf', type=float, default=0.4, help='confidence threshold (default: 0.4)')
    parser.add_argument('--iou', type=float, default=0.5, help='IoU threshold (default: 0.5)')

    # Parse the arguments
    args = parser.parse_args()
    # Initialize the model on device 0
    model = InferSession(0, model_path)

    # Run batch inference
    batch_detect(
        model=model,
        input_dir=args.input,
        img_output_dir=args.img_output,
        json_output_dir=args.json_output,
        conf_thres=args.conf,
        iou_thres=args.iou
    )
    print('Detection finished!\nImages saved to: {}\nJSON results saved to: {}'.format(os.path.abspath(args.img_output),os.path.abspath(args.json_output)))
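
The coordinate mapping that detect_img relies on (letterbox on the way in, scale_coords on the way out) can be checked by hand. The following pure-Python sketch reproduces the arithmetic for a 1920x1080 image and a 640x640 input, assuming the default letterbox settings (auto=False, scaleup=True); `letterbox_params` and `to_original` are hypothetical names used only for this illustration.

```python
def letterbox_params(h, w, new_h=640, new_w=640):
    """Scale ratio and per-side padding (dw, dh), as letterbox computes them with auto=False."""
    r = min(new_h / h, new_w / w)
    unpad_w, unpad_h = int(round(w * r)), int(round(h * r))
    dw = (new_w - unpad_w) / 2  # padding added on the left and on the right
    dh = (new_h - unpad_h) / 2  # padding added on the top and on the bottom
    return r, (dw, dh)


def to_original(x, y, r, pad):
    """Map a point from letterboxed coordinates back to the original image."""
    return (x - pad[0]) / r, (y - pad[1]) / r


if __name__ == "__main__":
    r, pad = letterbox_params(1080, 1920)
    print(r, pad)  # ratio of about 1/3, padding of 140 px above and below
    print(to_original(320, 320, r, pad))  # center of the input maps to about (960, 540)
```

The image is shrunk to 640x360 and padded with 140 pixels at the top and bottom, so subtracting the padding and dividing by the ratio recovers the original pixel position.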

Example command:

python detect.py \
  --input ./try \
  --img_output ./res_images \
  --json_output ./res_json
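
Once the JSON files are written, a quick tally per class gives a first impression of the results. This is a minimal sketch that assumes the JSON layout produced by detect.py above (a detections list whose entries carry a disease_class field); `count_detections` is a hypothetical helper.

```python
import json
from collections import Counter
from pathlib import Path


def count_detections(json_dir):
    """Count detections per class across all JSON files in json_dir."""
    counts = Counter()
    for path in sorted(Path(json_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        counts.update(d["disease_class"] for d in data["detections"])
    return counts


if __name__ == "__main__":
    for name, n in count_detections("./res_json").most_common():
        print(name, n)
```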
