This post is by CSDN user 机器鱼 and may not be reposted without permission.
For more content like this, see the rest of this column.
0 Introduction
The project uses a six-axis robot arm with a vacuum suction cup on the end effector to pick target parcels out of an unordered pile. The AI algorithm must supply the 3D coordinates and 3D orientation of a grasp point for each parcel. Since a parcel has several faces, most of which sit at an angle, the algorithm needs the following steps:
① detect and segment each individual parcel in the color image of the pile, producing a 2D mask Mask1;
② extract each parcel's corresponding depth mask Mask2 from the aligned depth image using Mask1;
③ compute the 3D point cloud DisMask for Mask2 from the camera intrinsics;
④ remove outlier points from DisMask, then run plane segmentation and keep only the plane containing the most points;
⑤ take the centroid of that plane as the grasp point, and the angles between the plane's normal vector and the x, y, z axes as the end-effector grasp orientation.
In the end, six values are returned to the lower-level controller to drive the robot arm.
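For reference, steps ③ and ⑤ reduce to two standard formulas, both of which appear verbatim in the code in Section 1.2. Back-projecting a pixel (u, v) with depth Z through the pinhole model, where fx, fy, cx, cy are the camera intrinsics:
X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
and the angle between the plane normal A and an axis vector V:
angle = acos(A·V / (|A| * |V|))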
Development is done on Windows. The software and SDKs I used are cmake-3.28.1, VS2019, opencv-4.6.1, open3d-0.18.0, OrbbecSDK 1.10.18, cuda-11.6, cudnn-8.5.0, and onnxruntime-gpu-1.14.0. (I would normally stick with TensorRT since it is faster, but I have already published a TensorRT inference post for yolov8-seg, and it works for yolov11-seg as-is, so writing it up again would add nothing. I switched to onnxruntime-gpu instead; in my tests it is not much slower than TensorRT, and it saves the ONNX-to-TensorRT conversion step, which is convenient. On my PC with an RTX 3060, yolo11s-seg inference takes 11 ms with TensorRT and around 24 ms with onnxruntime.)
The model is yolo11s-seg trained with the latest ultralytics release. The exported ONNX model and the training data can be downloaded 【here】. Training itself is documented all over the web and is not the focus of this post.
1 Testing the Orbbec camera
1.1 First experience with the camera SDK
For projects like this one, where the accuracy requirements are not that demanding, I used to use Intel's RealSense series, but those keep getting more expensive, and Orbbec hypes itself endlessly online, so I bought a used Gemini Pro camera second-hand; it was co-developed by Orbbec and 轮趣科技 (WHEELTEC).
First set up the Orbbec camera driver and SDK; the camera will not work until the driver is installed. The downloads are at https://github.com/orbbec/OrbbecSDK/releases; since I am on Windows 11, I grabbed the win_x64 package.
I unpacked it to D:/Project/OrbbecSDK/ (any other path works too). Inside there is a driver folder; just run the exe in it and click through the installer.
Once installed, open CMake GUI (I used CMake 3.28.1) and VS2019. First open the SDK's CMakeLists.txt and change OpenCV_DIR to your actual OpenCV path; building and installing OpenCV is covered all over the web, so I will not repeat it here.
Then configure and generate the VS project files in cmake-gui following the steps below; once generated, open the solution and build ALL_BUILD.
Right-clicking ALL_BUILD and building produces a set of exe files under the bin directory. Plug in the camera and run color_viewer: if you get a live image, the SDK is working. Next, we rewrite the examples into the program we need.
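If you prefer to sanity-check the SDK from your own code rather than with the prebuilt color_viewer, a minimal sketch looks like this; it only reuses SDK calls that also appear in Section 1.2, so treat it as an outline rather than a finished program:
#include "libobsensor/ObSensor.hpp"
#include <iostream>
int main() try {
    ob::Pipeline pipe;
    std::shared_ptr<ob::Config> config = std::make_shared<ob::Config>();
    // enable the color stream with its default profile
    auto colorProfiles = pipe.getStreamProfileList(OB_SENSOR_COLOR);
    config->enableStream(colorProfiles->getProfile(OB_PROFILE_DEFAULT));
    pipe.start(config);
    auto frameset = pipe.waitForFrames(1000); // wait up to 1 s for a frameset
    if (frameset != nullptr && frameset->colorFrame() != nullptr) {
        std::cout << "got a color frame: " << frameset->colorFrame()->width()
                  << "x" << frameset->colorFrame()->height() << std::endl;
    }
    pipe.stop();
    return 0;
}
catch (ob::Error& e) {
    std::cerr << e.getMessage() << std::endl;
    return 1;
}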
1.2 Adapting the SDK examples into the program we need
For my use case, the SDK wrapper needs the following capabilities:
① set parameters and initialize the camera; ② grab the color and depth images; ③ align the depth image to the color image; ④ take a detected box and mask and compute the 3D point cloud; ⑤ post-process the point cloud into a 3D grasp point and a 3D orientation.
First create a camera.h that declares the functions we need:
#ifndef CAMERAUTIL_H
#define CAMERAUTIL_H
#include "libobsensor/ObSensor.hpp"
#include "libobsensor/hpp/Pipeline.hpp"
#include <Eigen/Dense>
#include <open3d/Open3D.h>
#include "opencv2/opencv.hpp"
#include <fstream>
#include <iostream>
#include <cmath>
#include <vector>
class Orbbec {
public:
bool Init(void);// initialize the camera
bool GetRgbAndDepth(cv::Mat& src, cv::Mat& dep);// grab the current frames and align the depth image to the color image
void GetDistance(int x, int y);// print the x, y, z coordinates of a single pixel
void GetDistance(int box[4], cv::Mat& mask, cv::Mat& dis_mask);// back-project every masked pixel inside a box to a 3D point, collecting them into dis_mask
void ProcessPointcloud(cv::Mat& dis_mask, cv::Mat imgcrop, std::vector<float>& result, float *ylength, float *xlength);// post-process the point cloud into the grasp point and grasp orientation
void Postprocess(int box[4], cv::Mat& mask, cv::Mat srcimg, float result[10]);// full post-processing of an extracted dis_mask; GetDistance and ProcessPointcloud
// are the helper functions it calls
void close(void);
private:
ob::Pipeline pipe;
OBCameraParam cameraParam;
cv::Mat depth;
float scale;
ob::Align* align=nullptr;
};
#endif
Then create a camera.cpp that implements the Orbbec member functions:
#include "camera.h"
bool Orbbec::Init(void) {
// Query the list of connected devices
// Create a Context.
ob::Context ctx;
auto devList = ctx.queryDeviceList();
// Get the number of connected devices
if (devList->deviceCount() == 0) {
std::cerr << "Device not found!" << std::endl;
return false;
}
// Configure which streams to enable or disable for the Pipeline by creating a Config
std::shared_ptr<ob::Config> config = std::make_shared<ob::Config>();
// Get all stream profiles of the color camera, including stream resolution, frame rate, and frame format
auto colorProfiles = pipe.getStreamProfileList(OB_SENSOR_COLOR);
// Get the default color profile
//auto colorProfile = colorProfiles->getProfile(OB_PROFILE_DEFAULT);
std::shared_ptr<ob::VideoStreamProfile> colorProfile = nullptr;
colorProfile = colorProfiles->getVideoStreamProfile(640, 480, OB_FORMAT_RGB, 30);
config->enableStream(colorProfile);
// Get all stream profiles of the depth camera, including stream resolution, frame rate, and frame format
auto depthProfiles = pipe.getStreamProfileList(OB_SENSOR_DEPTH);
// Get the default depth profile
//auto depthProfile = depthProfiles->getProfile(OB_PROFILE_DEFAULT);
std::shared_ptr<ob::VideoStreamProfile> depthProfile = nullptr;
depthProfile = depthProfiles->getVideoStreamProfile(640, OB_HEIGHT_ANY, OB_FORMAT_Y11, 30);
config->enableStream(depthProfile);
// Configure the alignment mode as hardware D2C alignment
config->setAlignMode(ALIGN_D2C_HW_MODE);
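// D2C (depth-to-color) alignment remaps depth pixels into the color camera's frame, so a single (u, v) indexes both images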
align = new ob::Align(OB_STREAM_COLOR);
try {
pipe.start(config);
cameraParam = pipe.getCameraParam();
}
catch (ob::Error& e) {
std::cerr << "function:" << e.getName() << "\nargs:" << e.getArgs() << "\nmessage:" << e.getMessage() << "\ntype:" << e.getExceptionType() << std::endl;
return false;
}
return true;
};
bool Orbbec::GetRgbAndDepth(cv::Mat& src, cv::Mat& dep) {
auto frameset = pipe.waitForFrames(100);
if (frameset == nullptr || frameset->depthFrame() == nullptr || frameset->colorFrame() == nullptr) {
return false;
}
auto newFrame = align->process(frameset);
auto newFrameSet = newFrame->as<ob::FrameSet>();
auto colorFrame = newFrameSet->colorFrame();
auto depthFrame = newFrameSet->depthFrame();
uint32_t width = colorFrame->width();
uint32_t height = colorFrame->height();
cv::Mat rawMat(height, width, CV_8UC3, (uint8_t*)colorFrame->data(), cv::Mat::AUTO_STEP);
cv::cvtColor(rawMat, src, cv::COLOR_RGB2BGR);
width = depthFrame->width();
height = depthFrame->height();
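// getValueScale() returns millimeters per raw depth unit, i.e. raw_value * scale = distance in mm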
scale = depthFrame->getValueScale();
uint16_t* data = (uint16_t*)depthFrame->data();
cv::Mat depimg(height, width, CV_16U, data, cv::Mat::AUTO_STEP);
depimg.copyTo(depth);
// convert the depth image to 8-bit for visualization
cv::Mat depth_image_8bit;
double minVal, maxVal;
cv::Point minLoc, maxLoc;
cv::minMaxLoc(depimg, &minVal, &maxVal, &minLoc, &maxLoc);
cv::convertScaleAbs(depimg, depth_image_8bit, 255.0 / maxVal);
cv::applyColorMap(depth_image_8bit, dep, cv::COLORMAP_JET);
//float centerDistance = data[width * height / 2 + width / 2] * scale;
return true;
};
void Orbbec::GetDistance(int x,int y) {
float Z = (float)depth.at<uint16_t>(y,x) * scale;
float X= (x - cameraParam.depthIntrinsic.cx) * Z / cameraParam.depthIntrinsic.fx;
float Y= (y - cameraParam.depthIntrinsic.cy) * Z / cameraParam.depthIntrinsic.fy;
printf("X,Y,Z:%f,%f.%f\n", X, Y, Z);
}
void Orbbec::GetDistance(int box[4], cv::Mat& mask, cv::Mat& dis_mask) {
for (int i = box[0]; i < box[2]; i++) {
for (int j = box[1]; j < box[3]; j++) {
if (mask.at<uchar>(j - box[1], i - box[0]) != 0){
float upoint[3];
upoint[2] = (float)depth.at<uint16_t>(j, i)* scale/1000.f;// convert to meters
upoint[0] = (i - cameraParam.depthIntrinsic.cx) * upoint[2] / cameraParam.depthIntrinsic.fx;
upoint[1] = (j - cameraParam.depthIntrinsic.cy) * upoint[2] / cameraParam.depthIntrinsic.fy;
dis_mask.at<cv::Vec3f>(j - box[1], i - box[0]) = cv::Vec3f(upoint[0], upoint[1], upoint[2]);
if (upoint[2] == 0.0f)
{
mask.at<uchar>(j - box[1], i - box[0]) = 0; // where the depth value is 0, zero out the mask as well
}
}
}
}
}
void Orbbec::Postprocess(int box[4], cv::Mat& mask, cv::Mat srcimg,float result[10]) {
cv::Rect r = cv::Rect(box[0], box[1], box[2] - box[0], box[3] - box[1]);
cv::Mat imgcrop = srcimg(r).clone();
mask = mask(r).clone();
cv::Mat dis_mask = cv::Mat::zeros(r.height,r.width, CV_32FC3);
GetDistance(box, mask, dis_mask);
//cv::Mat dis_show;
//cv::convertScaleAbs(dis_mask, dis_show, 255.0 / 1000);
//cv::applyColorMap(dis_show, dis_show, cv::COLORMAP_JET);
//cv::imshow("dismask", dis_show);
// post-process the point cloud with open3d:
// segment planes first, since a parcel's segmentation result often includes side faces; keep the plane with the most points,
// then take that plane's centroid as the grasp point and its normal vector as the grasp orientation
float ylength = 0, xlength = 0;
std::vector<float> Normals_and_Axis;
ProcessPointcloud(dis_mask, imgcrop, Normals_and_Axis, &ylength, &xlength);
// back-project the grasp point onto the color image so we can visually check that the right spot was picked
float CX = Normals_and_Axis[3] * cameraParam.depthIntrinsic.fx / Normals_and_Axis[5] + cameraParam.depthIntrinsic.cx;
float CY = Normals_and_Axis[4] * cameraParam.depthIntrinsic.fy / Normals_and_Axis[5] + cameraParam.depthIntrinsic.cy;
// use the parcel dimensions measured from the point cloud to back-calculate a minimal 2D bounding box on the color image, to check that the plane was captured correctly
float Width = 0, Height = 0;
if (Normals_and_Axis[5] > 0) {// only for points with a valid depth value
//from xlength and ylength, back-calculate the box width/height; together with CX, CY this forms the minimal 2D bounding box of the grasp plane
Width = xlength * cameraParam.depthIntrinsic.fx / Normals_and_Axis[5];
Height = ylength * cameraParam.depthIntrinsic.fy / Normals_and_Axis[5];
}
for (int i = 0; i < 6; i++) {
result[i] = Normals_and_Axis[i];
}
result[6] = CX;
result[7] = CY;
result[8] = Width;
result[9] = Height;
}
void Orbbec::ProcessPointcloud(cv::Mat& dis_mask, cv::Mat image, std::vector<float>& result, float* ylength, float* xlength)
{
// dis_mask to point cloud
auto outputPC = std::make_shared<open3d::geometry::PointCloud>();
int num_points = dis_mask.cols * dis_mask.rows;
for (int i = 0; i < dis_mask.rows; i++)
{
for (int j = 0; j < dis_mask.cols; j++)
{
Eigen::Vector3d pc = { dis_mask.at<cv::Vec3f>(i, j)[0], dis_mask.at<cv::Vec3f>(i, j)[1], dis_mask.at<cv::Vec3f>(i, j)[2] };
Eigen::Vector3d clr = { (float)image.at<cv::Vec3b>(i,j)[2] / 255.0,
(float)image.at<cv::Vec3b>(i,j)[1] / 255.0,
(float)image.at<cv::Vec3b>(i,j)[0] / 255.0 };
outputPC->points_.push_back(pc);
outputPC->colors_.push_back(clr);
}
}
//open3d::visualization::DrawGeometries({ outputPC}, "outputPC", 800, 450);
// outlier removal: drop invalid points whose z distance is 0
std::vector<size_t> indices;
for (size_t i = 0; i < outputPC->points_.size(); i++) {
if (outputPC->points_[i][2] == 0.0) {
indices.push_back(i);
}
}
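// SelectByIndex with invert=true keeps every point EXCEPT the listed indices, i.e. drops the zero-depth points collected above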
outputPC = outputPC->SelectByIndex(indices, true);
if (outputPC->points_.size() <= 3) {
result = { 0,0,0,0,0,0,0 };
printf("points size too small\n");
return;
}
// plane segmentation: keep the largest plane
auto start = std::chrono::system_clock::now();
double tDis = 0.01; // RANSAC distance threshold (meters)
int minNum = 3; // minimum number of points sampled to estimate the plane
int numIter = 200; // number of RANSAC iterations
std::tuple<Eigen::Vector4d, std::vector<size_t>> vRes = outputPC->SegmentPlane(tDis, minNum, numIter);
// plane parameters
// AX+BY+CZ+D=0 ==> Z=-(AX+BY+D)/C
Eigen::Vector4d para = std::get<0>(vRes);
//printf("Segment plane:%.3fX+%.3fY+%.3fZ+%.3f=0\n",para[0],para[1],para[2],para[3]);
// inlier indices
std::vector<size_t> selectedIndex = std::get<1>(vRes);
std::shared_ptr<open3d::geometry::PointCloud> inPC = outputPC->SelectByIndex(selectedIndex, false);
//open3d::visualization::DrawGeometries({ inPC}, "inPC", 800, 450);
// use the centroid of the inlier cloud as the grasp coordinates
Eigen::Vector3d Axis;
Axis = inPC->GetCenter();
// compute the angles between the plane normal and the axes
Eigen::Vector3d A(para[0], para[1], para[2]), X(-1, 0, 0), Y(0, -1, 0), Z(0, 0, 1);
double angleX0, angleX, angleY, angleZ;
double PI = 3.1415926;
angleZ = std::acos(A.dot(Z) / (A.norm() * Z.norm())) * 180 / PI;
angleX0 = std::acos(A.dot(X) / (A.norm() * X.norm())) * 180 / PI;
angleY = std::acos(A.dot(Y) / (A.norm() * Y.norm())) * 180 / PI;
angleX = std::atan2(A[1], A[0]) * 180 / PI;
if (angleZ > 90) {
angleZ = 180 - angleZ;
X = { 1, 0, 0 };
Y = { 0, 1, 0 };
angleX0 = std::acos(A.dot(X) / (A.norm() * X.norm())) * 180 / PI;
angleY = std::acos(A.dot(Y) / (A.norm() * Y.norm())) * 180 / PI;
angleX = std::atan2(-A[1], A[0]) * 180 / PI;
}
//printf("angle:%f,%f,%f,xplane:%f\n", angleX0, angleY, angleZ, angleX);
//angleX的角度是与法向量投影到xoy平面的并于x负方向的夹角,顺时针是0-180,逆时针是0-负180
//
result.push_back(angleX0);
result.push_back(angleY);
result.push_back(angleZ);
result.push_back(Axis[0]);
result.push_back(Axis[1]);
result.push_back(Axis[2]);
auto end = std::chrono::system_clock::now();
//std::cout << "open3d process time: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
// draw the segmented plane model
Eigen::Vector3d min_pt, max_pt;
min_pt = inPC->GetMinBound();
max_pt = inPC->GetMaxBound();
double Xmin = min_pt[0], Ymin = min_pt[1], Zmin = min_pt[2];
double Xmax = max_pt[0], Ymax = max_pt[1], Zmax = max_pt[2];
*ylength = (float)Ymax - (float)Ymin;
*xlength = (float)Xmax - (float)Xmin;
// visualize the results (change if (0) below to if (1) to enable)
if (0) {
// outliers, shown in green
std::shared_ptr<open3d::geometry::PointCloud> outPC = outputPC->SelectByIndex(selectedIndex, true);
const Eigen::Vector3d colorOut = { 0,1,0 };
outPC->PaintUniformColor(colorOut);
// open3d::visualization::DrawGeometries({ outPC}, "outPC", 800, 450);
auto Plane = std::make_shared<open3d::geometry::PointCloud>();
int width = 20, height = 20;
for (int i = 0; i < width; i++) {
double x = Xmin + (Xmax - Xmin) / width * i;
for (int j = 0; j < height; j++) {
double y = Ymin + (Ymax - Ymin) / height * j;
double z = -(para[0] * x + para[1] * y + para[3]) / para[2];
Eigen::Vector3d pc = { x, y, z };
Plane->points_.push_back(pc);
}
}
const Eigen::Vector3d color = { 1,0,1 };
Plane->PaintUniformColor(color);
// gripper approach direction
auto lineCloud = std::make_shared<open3d::geometry::PointCloud>();
lineCloud->points_.resize(20);
for (size_t i = 0; i < lineCloud->points_.size(); ++i)
{
lineCloud->points_[i] = Axis - i * 0.01 * A;
}
lineCloud->PaintUniformColor(color);
// grasp point
auto mesh = open3d::geometry::TriangleMesh::CreateSphere(0.01);
mesh->ComputeVertexNormals();
mesh->PaintUniformColor(Eigen::Vector3d(0.1, 0.1, 0.1));
mesh->Translate(Axis);
// reference coordinate frame
auto mesh1 = open3d::geometry::TriangleMesh::CreateCoordinateFrame(0.1);
mesh1->ComputeVertexNormals();
// rotate everything to the camera's viewing direction for easier inspection
Eigen::Matrix4d transform;
transform << 1.0, 0.0, 0.0, 0.0,
0.0, -1.0, 0.0, 0.0,
0.0, 0.0, -1.0, 0.0,
0.0, 0.0, 0.0, 1.0;
inPC->Transform(transform);
outPC->Transform(transform);
lineCloud->Transform(transform);
Plane->Transform(transform);
mesh->Transform(transform);
mesh1->Transform(transform);
// open3d::io::WritePointCloudToPCD("inPC.pcd", *inPC,false);
// open3d::io::WritePointCloudToPCD("outPC.pcd", *outPC,false);
// open3d::io::WritePointCloudToPCD("lineCloud.pcd", *lineCloud,false);
// open3d::io::WritePointCloudToPCD("Plane.pcd", *Plane,false);
open3d::visualization::DrawGeometries({ inPC,outPC,lineCloud,Plane,mesh,mesh1 }, "Result", 800, 450);
}
}
void Orbbec::close(void) {
pipe.stop();
};
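A quick note on the math in Postprocess: the grasp point is mapped back onto the color image with the forward pinhole projection, which is the inverse of the back-projection used in GetDistance:
u = X * fx / Z + cx, v = Y * fy / Z + cy
Likewise, a metric length L lying on a plane at depth Z projects to roughly L * fx / Z pixels, which is how the Width and Height of the 2D bounding box are recovered from xlength and ylength.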
Next, create a main program that exercises the functions above, named main1_test_camera.cpp. The main body looks like this:
#include "camera.h"
// #include "opencv2/opencv.hpp"
#include <fstream>
#include <iostream>
#include <cmath>
#include <conio.h>
using namespace std;
int main(int argc, char** argv) try {
Orbbec camera;
camera.Init();
while (true) {
cv::Mat src, dep;
if (!camera.GetRgbAndDepth(src, dep)) {
continue;
}
cv::Mat show=src.clone();
//cv::addWeighted(src,0.5,dep,0.8,0.5,show);
int box[4] = { 280,200,360,280};// xmin, ymin, xmax, ymax of the detection box
cv::Mat mask = cv::Mat::ones(cv::Size(src.cols, src.rows), CV_8UC1);
float result[10] = {0};
camera.Postprocess(box, mask, src, result);
printf("ax,ay,az,x,y,z,:%f,%f,%f,%f,%f,%f\n", result[0], result[1], result[2], result[3], result[4], result[5]);
cv::Point p1((int)(0.5 * box[0] + 0.5 * box[2]), (int)(0.5 * box[1] + 0.5 * box[3]));
cv::circle(show, p1, 3, cv::Scalar(0, 255, 255), 3);//center point
cv::Point p2((int)result[6], (int)result[7]);
cv::circle(show, p2, 3, cv::Scalar(0, 0, 255), 3);//pick point
cv::Rect rdraw0 = cv::Rect((int)box[0], (int)box[1], (int)(box[2]-box[0]), (int)(box[3] - box[1]));
cv::rectangle(show, rdraw0, cv::Scalar(0, 255, 255), 2);//detect box
cv::Rect rdraw1 = cv::Rect((int)(result[6] - 0.5 * result[8]),(int)(result[7] - 0.5 * result[9]),
(int)result[8], (int)result[9]);
cv::rectangle(show, rdraw1, cv::Scalar(0, 0, 255), 2);//postprocess box
printf("-------------------------------------------------\n");
cv::imshow("show", show);
char c = cv::waitKey(1);
if (c == 27) {
cv::destroyAllWindows();
break;
}
}
camera.close();
return 0;
}
catch (ob::Error& e) {
std::cerr << "function:" << e.getName() << "\nargs:" << e.getArgs() << "\nmessage:" << e.getMessage() << "\ntype:" << e.getExceptionType() << std::endl;
exit(EXIT_FAILURE);
}
Finally, create a CMakeLists.txt to configure the build, as shown below. Adjust OpenCV_DIR, OrbbecSDK_ROOT_DIR, and Open3D_DIR to your actual paths:
cmake_minimum_required(VERSION 3.10)
project(orbbec)
add_definitions(-std=c++11)
add_definitions(-DAPI_EXPORTS)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Release)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/bin)
include_directories(include/)
link_directories(lib/) # add extra library search paths, the equivalent of g++'s -L flag
include_directories(${PROJECT_SOURCE_DIR}/src/)
file(GLOB_RECURSE SRCS ${PROJECT_SOURCE_DIR}/src/*.cpp)
#opencv
Set(OpenCV_DIR D:/Project/opencv/build/)
find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})
include_directories(${OpenCV_DIR}/include/)
link_directories(${OpenCV_DIR}/x64/vc15/lib/)
#orbbec
set(OrbbecSDK_ROOT_DIR D:/Project/OrbbecSDK/SDK)
set(OrbbecSDK_LIBRARY_DIRS ${OrbbecSDK_ROOT_DIR}/lib)
set(OrbbecSDK_INCLUDE_DIR ${OrbbecSDK_ROOT_DIR}/include)
include_directories(${OrbbecSDK_INCLUDE_DIR})
link_directories(${OrbbecSDK_LIBRARY_DIRS})
#open3d
Set(Open3D_DIR D:/Project/open3d-0.18.0/CMake/)
find_package( Open3D REQUIRED)
include_directories(${Open3D_INCLUDE_DIRS})
link_directories(${Open3D_LIBRARY_DIRS})
add_executable(test main1_test_camera.cpp ${SRCS})
target_link_libraries(test OrbbecSDK ${OpenCV_LIBS})
target_link_libraries(test ${Open3D_LIBRARIES})
target_include_directories(test PUBLIC ${Open3D_INCLUDE_DIRS})
Finally, generate the project with cmake-gui and build the executable in VS2019. When running the program, place the three required DLLs in the same directory as test.exe.
2 YOLO11s-seg inference with onnxruntime
Straight to the code. First create Segment.h, which declares the functions used for inference:
#ifndef SEGMENT_H
#define SEGMENT_H
#include <fstream>
#include <sstream>
#include <iostream>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
//#include <cuda_provider_factory.h>
#include <onnxruntime_cxx_api.h>
#include <chrono>
using namespace std;
using namespace Ort;
struct BoxInfo
{
float x;
float y;
float w;
float h;
float score;
int label;
float mask[32];
};
struct Net_config
{
float confThreshold; // Confidence threshold
float nmsThreshold; // Non-maximum suppression threshold
std::string modelpath;
std::string labelpath;
};
constexpr static int num_class = 1;// we only have one class; change this for multi-class models
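// per-proposal output length: 4 box values (cx, cy, w, h) + num_class scores + 32 mask coefficients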
const static int kOutputSize = num_class + 4 + 32;
class YOLOV8
{
public:
YOLOV8(Net_config config);// load the model
void preprocess_img(cv::Mat& img, cv::Mat& dis, int input_w, int input_h);// letterbox-resize the image to 640x640
void normalize_(cv::Mat img);// divide by 255 to normalize, and convert interleaved BGR to planar RGB (rrrgggbbb)
void detect(cv::Mat& frame, std::vector<BoxInfo>& generate_boxes, std::vector<cv::Mat>& masks);// run detection plus post-processing on an image,
// producing each target's detection box and mask (Mask1)
float iou(float lbox[4], float rbox[4]);// IoU computation
void nms(vector<BoxInfo>& input_boxes);// NMS
std::vector<cv::Mat> process_mask(const float* proto, int proto_size, std::vector<BoxInfo>& dets, cv::Mat img);// post-process to get the masks
cv::Rect get_downscale_rect(float bbox[4], float scale);// downscale a box from the 640x640 input to the 160x160 feature map
cv::Mat scale_mask(cv::Mat mask, cv::Mat img); // map a mask from the 640x640 input back onto the original image; simply preprocess_img in reverse
cv::Rect get_rect(cv::Mat& img, float bbox[4]);// map a box from the 640x640 input back onto the original image; simply preprocess_img in reverse
private:
int inpWidth;
int inpHeight;
int nout;
int num_proposal;
std::vector<string> class_names;
float confThreshold;
float nmsThreshold;
std::vector<float> input_image_;// ONNXRUNTIME takes the input as a flat float vector
Env env = Env(ORT_LOGGING_LEVEL_ERROR, "YOLOV8");
Ort::Session* ort_session = nullptr;
SessionOptions sessionOptions = SessionOptions();
std::vector<std::string> input_names;// copies of the names: the raw pointers from GetInputNameAllocated would dangle
std::vector<std::string> output_names;
std::vector<vector<int64_t>> input_node_dims; // >=1 outputs
std::vector<vector<int64_t>> output_node_dims; // >=1 outputs
const std::vector<uint32_t> colors = { 0xFF3838, 0xFF9D97, 0xFF701F, 0xFFB21D, 0xCFD231, 0x48F90A,
0x92CC17, 0x3DDB86, 0x1A9334, 0x00D4BB, 0x2C99A8, 0x00C2FF,
0x344593, 0x6473FF, 0x0018EC, 0x8438FF, 0x520085, 0xCB38FF,
0xFF95C8, 0xFF37C7 };
};
#endif
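Before the implementation, a word on how the masks are decoded: each detection carries 32 mask coefficients c[j], and the network's second output is a 32 x (H/4) x (W/4) prototype tensor P. The mask value at feature-map pixel (x, y) is
m(x, y) = sigmoid( sum over j of c[j] * P[j][y][x] )
thresholded at 0.5; process_mask below is a direct loop implementation of this formula.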
Then create Segment.cpp and implement these functions:
#include "Segment.h"
YOLOV8::YOLOV8(Net_config config)
{
confThreshold = config.confThreshold;
nmsThreshold = config.nmsThreshold;
string classesFile = config.labelpath;
string model_path = config.modelpath;
std::wstring widestr = std::wstring(model_path.begin(), model_path.end());
OrtStatus* status = OrtSessionOptionsAppendExecutionProvider_CUDA(sessionOptions, 0);
sessionOptions.SetGraphOptimizationLevel(ORT_ENABLE_BASIC);
ort_session = new Session(env, widestr.c_str(), sessionOptions);
size_t numInputNodes = ort_session->GetInputCount();
size_t numOutputNodes = ort_session->GetOutputCount();
AllocatorWithDefaultOptions allocator;
for (int i = 0; i < numInputNodes; i++)
{
AllocatedStringPtr input_name_Ptr = ort_session->GetInputNameAllocated(i, allocator);
input_names.push_back(input_name_Ptr.get()); // copied into a std::string: the AllocatedStringPtr frees its buffer at end of scope
Ort::TypeInfo input_type_info = ort_session->GetInputTypeInfo(i);
auto input_tensor_info = input_type_info.GetTensorTypeAndShapeInfo();
auto input_dims = input_tensor_info.GetShape();
input_node_dims.push_back(input_dims);
}
for (int i = 0; i < numOutputNodes; i++)
{
AllocatedStringPtr output_name_Ptr = ort_session->GetOutputNameAllocated(i, allocator);
output_names.push_back(output_name_Ptr.get()); // copied into a std::string: the AllocatedStringPtr frees its buffer at end of scope
Ort::TypeInfo output_type_info = ort_session->GetOutputTypeInfo(i);
auto output_tensor_info = output_type_info.GetTensorTypeAndShapeInfo();
auto output_dims = output_tensor_info.GetShape();
output_node_dims.push_back(output_dims);
}
inpHeight = input_node_dims[0][2];
inpWidth = input_node_dims[0][3];
nout = output_node_dims[0][1];
num_proposal = output_node_dims[0][2];
ifstream ifs(classesFile.c_str());
string line;
while (getline(ifs, line)) class_names.push_back(line);
printf("classsize is:%d\n", num_class);
assert(num_class = class_names.size());
}
cv::Rect YOLOV8::get_rect(cv::Mat& img, float bbox[4]) {
float l, r, t, b;
float r_w = inpWidth / (img.cols * 1.0);
float r_h = inpHeight / (img.rows * 1.0);
if (r_h > r_w) {
l = bbox[0] - bbox[2] / 2.f;
r = bbox[0] + bbox[2] / 2.f;
t = bbox[1] - bbox[3] / 2.f - (inpHeight - r_w * img.rows) / 2;
b = bbox[1] + bbox[3] / 2.f - (inpHeight - r_w * img.rows) / 2;
l = l / r_w;
r = r / r_w;
t = t / r_w;
b = b / r_w;
}
else {
l = bbox[0] - bbox[2] / 2.f - (inpWidth - r_h * img.cols) / 2;
r = bbox[0] + bbox[2] / 2.f - (inpWidth - r_h * img.cols) / 2;
t = bbox[1] - bbox[3] / 2.f;
b = bbox[1] + bbox[3] / 2.f;
l = l / r_h;
r = r / r_h;
t = t / r_h;
b = b / r_h;
}
return cv::Rect(round(l), round(t), round(r - l), round(b - t));
}
cv::Rect YOLOV8::get_downscale_rect(float bbox[4], float scale) {
float left = bbox[0] - bbox[2] / 2;
float top = bbox[1] - bbox[3] / 2;
float right = bbox[0] + bbox[2] / 2;
float bottom = bbox[1] + bbox[3] / 2;
left /= scale;
top /= scale;
right /= scale;
bottom /= scale;
return cv::Rect(round(left), round(top), round(right - left), round(bottom - top));
}
cv::Mat YOLOV8::scale_mask(cv::Mat mask, cv::Mat img) {
int x, y, w, h;
float r_w = inpWidth / (img.cols * 1.0);
float r_h = inpHeight / (img.rows * 1.0);
if (r_h > r_w) {
w = inpWidth;
h = r_w * img.rows;
x = 0;
y = (inpHeight - h) / 2;
}
else {
w = r_h * img.cols;
h = inpHeight;
x = (inpWidth - w) / 2;
y = 0;
}
cv::Rect r(x, y, w, h);
cv::Mat res;
cv::resize(mask(r), res, img.size());
return res;
}
std::vector<cv::Mat> YOLOV8::process_mask(const float* proto, int proto_size, std::vector<BoxInfo>& dets, cv::Mat img) {
std::vector<cv::Mat> masks;
//proto 32x(H/4)x(W/4)
//auto start = std::chrono::system_clock::now();
for (size_t i = 0; i < dets.size(); i++) {
cv::Mat mask_mat = cv::Mat::zeros(inpHeight / 4, inpWidth / 4, CV_8UC1);//160*160
float bbox[4] = { dets[i].x, dets[i].y, dets[i].w, dets[i].h };
auto r = get_downscale_rect(bbox, 4);// downscale the box from the 640x640 input to the 160x160 feature map
for (int x = r.x; x < r.x + r.width; x++) {
for (int y = r.y; y < r.y + r.height; y++) {
float e = 0.0f;
for (int j = 0; j < 32; j++) {
//each pixel of the 160x160 prototype map has 32 values; multiply them with the detection's 32 mask coefficients and sum
//the 1x32x160x160 tensor is stored as 32 consecutive 160x160 planes, so coefficient j indexes proto with a stride of 160*160
e += dets[i].mask[j] * proto[j * proto_size / 32 + y * mask_mat.cols + x];
}
e = 1.0f / (1.0f + expf(-e));
if (e > 0.5) {
mask_mat.at<uchar>(y, x) = 1;
}
}
}
//resize the 160x160 mask up to the 640x640 input size
cv::resize(mask_mat, mask_mat, cv::Size(inpWidth, inpHeight), 0, 0, cv::INTER_LINEAR);
//then map the 640x640 mask back onto the original image, i.e. preprocess_img in reverse
cv::Mat img_mask = scale_mask(mask_mat, img);
//map the 640x640 box back onto the original image the same way
r = get_rect(img, bbox);
cv::Rect holeImgRect(0, 0, img.cols, img.rows);
r = r & holeImgRect;
dets[i].x = r.x;
dets[i].y = r.y;
dets[i].w = r.width;
dets[i].h = r.height;
masks.push_back(img_mask);
}
return masks;
}
void YOLOV8::normalize_(cv::Mat img)
{
int row = img.rows;
int col = img.cols;
input_image_.resize(row * col * img.channels());
// bgrbgrbgr to rrrgggbbb
for (int c = 0; c < 3; c++)
{
for (int i = 0; i < row; i++)
{
for (int j = 0; j < col; j++)
{
float pix = img.ptr<uchar>(i)[j * 3 + 2 - c];
input_image_[c * row * col + i * col + j] = pix / 255.0;
}
}
}
}
float YOLOV8::iou(float lbox[4], float rbox[4]) {
float interBox[] = {
(std::max)(lbox[0] - lbox[2] / 2.f , rbox[0] - rbox[2] / 2.f), //left
(std::min)(lbox[0] + lbox[2] / 2.f , rbox[0] + rbox[2] / 2.f), //right
(std::max)(lbox[1] - lbox[3] / 2.f , rbox[1] - rbox[3] / 2.f), //top
(std::min)(lbox[1] + lbox[3] / 2.f , rbox[1] + rbox[3] / 2.f), //bottom
};
if (interBox[2] > interBox[3] || interBox[0] > interBox[1])
return 0.0f;
float interBoxS = (interBox[1] - interBox[0]) * (interBox[3] - interBox[2]);
return interBoxS / (lbox[2] * lbox[3] + rbox[2] * rbox[3] - interBoxS);
}
void YOLOV8::nms(vector<BoxInfo>& input_boxes)
{
sort(input_boxes.begin(), input_boxes.end(), [](BoxInfo a, BoxInfo b) { return a.score > b.score; });
vector<bool> isSuppressed(input_boxes.size(), false);
for (int i = 0; i < int(input_boxes.size()); ++i)
{
if (isSuppressed[i]) { continue; }
for (int j = i + 1; j < int(input_boxes.size()); ++j)
{
if (isSuppressed[j]) { continue; }
float lbox[4] = { input_boxes.at(i).x,input_boxes.at(i).y,input_boxes.at(i).w,input_boxes.at(i).h };
float rbox[4] = { input_boxes.at(j).x,input_boxes.at(j).y,input_boxes.at(j).w,input_boxes.at(j).h };
float ovr = iou(lbox, rbox);
if (ovr >= nmsThreshold)
{
isSuppressed[j] = true;
}
}
}
int idx_t = 0;
input_boxes.erase(remove_if(input_boxes.begin(), input_boxes.end(), [&idx_t, &isSuppressed](const BoxInfo& f) { return isSuppressed[idx_t++]; }), input_boxes.end());
}
void YOLOV8::preprocess_img(cv::Mat& img, cv::Mat& dis, int input_w, int input_h) {
int w, h, x, y;
float r_w = input_w / (img.cols * 1.0);
float r_h = input_h / (img.rows * 1.0);
if (r_h > r_w) {// the image is relatively wide: scale by width and pad top/bottom
w = input_w;
h = r_w * img.rows;
x = 0;
y = (input_h - h) / 2;
}
else {
w = r_h * img.cols;
h = input_h;
x = (input_w - w) / 2;
y = 0;
}
cv::Mat re(h, w, CV_8UC3);
cv::resize(img, re, re.size(), 0, 0, cv::INTER_LINEAR);
cv::Mat out(input_h, input_w, CV_8UC3, cv::Scalar(128, 128, 128));
re.copyTo(out(cv::Rect(x, y, re.cols, re.rows)));
out.copyTo(dis);
}
void YOLOV8::detect(cv::Mat& frame, vector<BoxInfo>& generate_boxes,std::vector<cv::Mat>& masks)
{
cv::Mat dstimg;
preprocess_img(frame, dstimg, inpWidth, inpHeight);
normalize_(dstimg);
array<int64_t, 4> input_shape_{ 1, 3, inpHeight, inpWidth };
auto allocator_info = MemoryInfo::CreateCpu(OrtDeviceAllocator, OrtMemTypeCPU);
Value input_tensor_ = Value::CreateTensor<float>(allocator_info, input_image_.data(), input_image_.size(), input_shape_.data(), input_shape_.size());
// run inference
const std::array<const char*, 1> inputNames = { "images" };
const std::array<const char*, 2> outNames = { "output0","output1" };
vector<Value> ort_outputs = ort_session->Run(Ort::RunOptions{ nullptr }, inputNames.data(), &input_tensor_, 1, outNames.data(), outNames.size());
// generate proposals
//vector<BoxInfo> generate_boxes;
int n = 0, k = 0;
const float* pdata = ort_outputs[0].GetTensorMutableData<float>();// 1 x (4+num_class+32) x 8400 detection output
const float* pdata1 = ort_outputs[1].GetTensorMutableData<float>();// 1 x 32 x 160 x 160 mask prototypes
for (n = 0; n < num_proposal; n++)
{
float out[kOutputSize];//4+class+32
float mask[32];
for (int j = 0; j < nout; j++) {
out[j] = pdata[n + num_proposal * j];
}
memcpy(&mask, &out[4 + num_class], 32 * sizeof(float));
int max_ind = 0;
float max_class_score = 0;
// find the class with the highest score
for (k = 0; k < num_class; k++)
{
if (out[k + 4] > max_class_score)
{
max_class_score = out[k + 4];
max_ind = k;
}
}
if (max_class_score > confThreshold)
{
float cx = out[0];
float cy = out[1];
float w = out[2];
float h = out[3];
BoxInfo det;
det.x = cx;
det.y = cy;
det.w = w;
det.h = h;
det.score = max_class_score;
det.label = max_ind;
for (int jj = 0; jj < 32; jj++) {
det.mask[jj] = mask[jj];
}
generate_boxes.push_back(det);
}
}
nms(generate_boxes);
int kOutputSize2 = 32 * (inpHeight / 4) * (inpWidth / 4);
masks = process_mask(pdata1, kOutputSize2, generate_boxes, frame);
// draw results on the original frame
for (size_t i = 0; i < generate_boxes.size(); i++) {
cv::Mat img_mask = masks[i];
auto color = colors[(int)generate_boxes[i].label % colors.size()];
auto bgr = cv::Scalar(color & 0xFF, color >> 8 & 0xFF, color >> 16 & 0xFF);
cv::Rect r = cv::Rect((int)generate_boxes[i].x, (int)generate_boxes[i].y, (int)generate_boxes[i].w, (int)generate_boxes[i].h);
cv::rectangle(frame, r, bgr, 2);
// draw the mask onto the original image
for (int x = r.x; x < r.x + r.width; x++) {
for (int y = r.y; y < r.y + r.height; y++) {
float val = img_mask.at<uchar>(y, x);
if (val == 0) continue;
frame.at<cv::Vec3b>(y, x)[0] = frame.at<cv::Vec3b>(y, x)[0] / 2 + bgr[0] / 2;
frame.at<cv::Vec3b>(y, x)[1] = frame.at<cv::Vec3b>(y, x)[1] / 2 + bgr[1] / 2;
frame.at<cv::Vec3b>(y, x)[2] = frame.at<cv::Vec3b>(y, x)[2] / 2 + bgr[2] / 2;
}
}
char label[100];
sprintf(label, "%s:%.2f", class_names[generate_boxes[i].label].c_str(), generate_boxes[i].score);
cv::putText(frame, label, cv::Point(r.x, r.y - 4), cv::FONT_HERSHEY_PLAIN, 1.2, cv::Scalar(128, 128, 255), 2);
}
}
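The letterbox bookkeeping in preprocess_img, get_rect, and scale_mask is easiest to verify with the mapping written out. With scale r = min(inpWidth / w, inpHeight / h) and padding pad_x = (inpWidth - r * w) / 2, pad_y = (inpHeight - r * h) / 2, preprocess_img maps an original pixel (u, v) to (r * u + pad_x, r * v + pad_y); get_rect and scale_mask apply the inverse mapping (u', v') -> ((u' - pad_x) / r, (v' - pad_y) / r).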
Then create one more main program, main.cpp, that ties the camera and the detector together:
#include <iostream>
#include <chrono>
#include <cmath>
#include <assert.h>
#include "Segment.h"
#include "camera.h"
#include <stdio.h>
using namespace std;
int main(int argc, char** argv) {
Net_config YOLOV8_nets = { 0.9, 0.5,
"D:/Project/inference/Orbbec_example/model/kuaidi.onnx" ,
"D:/Project/inference/Orbbec_example/model/kuaidi.names" };
YOLOV8 net(YOLOV8_nets);
Orbbec camera;
camera.Init();
while (1){
cv::Mat src, dep;
if (!camera.GetRgbAndDepth(src, dep)) {
continue;
}
cv::Mat show = src.clone();
// Run inference
auto start = std::chrono::system_clock::now();
std::vector<BoxInfo> res;
std::vector<cv::Mat> masks;
net.detect(src, res, masks);
auto end = std::chrono::system_clock::now();
std::cout << "inference time: " << std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count() << "ms" << std::endl;
if (res.size()==0){printf("no target detected\n");}
cv::imshow("reult", src);
char c = cv::waitKey(1);
for (int i = 0; i < res.size(); i++){
int box[4] = { (int)res[i].x, (int)res[i].y,(int)(res[i].x + res[i].w),(int)(res[i].y + res[i].h) };
cv::Mat mask = masks[i].clone();
float result[10] = { 0 };
camera.Postprocess(box, mask, src, result);
cv::Point p1((int)(0.5 * box[0] + 0.5 * box[2]), (int)(0.5 * box[1] + 0.5 * box[3]));
cv::circle(show, p1, 3, cv::Scalar(0, 255, 255), 3);//center point
cv::Point p2((int)result[6], (int)result[7]);
cv::circle(show, p2, 3, cv::Scalar(0, 0, 255), 3);//pick point
cv::Rect rdraw0 = cv::Rect((int)box[0], (int)box[1], (int)(box[2] - box[0]), (int)(box[3] - box[1]));
cv::rectangle(show, rdraw0, cv::Scalar(0, 255, 255), 2);//detect box
cv::Rect rdraw1 = cv::Rect((int)(result[6] - 0.5 * result[8]), (int)(result[7] - 0.5 * result[9]),
(int)result[8], (int)result[9]);
cv::rectangle(show, rdraw1, cv::Scalar(0, 0, 255), 2);//postprocess box
}
printf("-------------------------------------------------\n");
cv::imshow("show", show);
c = cv::waitKey(1);
if (c == 27) {
cv::destroyAllWindows();
break;
}
}
camera.close();
printf("exit prgram\n");
return 0;
}
Rewrite the CMakeLists.txt as follows so that both the Section 1.2 test program and this section's program are built:
cmake_minimum_required(VERSION 3.10)
project(orbbec)
add_definitions(-std=c++11)
add_definitions(-DAPI_EXPORTS)
set(CMAKE_CXX_STANDARD 11)
set(CMAKE_BUILD_TYPE Release)
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${PROJECT_SOURCE_DIR}/bin)
include_directories(include/)
link_directories(lib/) # add extra library search paths, the equivalent of g++'s -L flag
include_directories(${PROJECT_SOURCE_DIR}/src/)
file(GLOB_RECURSE SRCS ${PROJECT_SOURCE_DIR}/src/*.cpp)
#opencv
Set(OpenCV_DIR D:/Project/opencv/build/)
find_package(OpenCV)
include_directories(${OpenCV_INCLUDE_DIRS})
include_directories(${OpenCV_DIR}/include/)
link_directories(${OpenCV_DIR}/x64/vc15/lib/)
#orbbec
set(OrbbecSDK_ROOT_DIR D:/Project/OrbbecSDK/SDK)
set(OrbbecSDK_LIBRARY_DIRS ${OrbbecSDK_ROOT_DIR}/lib)
set(OrbbecSDK_INCLUDE_DIR ${OrbbecSDK_ROOT_DIR}/include)
include_directories(${OrbbecSDK_INCLUDE_DIR})
link_directories(${OrbbecSDK_LIBRARY_DIRS})
#open3d
Set(Open3D_DIR D:/Project/open3d-0.18.0/CMake/)
find_package( Open3D REQUIRED)
include_directories(${Open3D_INCLUDE_DIRS})
link_directories(${Open3D_LIBRARY_DIRS})
#onnxruntime
Set(ONNXRUNTIME_ROOT_DIR D:/Project/onnxruntime-win-x64-gpu-1.14.0/)
include_directories(${ONNXRUNTIME_ROOT_DIR}/include/)
link_directories(${ONNXRUNTIME_ROOT_DIR}/lib/)
add_executable(test main1_test_camera.cpp ${SRCS})
target_link_libraries(test OrbbecSDK ${OpenCV_LIBS})
target_link_libraries(test ${Open3D_LIBRARIES})
target_include_directories(test PUBLIC ${Open3D_INCLUDE_DIRS})
target_link_libraries(test onnxruntime)
add_executable(main main.cpp ${SRCS})
target_link_libraries(main OrbbecSDK ${OpenCV_LIBS})
target_link_libraries(main ${Open3D_LIBRARIES})
target_include_directories(main PUBLIC ${Open3D_INCLUDE_DIRS})
target_link_libraries(main onnxruntime)
As before, generate the project with cmake-gui and build with VS. When running the exe files, remember to copy the required DLLs into the same directory, or the program will fail to start.
Also, if you change the if (0) in ProcessPointcloud in camera.cpp to if (1), you can visualize the grasp point and pose.
3 Closing remarks
In my tests the Orbbec camera's color image is quite dark; even after a lot of tweaking of the exposure and other parameters it never looked as natural as a RealSense color image. That may be down to the particular model I bought.
One last note: this post is best suited to readers with some C++ background; at a minimum you should know your way around cmake and VS.