Windows 11 + Visual Studio 2022: Installing OpenCV with CUDA 12.5, and the Pitfalls Along the Way

Published: 2024-07-27

I spent the weekend at home tinkering with this and wrote the process down.

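Prerequisites

  1. Visual Studio C++ build tools
  2. A CUDA version compatible with the local GPU
  3. A cuDNN version that matches the CUDA version
  4. Python 3
  5. NumPy
  6. The OpenCV source code, plus the opencv-contrib module source of the same version
  7. CMake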

Visual Studio

Download Visual Studio (I use VS2022) and, through the Visual Studio Installer, install the C++ toolset (the C++ workload). For the detailed installation steps, see the linked guide.

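CUDA and cuDNN

Download and install the latest CUDA Toolkit, making sure it is compatible with your GPU, or check the local path to see whether the CUDA Toolkit is already installed. On my machine, CUDA 12.5 is installed at C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5. Likewise, log in with an NVIDIA account and download cuDNN, then copy the contents of the cuDNN package (located at, for example, C:\Program Files\NVIDIA\CUDNN\vX.X) into the matching bin, include, and lib/x64 folders of the CUDA Toolkit directory. My machine has cuDNN 9.2.1.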
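After copying, it can help to confirm that cuDNN files actually landed in all three target folders. The following is a minimal sketch, not part of the original procedure: the helper name `find_cudnn_files` is hypothetical, and it assumes cuDNN file names start with `cudnn`.

```python
from pathlib import Path

# Sub-folders of the CUDA Toolkit that should receive cuDNN files.
TARGET_SUBDIRS = ("bin", "include", "lib/x64")


def find_cudnn_files(cuda_root):
    """Map each target sub-folder to the sorted cuDNN file names found in it."""
    root = Path(cuda_root)
    report = {}
    for sub in TARGET_SUBDIRS:
        folder = root / sub
        # Assumption: cuDNN files are named cudnn* (headers, DLLs, import libs).
        names = sorted(p.name for p in folder.glob("cudnn*")) if folder.is_dir() else []
        report[sub] = names
    return report


if __name__ == "__main__":
    # Path taken from the text above; adjust it to your own installation.
    cuda_root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5"
    for sub, names in find_cudnn_files(cuda_root).items():
        print(f"{sub}: {len(names)} cuDNN file(s)" if names else f"{sub}: MISSING")
```

An empty list for any sub-folder suggests that part of the copy was skipped.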

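Python, NumPy, and pip

Install Python 3.x. Because the Python bindings use NumPy arrays in place of cv::Mat, NumPy is also required: make sure it is installed (pip install numpy), and make sure any existing OpenCV packages, including opencv-python and opencv-contrib-python, are completely uninstalled.

pip uninstall opencv-python

pip uninstall opencv-contrib-python

Then delete the cv2 directory: YOUR_PYTHON_PATH/Lib/site-packages/cv2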
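To double-check that the environment is clean before building, a short script can look for leftover pip distributions and a stale cv2 folder. This is a sketch under assumptions: the helper names are hypothetical, and the headless package names are added by me as other common OpenCV wheels, not taken from the original text.

```python
import importlib.metadata as md
import sysconfig
from pathlib import Path

# Pip distributions that would shadow a self-built cv2 (headless variants are an assumption).
PIP_OPENCV_NAMES = {
    "opencv-python",
    "opencv-contrib-python",
    "opencv-python-headless",
    "opencv-contrib-python-headless",
}


def installed_opencv_wheels():
    """Return 'name==version' for every OpenCV pip distribution still installed."""
    found = []
    for dist in md.distributions():
        name = (dist.metadata.get("Name") or "").lower().replace("_", "-")
        if name in PIP_OPENCV_NAMES:
            found.append(f"{name}=={dist.version}")
    return sorted(found)


def leftover_cv2_dir(site_packages=None):
    """Return the cv2 directory inside site-packages if it still exists, else None."""
    root = Path(site_packages or sysconfig.get_paths()["purelib"])
    cv2_dir = root / "cv2"
    return cv2_dir if cv2_dir.is_dir() else None


if __name__ == "__main__":
    print("pip packages still installed:", installed_opencv_wheels())
    print("leftover cv2 directory:", leftover_cv2_dir())
```

Both results should be empty (an empty list and None) before you start the CMake configuration.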

OpenCV

Download the sources from the GitHub repositories, or clone them locally. You need both OpenCV and the opencv-contrib of the matching version.

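CMake configuration

Create a build directory for opencv and opencv-contrib, then configure with CMake. For the available options, see the official reference: OpenCV: OpenCV configuration options reference

This takes a long time. Along the way, CMake downloads the third-party content referenced in the 3rdparty folder; some of those downloads may fail and have to be fetched manually.

In this example we also enable Python: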

General configuration for OpenCV 4.10.0 =====================================

Version control: unknown



Platform:

Timestamp: 2024-07-20T06:31:04Z

Host: Windows 10.0.22631 AMD64

CMake: 3.29.0

CMake generator: Visual Studio 17 2022

CMake build tool: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/MSBuild/Current/Bin/amd64/MSBuild.exe

MSVC: 1940

Configuration: Debug Release



CPU/HW features:

Baseline: SSE SSE2 SSE3

requested: SSE3

Dispatched code generation: SSE4_1 SSE4_2 FP16 AVX AVX2 AVX512_SKX

requested: SSE4_1 SSE4_2 AVX FP16 AVX2 AVX512_SKX

SSE4_1 (18 files): + SSSE3 SSE4_1

SSE4_2 (2 files): + SSSE3 SSE4_1 POPCNT SSE4_2

FP16 (1 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 AVX

AVX (9 files): + SSSE3 SSE4_1 POPCNT SSE4_2 AVX

AVX2 (38 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2

AVX512_SKX (8 files): + SSSE3 SSE4_1 POPCNT SSE4_2 FP16 FMA3 AVX AVX2 AVX_512F AVX512_COMMON AVX512_SKX



C/C++:

Built as dynamic libs?: YES

C++ standard: 11

C++ Compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe (ver 19.40.33812.0)

C++ flags (Release): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /O2 /Ob2 /DNDEBUG

C++ flags (Debug): /DWIN32 /D_WINDOWS /W4 /GR /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /EHa /wd4127 /wd4251 /wd4324 /wd4275 /wd4512 /wd4589 /wd4819 /MP /Zi /Ob0 /Od /RTC1

C Compiler: C:/Program Files/Microsoft Visual Studio/2022/Enterprise/VC/Tools/MSVC/14.40.33807/bin/Hostx64/x64/cl.exe

C flags (Release): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /O2 /Ob2 /DNDEBUG

C flags (Debug): /DWIN32 /D_WINDOWS /W3 /D _CRT_SECURE_NO_DEPRECATE /D _CRT_NONSTDC_NO_DEPRECATE /D _SCL_SECURE_NO_WARNINGS /Gy /bigobj /Oi /fp:precise /MP /Zi /Ob0 /Od /RTC1

Linker flags (Release): /machine:x64 /INCREMENTAL:NO

Linker flags (Debug): /machine:x64 /debug /INCREMENTAL

ccache: NO

Precompiled headers: YES

Extra dependencies:

3rdparty dependencies:



OpenCV modules:

To be built: calib3d core dnn features2d flann gapi highgui imgcodecs imgproc java ml objdetect photo python3 stitching ts video videoio

Disabled: world

Disabled by dependency: -

Unavailable: python2

Applications: tests perf_tests apps

Documentation: NO

Non-free algorithms: NO



Windows RT support: NO



GUI: WIN32UI

Win32 UI: YES

VTK support: NO



Media I/O:

ZLib: build (ver 1.3.1)

JPEG: build-libjpeg-turbo (ver 3.0.3-70)

SIMD Support Request: YES

SIMD Support: NO

WEBP: build (ver encoder: 0x020f)

PNG: build (ver 1.6.43)

SIMD Support Request: YES

SIMD Support: YES (Intel SSE)

TIFF: build (ver 42 - 4.6.0)

JPEG 2000: build (ver 2.5.0)

OpenEXR: build (ver 2.3.0)

HDR: YES

SUNRASTER: YES

PXM: YES

PFM: YES



Video I/O:

DC1394: NO

FFMPEG: YES (prebuilt binaries)

avcodec: YES (58.134.100)

avformat: YES (58.76.100)

avutil: YES (56.70.100)

swscale: YES (5.9.100)

avresample: YES (4.0.0)

GStreamer: NO

DirectShow: YES

Media Foundation: YES

DXVA: YES



Parallel framework: Concurrency



Trace: YES (with Intel ITT)



Other third-party libraries:

Intel IPP: 2021.11.0 [2021.11.0]

at: D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/icv

Intel IPP IW: sources (2021.11.0)

at: D:/Data/source/collection/OpenCV/4100/build/3rdparty/ippicv/ippicv_win/iw

Lapack: NO

Eigen: NO

Custom HAL: NO

Protobuf: build (3.19.1)

Flatbuffers: builtin/3rdparty (23.5.9)



OpenCL: YES (NVD3D11)

Include path: D:/Data/source/collection/OpenCV/4100/opencv-4.10.0/3rdparty/include/opencl/1.2

Link libraries: Dynamic load



Python 3:

Interpreter: D:/miniconda3/python.exe (ver 3.11.7)

Libraries: D:/miniconda3/libs/python311.lib (ver 3.11.7)

Limited API: NO

numpy: D:/miniconda3/Lib/site-packages/numpy/core/include (ver 1.26.1)

install path: D:/miniconda3/Lib/site-packages/cv2/python-3.11



Python (for build): D:/miniconda3/python.exe



Java:

ant: NO

Java: YES (ver 1.8.0.371)

JNI: D:/Program Files/jdk-1.8/include D:/Program Files/jdk-1.8/include/win32 D:/Program Files/jdk-1.8/include

Java wrappers: YES (JAVA)

Java tests: NO



Install to: D:/Data/source/collection/OpenCV/4100/build/install

-----------------------------------------------------------------



Configuring done (136.2s)

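Once configuration finishes, modify the following two parameters. Of these, CUDA_ARCH_BIN is filled in automatically by current versions once the CUDA Toolkit has been found.

I had built VTK earlier, so I also added the VTK path (this is VTK's CMake build directory):

Click the "Generate" button to generate the VS project.

As the screenshot shows, the binding projects for both Java and Python have been created.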
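The same configure-and-generate step can also be scripted rather than done in the CMake GUI. The sketch below only assembles the command line. Note that the two parameters changed in the GUI are not named in the text, so `WITH_CUDA` and `OPENCV_EXTRA_MODULES_PATH` are my assumption about what a CUDA build with contrib modules typically needs. The function name is hypothetical; the paths and the 8.6 compute capability come from the console output in this post.

```python
def cmake_configure_args(source_dir, build_dir, contrib_modules_dir,
                         python_exe, cuda_arch_bin="8.6", vtk_dir=None):
    """Build a CMake command line for an OpenCV + CUDA + Python 3 configuration."""
    args = [
        "cmake",
        "-G", "Visual Studio 17 2022",   # generator shown in the summary above
        "-A", "x64",
        "-S", str(source_dir),
        "-B", str(build_dir),
        "-DWITH_CUDA=ON",                # assumption: enables the CUDA modules
        f"-DOPENCV_EXTRA_MODULES_PATH={contrib_modules_dir}",  # assumption
        "-DBUILD_opencv_python3=ON",
        f"-DPYTHON3_EXECUTABLE={python_exe}",
        f"-DCUDA_ARCH_BIN={cuda_arch_bin}",
    ]
    if vtk_dir:
        # Optional: point CMake at a self-built VTK (its CMake build directory).
        args += ["-DWITH_VTK=ON", f"-DVTK_DIR={vtk_dir}"]
    return args


if __name__ == "__main__":
    base = "D:/Data/source/collection/OpenCV/4100"
    args = cmake_configure_args(
        source_dir=f"{base}/opencv-4.10.0",
        build_dir=f"{base}/build",
        contrib_modules_dir=f"{base}/opencv_contrib-4.10.0/modules",
        python_exe="D:/miniconda3/python.exe",
    )
    print(" ".join(args))
    # To actually run it: subprocess.run(args, check=True)
```

Printing the command first makes it easy to compare against what the GUI shows before committing to a long configure run.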

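Building in Visual Studio

Build the ALL_BUILD project to compile everything, then build INSTALL to install.

Building and installing is another long wait...

Note: to install directly into the Python environment, open VS as administrator and then build the INSTALL project.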
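For unattended builds, the two Visual Studio steps map onto `cmake --build` invocations. A small sketch, assuming a Release configuration; the function name is hypothetical, and the commands would need an elevated prompt if INSTALL writes into a protected Python directory.

```python
def cmake_build_commands(build_dir, config="Release"):
    """Return the command lines that build ALL_BUILD and then INSTALL."""
    base = ["cmake", "--build", str(build_dir), "--config", config]
    return [
        base + ["--target", "ALL_BUILD"],  # compile every project
        base + ["--target", "INSTALL"],    # copy the results to the install paths
    ]


if __name__ == "__main__":
    for cmd in cmake_build_commands("D:/Data/source/collection/OpenCV/4100/build"):
        print(" ".join(cmd))
        # To actually run it: subprocess.run(cmd, check=True)
```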

Results and usage

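// Link against the OpenCV core library (4.10.0, release build).
#pragma comment(lib, "opencv_core4100.lib")

#include <iostream>
#include <opencv2/core/cuda.hpp>

int main()
{
    // Number of CUDA devices OpenCV can use (0 if OpenCV was built without CUDA).
    int deviceCount = cv::cuda::getCudaEnabledDeviceCount();
    std::cout << "CUDA Device Number: " << deviceCount << std::endl;
    if (deviceCount > 0)
        cv::cuda::printCudaDeviceInfo(0);  // Print details of the first device.
    return 0;
}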

Python 3.11.7 | packaged by Anaconda, Inc. | (main, Dec 15 2023, 18:05:47) [MSC v.1916 64 bit (AMD64)] on win32

Type "help", "copyright", "credits" or "license" for more information.

>>> import cv2

>>> print(cv2.__version__)

4.10.0

>>> print(cv2.cuda.getCudaEnabledDeviceCount())

1

>>> cv2.cuda.printCudaDeviceInfo(0)

*** CUDA Device Query (Runtime API) version (CUDART static linking) ***



Device count: 1



Device 0: "NVIDIA GeForce RTX 3070 Ti Laptop GPU"

  CUDA Driver Version / Runtime Version          12.50 / 12.50

  CUDA Capability Major/Minor version number:    8.6

  Total amount of global memory:                 8192 MBytes (8589410304 bytes)

  GPU Clock Speed:                               1.41 GHz

  Max Texture Dimension Size (x,y,z)             1D=(131072), 2D=(131072,65536), 3D=(16384,16384,16384)

  Max Layered Texture Size (dim) x layers        1D=(32768) x 2048, 2D=(32768,32768) x 2048

  Total amount of constant memory:               65536 bytes

  Total amount of shared memory per block:       49152 bytes

  Total number of registers available per block: 65536

  Warp size:                                     32

  Maximum number of threads per block:           1024

  Maximum sizes of each dimension of a block:    1024 x 1024 x 64

  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535

  Maximum memory pitch:                          2147483647 bytes

  Texture alignment:                             512 bytes

  Concurrent copy and execution:                 Yes with 1 copy engine(s)

  Run time limit on kernels:                     Yes

  Integrated GPU sharing Host Memory:            No

  Support host page-locked memory mapping:       Yes

  Concurrent kernel execution:                   Yes

  Alignment requirement for Surfaces:            Yes

  Device has ECC support enabled:                No

  Device is using TCC driver mode:               No

  Device supports Unified Addressing (UVA):      Yes

  Device PCI Bus ID / PCI location ID:           1 / 0

  Compute Mode:

      Default (multiple host threads can use ::cudaSetDevice() with device simultaneously)



deviceQuery, CUDA Driver = CUDART, CUDA Driver Version  = 12.50, CUDA Runtime Version = 12.50, NumDevs = 1
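Besides the device query, the build summary returned by `cv2.getBuildInformation()` can be checked for CUDA support. The sketch below parses that text. The helper names are hypothetical, and the `NVIDIA CUDA` label is my assumption about how the summary names the CUDA entry, so adjust the key if your build prints it differently.

```python
def build_info_value(build_info, key):
    """Return the value printed after `key:` in OpenCV's build summary, or None."""
    for line in build_info.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and name.strip() == key:
            return value.strip()
    return None


def cuda_enabled(build_info):
    """True if the summary has an 'NVIDIA CUDA' entry whose value starts with YES."""
    value = build_info_value(build_info, "NVIDIA CUDA")
    return value is not None and value.upper().startswith("YES")


if __name__ == "__main__":
    try:
        import cv2
    except ImportError:
        print("cv2 is not importable in this environment")
    else:
        info = cv2.getBuildInformation()
        print("CUDA enabled:", cuda_enabled(info))
        print("cuDNN:", build_info_value(info, "cuDNN"))
```

This is useful when `getCudaEnabledDeviceCount()` returns 0 and you need to tell a build without CUDA apart from a driver problem.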

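Problems encountered

  1. Interference from the Python version installed with Visual Studio

When I installed VS2022 I had also installed the Python development workload (Python 3.9), and CMake locked onto that environment: even with the interpreter pointing at the conda Python environment, Libraries still pointed at 3.9. Uninstalling the Python component from VS fixes this.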
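This kind of mismatch is visible in the "Python 3:" part of the CMake summary, where the Interpreter and Libraries lines each carry a version. A minimal sketch that compares the two follows; the function name is hypothetical, and it assumes both lines use the `(ver X.Y.Z)` format seen in the summary earlier in this post.

```python
import re

# Matches the major and minor numbers in "(ver 3.11.7)".
_VER = re.compile(r"\(ver (\d+)\.(\d+)")


def python_versions_match(summary):
    """True if the Interpreter and Libraries lines report the same major.minor."""
    versions = {}
    for line in summary.splitlines():
        label, _, rest = line.strip().partition(":")
        if label in ("Interpreter", "Libraries"):
            m = _VER.search(rest)
            if m:
                versions[label] = (int(m.group(1)), int(m.group(2)))
    return len(versions) == 2 and versions["Interpreter"] == versions["Libraries"]


if __name__ == "__main__":
    good = (
        "Interpreter: D:/miniconda3/python.exe (ver 3.11.7)\n"
        "Libraries: D:/miniconda3/libs/python311.lib (ver 3.11.7)\n"
    )
    print(python_versions_match(good))
```

If the function returns False, fix the Python paths in CMake before generating the project.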

    2. Warnings caused by a missing Nvidia Video Codec SDK

CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:26 (message):
cudacodec::VideoReader requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVID=OFF

CMake Warning at D:/Data/source/collection/OpenCV/4100/opencv_contrib-4.10.0/modules/cudacodec/CMakeLists.txt:30 (message):
cudacodec::VideoWriter requires Nvidia Video Codec SDK. Please resolve
dependency or disable WITH_NVCUVENC=OFF

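Download the NVIDIA Video Codec SDK and copy its lib files and header files (the Interface directory) into the lib/x64 and include directories of the CUDA Toolkit, respectively. That resolves the warnings.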
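A quick way to confirm the copy before re-running CMake is to check for the SDK files by name. This is a sketch: the function name is hypothetical, and the file names listed are the ones I would expect from the Video Codec SDK's Interface and Lib/x64 folders, so treat them as an assumption and compare against your SDK version.

```python
from pathlib import Path

# Assumed Video Codec SDK file names, keyed by target folder in the CUDA Toolkit.
EXPECTED = {
    "include": ("nvcuvid.h", "cuviddec.h", "nvEncodeAPI.h"),
    "lib/x64": ("nvcuvid.lib", "nvencodeapi.lib"),
}


def missing_codec_sdk_files(cuda_root):
    """Return the expected SDK files that are not present under cuda_root."""
    root = Path(cuda_root)
    return [
        f"{sub}/{name}"
        for sub, names in EXPECTED.items()
        for name in names
        if not (root / sub / name).is_file()
    ]


if __name__ == "__main__":
    cuda_root = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.5"
    print("missing:", missing_codec_sdk_files(cuda_root))
```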

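    3. An error caused by the CUDA version: CMake Error at cmake/OpenCVDetectCUDAUtils.cmake:297 (list) list GET given empty list

My Visual Studio is 17.10.4, and building against CUDA 12.2 triggers this error because, according to the official documentation, CUDA Toolkit 12.2 supports Visual Studio only up to 17.0: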

CUDA Installation Guide Microsoft Windows (nvidia.com)

Switching to CUDA 12.5 resolves the problem:

https://docs.nvidia.com/cuda/archive/12.5.0/cuda-installation-guide-microsoft-windows/index.html#system-requirements

    4. Missing DLL error when using cv2 from Python

ImportError: DLL load failed while importing cv2: The specified module could not be found.

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

  File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 181, in <module>

    bootstrap()

  File "D:\miniconda3\Lib\site-packages\cv2\__init__.py", line 153, in bootstrap

    native_module = importlib.import_module("cv2")

                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

  File "D:\miniconda3\Lib\importlib\__init__.py", line 126, in import_module

    return _bootstrap._gcd_import(name[level:], package, level)

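The message says a DLL is missing. I ran Process Monitor with a filter on python.exe, reproduced the error, and traced the cause:

It turned out to be caused by the VTK I had built myself. Having built it myself, I had to see it through, so I added its path to cv2's config.py, but that led to other errors (my own fault for keeping VTK's debug and release builds in the same place). The fix was to extract only the release build and then either add it to the environment variables or to cv2's config.py, or copy it directly into the site-packages/cv2/python-3.11 directory. Problem solved.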
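Another option, not the one used above, is to register the folder that holds the release DLLs at runtime before importing cv2. The sketch below uses `os.add_dll_directory`, which exists on Windows in Python 3.8 and later. The helper name and the VTK path in the example are hypothetical.

```python
import os
from pathlib import Path


def add_dll_dirs(dirs):
    """Register each existing directory with the Windows DLL loader.

    Returns the directories that exist; on non-Windows platforms nothing is
    registered because os.add_dll_directory is unavailable there.
    """
    added = []
    for d in dirs:
        p = Path(d).resolve()
        if not p.is_dir():
            continue  # skip paths that do not exist
        if hasattr(os, "add_dll_directory"):
            os.add_dll_directory(str(p))
        added.append(str(p))
    return added


if __name__ == "__main__":
    # Hypothetical location of the release-only VTK DLLs.
    print(add_dll_dirs([r"D:\hypothetical\vtk\release\bin"]))
    # After this call, `import cv2` can resolve DLLs from the registered folders.
```

Keeping only release DLLs in the registered folder avoids the debug and release mix-up described above.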

References

Quick and Easy OpenCV Python Installation with Cuda GPU in Under 10 Minutes (youtube.com)

GitHub - chrismeunier/OpenCV-CUDA-installation: Saving the process to install OpenCV for Python 3 with CUDA bindings

Unable to enable Cudacodec VideoReader · Issue #11220 · opencv/opencv · GitHub

OpenCV: OpenCV configuration options reference

CUDA Installation Guide for Microsoft Windows (nvidia.com)

Building and installing opencv-python on Windows 10 + VS2022 (rebuilding opencv-python), CSDN blog

