[BUG] Ubuntu | nvcc works, but the nvidia-smi command is missing and the nvidia-driver package cannot be found

Published: 2025-02-15

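Strangely, this setup used to work, and after I came back from a holiday it no longer did.

I went through every step below before the problem was finally solved.

My Ubuntu version: Ubuntu 22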
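
The package versions suggested in step 2 below carry a `24.04` suffix, so it may be worth confirming the exact release before picking packages. A minimal sketch, assuming the standard `VERSION_ID` field of `/etc/os-release`; the function name `ubuntu_release` is made up for this example:

```shell
#!/usr/bin/env bash
# Print the release number (e.g. "22.04") from an os-release style file.
# The path is a parameter so the function can also be tried on a copy.
ubuntu_release() {
    local os_release_file="${1:-/etc/os-release}"
    # VERSION_ID is normally written as VERSION_ID="22.04"; the quotes are optional.
    sed -n 's/^VERSION_ID="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$os_release_file"
}

# Show the release of the current machine if the file is readable.
if [ -r /etc/os-release ]; then
    ubuntu_release
fi
```

If the printed release differs from what you expected, the driver package names and versions offered by apt will differ as well.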

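  1. nvcc -V: works. If it is missing, install it with "sudo apt-get install nvidia-cuda-toolkit"; for other nvcc problems, please look at other blog posts.
    sudo apt-get install nvidia-cuda-toolkit
    Reading package lists... Done
    Building dependency tree... Done
    Reading state information... Done
    nvidia-cuda-toolkit is already the newest version (12.0.140~12.0.1-4build4).
    0 upgraded, 0 newly installed, 0 to remove and 196 not upgraded.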
    
  2. nvidia-smi: missing. zsh just prints "command not found"; bash also reports that the command is not found, but lists the packages that provide it.
    nvidia-smi
    Command 'nvidia-smi' not found, but can be installed with:
    apt install nvidia-utils-470         # version 470.256.02-0ubuntu0.24.04.1, or
    apt install nvidia-utils-470-server  # version 470.256.02-0ubuntu0.24.04.1
    apt install nvidia-utils-535         # version 535.183.01-0ubuntu0.24.04.1
    apt install nvidia-utils-535-server  # version 535.216.01-0ubuntu0.24.04.1
    apt install nvidia-utils-550         # version 550.120-0ubuntu0.24.04.1
    apt install nvidia-utils-525         # version 525.147.05-0ubuntu1
    apt install nvidia-utils-525-server  # version 525.147.05-0ubuntu1
    apt install nvidia-utils-550-server  # version 550.127.05-0ubuntu0.24.04.1
    
  3. Try to install nvidia-driver: "Unable to locate package".
  4. Add the package source:
    sudo add-apt-repository ppa:graphics-drivers/ppa
    sudo apt-get update
    
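  5. Try to install nvidia-driver again: still "Unable to locate package".
  6. Install an arbitrary one: sudo apt-get install nvidia-utils-<a high version number picked at random>
  7. Run nvidia-smi again: at least there is output now, but it reports a version mismatch.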
    nvidia-smi
    Failed to initialize NVML: Driver/library version mismatch
    NVML library version: 550.144
    
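  8. Posts online said the driver would upgrade itself after a restart, so I ran sudo reboot right away. It made no difference.
  9. I asked several LLMs. They suggested listing the available drivers and then picking the one marked "recommended" (for example, nvidia-driver-550): ubuntu-drivers devices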
    (base) ➜  pure_experiments_env ubuntu-drivers devices                                           
    udevadm hwdb is deprecated. Use systemd-hwdb instead.
    (the warning above was printed 14 times in total)
    == /sys/devices/pci0000:3a/0000:3a:00.0/0000:3b:00.0 ==
    modalias : pci:v00
    vendor   : NVIDIA Corporation
    model    : AD102 [GeForce RTX 4090]
    manual_install: True
    driver   : nvidia-driver-560-open - third-party non-free
    driver   : nvidia-driver-545-open - third-party non-free
    driver   : nvidia-driver-550 - third-party non-free
    driver   : nvidia-driver-550-open - third-party non-free
    driver   : nvidia-driver-535-server-open - distro non-free
    driver   : nvidia-driver-535-open - third-party non-free
    driver   : nvidia-driver-535-server - distro non-free
    driver   : nvidia-driver-570-open - third-party non-free
    driver   : nvidia-driver-565-open - third-party non-free
    driver   : nvidia-driver-560 - third-party non-free recommended
    driver   : nvidia-driver-535 - third-party non-free
    driver   : nvidia-driver-570 - third-party non-free
    driver   : nvidia-driver-565 - third-party non-free
    driver   : nvidia-driver-545 - third-party non-free
    driver   : xserver-xorg-video-nouveau - distro free builtin
    
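    Aha, so the package name needs a version suffix (for example nvidia-driver-560) before apt can find nvidia-driver. No wonder the install kept failing.
  10. After installing 560, the nvidia-smi command still could not be found. The LLMs again recommended a reboot, and this time I felt it might really be necessary, so I rebooted once more: sudo reboot
  11. And it worked! Hooray!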
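
The "Driver/library version mismatch" message in step 7 generally means the loaded NVIDIA kernel module and the userspace NVML library have different versions. The sketch below compares the two. The helper names are invented for this example, and the `NVRM version: ... <version> ...` layout of `/proc/driver/nvidia/version` is an assumption about that file's format rather than something shown in this post:

```shell
#!/usr/bin/env bash
# Extract the userspace NVML version from nvidia-smi's mismatch message (stdin).
nvml_library_version() {
    sed -n 's/^NVML library version: *//p'
}

# Extract the kernel module version from text in the assumed
# /proc/driver/nvidia/version layout (stdin). Only the NVRM line is used,
# so a later "GCC version" line cannot be picked up by mistake.
kernel_module_version() {
    grep '^NVRM version' | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Report whether the two version strings are identical.
check_versions() {
    local kernel="$1" library="$2"
    if [ "$kernel" = "$library" ]; then
        echo "match: $kernel"
    else
        echo "mismatch: kernel module $kernel, NVML library $library"
        return 1
    fi
}
```

On a live machine this could be used as `check_versions "$(kernel_module_version < /proc/driver/nvidia/version)" "$(nvidia-smi 2>&1 | nvml_library_version)"`. If the file under `/proc` does not exist, no NVIDIA kernel module is loaded at all, which would also fit the symptoms above.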

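
Picking the versioned package by eye from the `ubuntu-drivers devices` listing is easy to get wrong, so here is a small sketch that pulls out the entry marked "recommended". The function name `recommended_driver` is hypothetical, and the parsing assumes the `driver   : <package> - ... recommended` line layout shown in step 9:

```shell
#!/usr/bin/env bash
# Read `ubuntu-drivers devices` output on stdin and print the package
# name from the line whose last word is "recommended".
recommended_driver() {
    # Fields on such a line: "driver", ":", "<package>", "-", ..., "recommended"
    awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}
```

For example, `ubuntu-drivers devices 2>&1 | recommended_driver` would print nvidia-driver-560 for the listing above, and the result can then be passed to `sudo apt-get install`. A reboot afterwards, as in step 10, loads the new kernel module.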

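P.S. Voting for the Blog Star awards is open at the moment, and my link is https://www.csdn.net/blogstar2024/detail/151. Could you spare a vote? I am hoping for the printed certificate given to the top 100, and I have been hovering around 130, so I am only a little short.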

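All articles on this account are original. Reposting is welcome; please credit the source: https://shandianchengzi.blog.csdn.net/article/details/145640869 . Baidu and scraper sites cannot be trusted, so check search results carefully. Technical articles tend to go out of date, and I revise and update my posts from time to time, so please visit the source for the latest version of this article.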