Fixing the DeepSpeed installation error: No module named 'dskernels'

Published: 2024-09-19

Running pip install deepspeed fails with the following error:

  Using cached https://pypi.tuna.tsinghua.edu.cn/packages/61/e6/04e2f2de08253e6b779fe7706f2e06d8fb48353e1d33a2fd7805062213d4/deepspeed-0.12.3.tar.gz (1.2 MB)
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [16 lines of output]
      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "C:\Users\Gigabyte\AppData\Local\Temp\pip-install-ld11i_m4\deepspeed_ab5cdb8c6e9d4fef8c9a63f1f2afa7cc\setup.py", line 100, in <module>
          cuda_major_ver, cuda_minor_ver = installed_cuda_version()
        File "C:\Users\Gigabyte\AppData\Local\Temp\pip-install-ld11i_m4\deepspeed_ab5cdb8c6e9d4fef8c9a63f1f2afa7cc\op_builder\builder.py", line 43, in installed_cuda_version
          output = subprocess.check_output([cuda_home + "/bin/nvcc", "-V"], universal_newlines=True)
        File "D:\anaconda3\envs\got\lib\subprocess.py", line 421, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
        File "D:\anaconda3\envs\got\lib\subprocess.py", line 503, in run
          with Popen(*popenargs, **kwargs) as process:
        File "D:\anaconda3\envs\got\lib\subprocess.py", line 971, in __init__
          self._execute_child(args, executable, preexec_fn, close_fds,
        File "D:\anaconda3\envs\got\lib\subprocess.py", line 1456, in _execute_child
          hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
      FileNotFoundError: [WinError 2] 系统找不到指定的文件。
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

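Error 1: the matching CUDA version needs to be installed

The traceback ends in FileNotFoundError ([WinError 2], "The system cannot find the file specified") when setup.py tries to run nvcc -V, which means no nvcc executable was found under the CUDA directory.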


conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.4 -c pytorch -c nvidia
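
Before retrying the install, it can help to confirm that an nvcc executable actually exists, since the traceback shows setup.py looking for it at cuda_home + "/bin/nvcc". The following is a minimal sketch, not DeepSpeed's own logic: the helper name find_nvcc and the choice of the CUDA_HOME and CUDA_PATH environment variables are assumptions made for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_nvcc(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the path to nvcc under CUDA_HOME or CUDA_PATH, or None.

    Hypothetical helper: DeepSpeed resolves its CUDA directory itself;
    this only mirrors the "<cuda_home>/bin/nvcc" lookup seen in the traceback.
    """
    for var in ("CUDA_HOME", "CUDA_PATH"):
        cuda_home = env.get(var)
        if not cuda_home:
            continue
        # On Windows the executable is nvcc.exe; elsewhere it is plain nvcc.
        for name in ("nvcc.exe", "nvcc"):
            candidate = Path(cuda_home) / "bin" / name
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    nvcc = find_nvcc()
    if nvcc:
        print(f"nvcc found at {nvcc}")
    else:
        print("nvcc not found under CUDA_HOME or CUDA_PATH")
```

If the script reports that nvcc was not found, the same FileNotFoundError is likely to appear again when pip runs setup.py.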

Error 2: No module named 'dskernels'

https://github.com/microsoft/DeepSpeed?tab=readme-ov-file

git clone https://github.com/microsoft/DeepSpeed.git
cd DeepSpeed
python setup.py bdist_wheel

The build fails with No module named 'dskernels' (the LNK1181 lines in the log mean "cannot open input file"):

[2024-09-18 15:27:34,537] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-09-18 15:27:35,688] [INFO] [real_accelerator.py:203:get_accelerator] Setting ds_accelerator to cuda (auto detect)
test.c
LINK : fatal error LNK1181: 无法打开输入文件“aio.lib”
test.c
LINK : fatal error LNK1181: 无法打开输入文件“cufile.lib”
W0918 15:27:39.535000 20116 torch\distributed\elastic\multiprocessing\redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
DS_BUILD_OPS=1
test.c
LINK : fatal error LNK1181: 无法打开输入文件“aio.lib”
 [WARNING]  Skip pre-compile of incompatible evoformer_attn; One can disable evoformer_attn with DS_BUILD_EVOFORMER_ATTN=0
 [WARNING]  Skip pre-compile of incompatible fp_quantizer; One can disable fp_quantizer with DS_BUILD_FP_QUANTIZER=0
test.c
LINK : fatal error LNK1181: 无法打开输入文件“cufile.lib”
 [WARNING]  Skip pre-compile of incompatible gds; One can disable gds with DS_BUILD_GDS=0
 [WARNING]  Filtered compute capabilities ['6.0', '6.1', '7.0']
Traceback (most recent call last):
  File "D:\Projects\DeepSpeed-master\setup.py", line 200, in <module>
    ext_modules.append(builder.builder())
  File "D:\Projects\DeepSpeed-master\op_builder\builder.py", line 722, in builder
    extra_link_args=self.strip_empty_entries(self.extra_ldflags()))
  File "D:\Projects\DeepSpeed-master\op_builder\inference_cutlass_builder.py", line 74, in extra_ldflags
    import dskernels
ModuleNotFoundError: No module named 'dskernels'
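
The last frame of the traceback is a plain import dskernels inside inference_cutlass_builder.py, so the missing module can be confirmed without running the whole build. A minimal sketch using the standard library's importlib.util.find_spec; the helper name is_importable is hypothetical:

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_importable("dskernels"):
        print("dskernels is available")
    else:
        print("dskernels is missing; pre-compiling the op that imports it will fail")
```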

Reference: https://blog.csdn.net/weixin_46587777/article/details/137937431


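The log above prints DS_BUILD_OPS=1, meaning the build tries to pre-compile the ops, including the one that imports dskernels. The fix is to disable pre-compilation in the same Command Prompt session and then rerun the build:

set DS_BUILD_OPS=0
python setup.py bdist_wheel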
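
The set command only affects the current Command Prompt session. As an alternative, the variable can be passed to the build from a small Python wrapper. This is a sketch under assumptions: the helper names build_env and build_wheel are hypothetical, and source_dir must point at the DeepSpeed source checkout.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping


def build_env(base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Copy the environment and disable pre-compilation of DeepSpeed ops."""
    env = dict(base)
    env["DS_BUILD_OPS"] = "0"
    return env


def build_wheel(source_dir: str) -> None:
    """Run `python setup.py bdist_wheel` in source_dir with DS_BUILD_OPS=0."""
    subprocess.run(
        [sys.executable, "setup.py", "bdist_wheel"],
        cwd=source_dir,      # directory that contains DeepSpeed's setup.py
        env=build_env(),     # only the child process sees DS_BUILD_OPS=0
        check=True,          # raise CalledProcessError if the build fails
    )
```

For example, build_wheel(r"D:\Projects\DeepSpeed-master") would run the build in the directory that appears in the traceback, leaving the parent shell's environment untouched.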

After that, the installation succeeds.