Foreword
This post walks through the full pipeline of taking a model in Hugging Face format, converting it to GGUF with llama.cpp, and deploying it locally with Ollama, using DeepSeek-R1-Distill-Llama-70B as the working example.
I. Download the model
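There is nothing special here: pull the original weights from Hugging Face into a local directory. A minimal sketch using huggingface-cli (assuming the huggingface_hub CLI is installed; the target directory is illustrative and matches the path used in the conversion step later):
# download the full HF repo (weights, tokenizer, config) into a local directory
pip install -U huggingface_hub
huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Llama-70B --local-dir /mnt/ollama/deepseek/deepseek-ai/DeepSeek-R1-Distill-Llama-70B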
II. Convert the model
If the model you downloaded is in Hugging Face (HF) format, it has to be converted to GGUF, because deploying with Ollama this way requires the GGUF format. If the model you want already ships a GGUF version, just download that directly and skip the conversion; otherwise, convert it as follows.
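When GGUF files are already published, they can be fetched directly and the rest of this section skipped; a sketch (the repo id here is hypothetical):
# fetch only the GGUF files from a repo that ships them
huggingface-cli download some-org/Some-Model-GGUF --include "*.gguf" --local-dir ./some-model-gguf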
1. Download the conversion tool
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp$ sudo git clone https://github.com/ggerganov/llama.cpp
Cloning into 'llama.cpp'...
remote: Enumerating objects: 44551, done.
remote: Counting objects: 100% (140/140), done.
remote: Compressing objects: 100% (98/98), done.
Receiving objects: 63% (28068/44551), 71.18 MiB | 2.54 MiB/s
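If the clone is slow, a shallow clone is enough for converting and building (a sketch):
git clone --depth 1 https://github.com/ggerganov/llama.cpp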
2. Install the environment dependencies
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp$ pip install -e .
WARNING: Ignoring invalid distribution ~riton (/home/defaultuser/anaconda3/lib/python3.12/site-packages)
WARNING: Ignoring invalid distribution ~orch (/home/defaultuser/anaconda3/lib/python3.12/site-packages)
Obtaining file:///mnt/ollama/deepseek/llamacpp/llama.cpp
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Downloading protobuf-4.25.6-cp37-abi3-manylinux2014_x86_64.whl (294 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 294.6/294.6 kB 107.8 kB/s eta 0:00:00
Downloading torch-2.6.0-cp312-cp312-manylinux1_x86_64.whl (766.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.9/766.6 MB 60.9 kB/s eta 3:29:38
3. llama.cpp
1. Conversion script dependencies
Install the dependencies for the conversion scripts; the top-level requirements.txt pulls in all of them, including those for convert_hf_to_gguf.py.
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp$ pip install -r requirements.txt
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/cpu, https://download.pytorch.org/whl/cpu
Requirement already satisfied: numpy~=1.26.4 in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from -r ./requirements/requirements-convert_legacy_llama.txt (line 1)) (1.26.4)
Requirement already satisfied: gguf>=0.1.0 in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from -r ./requirements/requirements-convert_legacy_llama.txt (line 4)) (0.10.0)
Collecting protobuf<5.0.0,>=4.21.0 (from -r ...)
Downloading https://download.pytorch.org/whl/cpu/torch-2.2.2%2Bcpu-cp312-cp312-linux_x86_64.whl (186.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 186.7/186.7 MB 6.1 MB/s eta 0:00:00
Requirement already satisfied: filelock in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from transformers<5.0.0,>=4.45.1->-r ./requirements/requirements-convert_legacy_llama.txt (line 3)) (3.17.0)
2. Install the convert_hf_to_gguf.py dependencies
Next, install the packages required specifically by the convert_hf_to_gguf.py script.
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp$ pip install -r requirements/requirements-convert_hf_to_gguf.txt
WARNING: Ignoring invalid distribution ~riton (/home/defaultuser/anaconda3/lib/python3.12/site-packages)
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cpu
Requirement already satisfied: numpy~=1.26.4 in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from -r requirements/./requirements-convert_legacy_llama.txt (line 1)) (1.26.4)
Requirement already satisfied: sentencepiece~=0.2.0 in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from -r requirements/./requirements-convert_legacy_llama.txt (line 2)) (0.2.0)
Requirement already satisfied: transformers<5.0.0,>=4.45.1 in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from -r requirements/./requirements-convert_legacy_llama.txt (line 3)) (4.49.0)
Requirement already satisfied: gguf>=0.1.0 in /home/defaultuser/anaconda3/lib/python3.12/site-packages (from -r requirements/./requirements-convert_legacy_llama.txt (line 4)) (0.10.0)
3. Build and install llama.cpp
First, try make directly:
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp$ make
Makefile:2: *** The Makefile build is deprecated. Use the CMake build instead. For more details, see https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md. Stop.
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp$
The error says the Makefile build is deprecated, so install CMake first:
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp$ sudo apt install cmake
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
cmake-data dh-elpa-helper emacsen-common libjsoncpp25 librhash0
Suggested packages:
cmake-doc ninja-build cmake-format
Configure the build with CMake:
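The transcript below already runs from a build subdirectory, which has to be created first (a sketch):
# create an out-of-tree build directory and enter it
mkdir build && cd build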
defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp/build$ sudo cmake ..
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- CMAKE_SYSTEM_PROCESSOR: x86_64
-- Including CPU backend
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
-- x86 detected
-- Adding CPU backend variant ggml-cpu: -march=native
-- Configuring done
-- Generating done
-- Build files have been written to: /mnt/ollama/deepseek/llamacpp/llama.cpp/build
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp/build$
Build the project:
(base) defaultuser@qin-h100-jumper-server:/mnt/ollama/deepseek/llamacpp/llama.cpp/build$ sudo cmake --build . --config Release
[ 0%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml.c.o
[ 1%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-alloc.c.o
[ 1%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-backend.cpp.o
[ 2%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-opt.cpp.o
[ 2%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/ggml-threading.cpp.o
[ 3%] Building C object ggml/src/CMakeFiles/ggml-base.dir/ggml-quants.c.o
[ 3%] Building CXX object ggml/src/CMakeFiles/ggml-base.dir/gguf.cpp.o
[ 4%] Linking CXX shared library ../../bin/libggml-base.so
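By default the build runs serially; cmake --build accepts -j to parallelize across cores (a sketch):
sudo cmake --build . --config Release -j $(nproc)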
4. Convert the format
python convert_hf_to_gguf.py /mnt/ollama/deepseek/deepseek-ai/DeepSeek-R1-Distill-Llama-70B --outtype f16 --outfile /mnt/ollama/deepseek/deepseek-ai/DeepSeek-R1-Distill-Llama-70B.gguf
The conversion starts; wait for it to finish.
INFO:hf-to-gguf:gguf: embedding length = 8192
INFO:hf-to-gguf:gguf: feed forward length = 28672
INFO:hf-to-gguf:gguf: head count = 64
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 500000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Adding 280147 merge(s).
INFO:gguf.vocab:Setting special token type bos to 128000
INFO:gguf.vocab:Setting special token type eos to 128001
INFO:gguf.vocab:Setting chat_template to ... (long DeepSeek Jinja chat template, truncated)
INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:/mnt/ollama/deepseek/deepseek-ai/DeepSeek-R1-Distill-Llama-70B.gguf: n_tensors = 724, total_size = 141.1G
Writing: 12%|██████████▏ | 17.6G/141G [01:33<12:08, 170Mbyte/s]
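The f16 GGUF comes out at roughly 141 GB. If that is too large to serve, it can be shrunk with the llama-quantize binary produced by the build above (a sketch; Q4_K_M is just one common quantization type):
# quantize the f16 GGUF down to 4-bit Q4_K_M
./build/bin/llama-quantize /mnt/ollama/deepseek/deepseek-ai/DeepSeek-R1-Distill-Llama-70B.gguf /mnt/ollama/deepseek/deepseek-ai/DeepSeek-R1-Distill-Llama-70B-Q4_K_M.gguf Q4_K_M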
III. Deploy with Ollama
1. Install and start Ollama
curl https://ollama.com/install.sh | sh
ollama serve
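To confirm the server is up, the root endpoint returns a plain-text banner on the default port 11434:
curl http://localhost:11434
# expected output: Ollama is running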
2. Add the model
In the same directory as the model, create a file named Modelfile and write the path of the GGUF file into it:
(base) root@test-6U8X-EGS2:/mnt/ollama/deepseek# ls
anaconda3.tgz DeepSeek-R1 DeepSeek-R1-Distill-Llama-70B.gguf home
code DeepSeek-R1-Distill-Llama-70B DeepSeek-R1-Distill-Qwen-1___5B vllm
(base) root@test-6U8X-EGS2:/mnt/ollama/deepseek# touch Modelfile
(base) root@test-6U8X-EGS2:/mnt/ollama/deepseek# nano Modelfile
(base) root@test-6U8X-EGS2:/mnt/ollama/deepseek# cat Modelfile
FROM ./DeepSeek-R1-Distill-Llama-70B.gguf
(base) root@test-6U8X-EGS2:/mnt/ollama/deepseek#
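FROM is the only required line, but a Modelfile can also carry runtime parameters; a sketch with illustrative values:
FROM ./DeepSeek-R1-Distill-Llama-70B.gguf
# optional sampling/context settings (example values, not tuned recommendations)
PARAMETER temperature 0.7
PARAMETER num_ctx 4096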
Import the model into Ollama, then run list to check that it is there; the entry tagged latest is the one just imported.
(base) root@test-6U8X-EGS2:/home/test/ollama-main/bin# ./ollama create DeepSeek-R1-Distill-Llama-70B -f /mnt/ollama/deepseek/Modelfile
gathering model components
copying file sha256:bf5df985d4bcfffe15c57780fe7b362b320c6d3d86b07ca16976d511374ed970 100%
parsing GGUF
using existing layer sha256:bf5df985d4bcfffe15c57780fe7b362b320c6d3d86b07ca16976d511374ed970
writing manifest
success
(base) root@test-6U8X-EGS2:/home/test/ollama-main/bin# ./ollama list
NAME ID SIZE MODIFIED
DeepSeek-R1-Distill-Llama-70B:latest 22bd5a29702f 141 GB 45 seconds ago
deepseek-r1-q2:671b bf71f995ebf9 226 GB 38 hours ago
deepseek-r1:671b 739e1b229ad7 404 GB 4 days ago
deepseek-r1:70b 0c1615a8ca32 42 GB 5 days ago
(base) root@test-6U8X-EGS2:/home/test/ollama-main/bin#
3. Test run
(base) root@test-6U8X-EGS2:/home/test/ollama-main/bin# ./ollama run DeepSeek-R1-Distill-Llama-70B:latest
>>> Tell me about Shanghai
<think>
Hmm, the user asked me to "tell them about Shanghai". First, I need to work out what they are interested in: history, economy, culture, or sights? It is probably meant as a general introduction, so I should be fairly comprehensive.
As China's largest city, Shanghai has plenty worth covering. International metropolis, economic center: those basics have to be mentioned. Then its location and population, basic figures that help the user form an overall impression.
Next, some history, such as when the port opened and the development after reform and opening-up, so the user can see its growth trajectory and current standing.
On the economy, the financial center, the port, and the industrial layout are the key points. Mentioning the transformation of both banks of the Huangpu River gives a vivid picture of the city's prosperity.
Cultural diversity is another hallmark: the blend of Eastern and Western cultures is quite appealing. It suits both living and tourism, and the user may be weighing a visit, so I can briefly mention famous sights such as the Bund and the City God Temple without going into too much detail.
Finally, a little personal perspective on Shanghai's appeal will make the introduction livelier. In short: comprehensive but not long-winded, leaving the user with a clear and rich picture of Shanghai.
</think>
Shanghai is an international metropolis and China's center of economy, finance, trade, and shipping. Here is a basic introduction:
1. **Location**:
- Shanghai lies in eastern China, north of the Yangtze River estuary, on both banks of the Huangpu River, and is one of China's most populous cities.
2. **History and development**:
- Since its port opened in 1843, Shanghai has been a key window for China's opening to the outside world.
- After reform and opening-up in the 1990s, Shanghai rapidly grew into a major global economic center.
3. **Economy and finance**:
- Shanghai is China's largest economic center and one of the world's major financial hubs. The Shanghai Stock Exchange is among China's most important stock markets.
- The Port of Shanghai is one of the busiest in the world, ranking near the top in cargo and container throughput.
4. **Culture and diversity**:
- Shanghai is an international city that blends Chinese and Western cultures. Traditional spots such as the Old City God Temple stand alongside modern commercial areas like the Bund and the Nanjing Road pedestrian street.
- The people of Shanghai are known for openness and inclusiveness, accommodating a wide range of cultures and lifestyles.
5. **Attractions**:
- The Bund: one of Shanghai's signature views, brilliantly lit at night.
- City God Temple: a hub of traditional food and snacks and a symbol of Shanghai culture.
- Shanghai Disneyland: great for families.
- Zhubao Ge: on the Oriental Pearl Tower, offering a panoramic view of the whole city.
6. **Living**:
- Shanghai is a vibrant city with rich entertainment, shopping, and cultural offerings, and it attracts large numbers of migrant workers and expatriates.
In short, Shanghai is an international metropolis with both traditional charm and a modern feel, a place well worth exploring.
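Besides the interactive run shown above, the imported model can also be queried over Ollama's HTTP API (default port 11434); a sketch:
# one-shot generation via the REST API (stream disabled for a single JSON reply)
curl http://localhost:11434/api/generate -d '{
  "model": "DeepSeek-R1-Distill-Llama-70B",
  "prompt": "Tell me about Shanghai",
  "stream": false
}'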