[Ollama] A Framework for Running Large Models

Published: 2025-03-31

Official website
Chinese introduction on GitHub

Installation and Running

Installation tutorial
Install:

wget https://ollama.com/download/ollama-linux-amd64.tgz
tar -xzvf ollama-linux-amd64.tgz

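Set an environment variable pointing at the Ollama directory: export OLLAMA_HOME=/data1/ztshao/programs/ollama-linux-amd64
Then add that directory's bin subdirectory to PATH.
Start the server: ollama serve
Check the installation: ollama -v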
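
The two environment steps above can be sketched as a snippet for ~/.bashrc. This is a minimal sketch: the fallback install directory is a hypothetical placeholder, and OLLAMA_HOME is used here only as a convenience variable for building PATH.

```shell
# Hypothetical fallback location; replace it with the directory that holds bin/ollama.
OLLAMA_HOME="${OLLAMA_HOME:-$HOME/programs/ollama-linux-amd64}"
export OLLAMA_HOME

# Prepend the bin directory only if it is not already on PATH.
case ":$PATH:" in
  *":$OLLAMA_HOME/bin:"*) ;;
  *) export PATH="$OLLAMA_HOME/bin:$PATH" ;;
esac

# Report whether the binary can now be resolved.
if command -v ollama >/dev/null 2>&1; then
  ollama -v
else
  echo "ollama not found under $OLLAMA_HOME/bin" >&2
fi
```

The case guard keeps PATH from growing each time the file is sourced again.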

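Importing an LLM

GGUF is a file format for storing LLMs, and it is the format Ollama uses. A model downloaded from Hugging Face is therefore converted to GGUF here before it is imported.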

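Converting a Hugging Face model to GGUF
  1. Download the GGUF conversion code.
git clone https://github.com/ggerganov/llama.cpp.git
  2. Run the conversion from inside the llama.cpp directory to produce a .gguf file. The general form is python convert_hf_to_gguf.py <input_model_path> --outfile <out_gguf_path> --outtype f16. Note that out_gguf_path must end in .gguf.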
python convert_hf_to_gguf.py ../Qwen2.5-7B-Instruct --outfile Qwen2.5-7B-Instruct.gguf --outtype f16

Note: the .gguf file is stored inside the model folder.

  3. Run the model with Ollama.
    First create a Modelfile:
FROM ./Qwen2.5-7B-Instruct.gguf

Without quantization: ollama create MyQwen2.5-7B-Instruct -f ./Modelfile
With quantization: ollama create -q Q4_K_M MyQwen2.5-7B-Instruct -f ./Modelfile

  1. List the models known to Ollama: ollama list
  2. Run the model: ollama run MyQwen2.5-7B-Instruct
  3. Delete the model: ollama rm MyQwen2.5-7B-Instruct
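
The Modelfile and ollama create steps above can be scripted roughly as follows. This is a sketch, not a definitive workflow: it assumes the .gguf file sits in the current directory, and it writes the Modelfile to a scratch directory (a choice made only for this example) with an absolute path in FROM so that the Modelfile location does not matter.

```shell
# Scratch directory so the sketch does not overwrite an existing Modelfile.
workdir="$(mktemp -d)"
model_file="$workdir/Modelfile"
gguf_path="$PWD/Qwen2.5-7B-Instruct.gguf"   # assumed location of the converted weights
model_name="MyQwen2.5-7B-Instruct"

# Write a one-line Modelfile that points at the GGUF weights.
cat > "$model_file" <<EOF
FROM $gguf_path
EOF

# Register the model only if both the ollama binary and the GGUF file are present.
if command -v ollama >/dev/null 2>&1 && [ -f "$gguf_path" ]; then
  ollama create "$model_name" -f "$model_file"
  ollama list
else
  echo "Skipping 'ollama create': ollama or $gguf_path is not available" >&2
fi
```

For the quantized variant, add -q Q4_K_M to the ollama create line, as in the command shown earlier.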

Running on a specific GPU

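Failed attempt (it edits the systemd service file with sudo, which requires root):
Create ./ollama_gpu_selector.sh with the following content:
Reference code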

#!/bin/bash

# Validate input: a comma-separated list of GPU indices between 0 and 4
validate_input() {
    if [[ ! $1 =~ ^[0-4](,[0-4])*$ ]]; then
        echo "Error: Invalid input. Please enter numbers between 0 and 4, separated by commas."
        exit 1
    fi
}

# Update the service file with CUDA_VISIBLE_DEVICES values
update_service() {
    # Check if CUDA_VISIBLE_DEVICES environment variable exists in the service file
    if grep -q '^Environment="CUDA_VISIBLE_DEVICES=' /etc/systemd/system/ollama.service; then
        # Update the existing CUDA_VISIBLE_DEVICES values
        sudo sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES='"$1"'"/' /etc/systemd/system/ollama.service
    else
        # Add a new CUDA_VISIBLE_DEVICES environment variable
        sudo sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES='"$1"'"' /etc/systemd/system/ollama.service
    fi

    # Reload and restart the systemd service
    sudo systemctl daemon-reload
    sudo systemctl restart ollama.service

    echo "Service updated and restarted with CUDA_VISIBLE_DEVICES=$1"
}

# Check if arguments are passed
if [[ "$#" -eq 0 ]]; then
    # Prompt user for CUDA_VISIBLE_DEVICES values if no arguments are passed
    read -p "Enter CUDA_VISIBLE_DEVICES values (0-4, comma-separated): " cuda_values
    validate_input "$cuda_values"
    update_service "$cuda_values"
else
    # Use arguments as CUDA_VISIBLE_DEVICES values
    cuda_values="$1"
    validate_input "$cuda_values"
    update_service "$cuda_values"
fi

Working version:
I don't have root access, so I set the variables directly in .bashrc:

export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES=4

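Then reload .bashrc and restart Ollama. Since ollama serve stays in the foreground, run ollama run from a second terminal: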

source ~/.bashrc
ollama serve
ollama run MyQwen2.5-7B-Instruct

Check which models are currently loaded: ollama ps
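
Editing .bashrc affects every new shell session. As an alternative sketch, the same two variables can be set for a single ollama serve process using the shell's VAR=value command form. The helper names is_gpu_list and serve_on_gpus are hypothetical and exist only for this example.

```shell
# Return 0 when the argument is a comma-separated list of non-negative integers.
is_gpu_list() {
  case "$1" in
    ''|*[!0-9,]*|,*|*,|*,,*) return 1 ;;
    *) return 0 ;;
  esac
}

# Start the server with the GPU restriction applied to that one process only;
# nothing is exported into the calling shell.
serve_on_gpus() {
  if ! is_gpu_list "$1"; then
    echo "Invalid GPU list: $1" >&2
    return 1
  fi
  CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES="$1" ollama serve
}
```

Calling serve_on_gpus 4 would start the server restricted to GPU 4, and ollama ps from another terminal can then confirm that the model is loaded.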

Setting the model storage path

Reference
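
Ollama reads the OLLAMA_MODELS environment variable to decide where model files are stored. A minimal sketch follows, assuming a hypothetical target directory; the variable has to be present in the environment of the ollama serve process, so set it before starting the server.

```shell
# Hypothetical target directory; pick a disk with enough free space.
models_dir="${TMPDIR:-/tmp}/ollama-models-demo"
mkdir -p "$models_dir"

# Export the variable so that a server started from this shell inherits it.
export OLLAMA_MODELS="$models_dir"
echo "Models will be stored in $OLLAMA_MODELS"
```

To make the setting permanent without root access, the export line can go into ~/.bashrc next to the CUDA variables.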

Ollama API
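
While ollama serve is running, it exposes an HTTP API, by default on localhost port 11434. The sketch below sends one non-streaming request to the /api/generate endpoint. It assumes the model name created earlier and an example prompt, and it does not escape the prompt for JSON, so keep quotes and backslashes out of it.

```shell
api_url="http://localhost:11434/api/generate"   # default address of a local server
model_name="MyQwen2.5-7B-Instruct"
prompt="Why is the sky blue?"                   # example prompt, no JSON escaping applied

# Build the JSON request body; "stream": false asks for a single JSON reply.
payload=$(printf '{"model": "%s", "prompt": "%s", "stream": false}' "$model_name" "$prompt")

# Send the request if curl is installed; report a hint if the server is unreachable.
if command -v curl >/dev/null 2>&1; then
  curl -s --connect-timeout 5 "$api_url" -d "$payload" \
    || echo "Request failed: is 'ollama serve' running?" >&2
fi
```

The generated text is returned in the response field of the JSON reply.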