Tutorial: Deploying the DeepSeek Model on the RK3588 Linux Platform

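Contents

1. Download rknn-llm and the DeepSeek model
2. RKLLM-Toolkit installation
- 2.1 Install the miniforge3 tool
- 2.2 Download the miniforge3 installer
- 2.3 Install miniforge3
3. Create the RKLLM-Toolkit Conda environment
- 3.1 Enter the Conda base environment
- 3.2 Create a Conda environment named RKLLM-Toolkit with Python 3.8 (the recommended version)
- 3.3 Enter the RKLLM-Toolkit Conda environment
4. Install RKLLM-Toolkit
5. Convert the DeepSeek-R1-1.5B Hugging Face model to an RKLLM model
6. Run the demo on the RK3588
7. Recommended development boards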

Accumulate, share, and grow, so that both you and others get something out of it! 😄

• Ubuntu 20.04
• Python 3.8
• RK3588 development board

First, the result:
[Screenshot: the demo running]

1. Download rknn-llm and the DeepSeek model

git clone https://github.com/airockchip/rknn-llm.git 
git clone https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Place the rknn-llm files in the following directory:
[Screenshot: directory layout]
Place the DeepSeek model in the examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo directory.

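2. RKLLM-Toolkit installation

2.1 Install the miniforge3 tool

Check whether miniforge3 is installed and which conda version is present. If it is already installed, skip the installation steps in this section.

conda -V
# "conda: command not found" means conda is not installed
# Otherwise the version is printed, for example: conda 23.9.0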
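
The same check can be scripted, which is handy when preparing several build machines. The following is a minimal sketch, not part of the original tutorial; the function name `conda_version` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def conda_version(executable: str = "conda") -> Optional[str]:
    """Return the output of `conda -V`, or None if conda is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        # Same situation as the shell reporting "conda: command not found".
        return None
    result = subprocess.run([path, "-V"], capture_output=True, text=True, check=False)
    # Read stdout first and fall back to stderr in case the version is printed there.
    return (result.stdout or result.stderr).strip() or None


if __name__ == "__main__":
    version = conda_version()
    print(version if version else "conda not found; install miniforge3 first")
```

If the function returns None, continue with sections 2.2 and 2.3; otherwise they can be skipped.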

2.2 Download the miniforge3 installer

wget -c https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh

2.3 Install miniforge3

chmod +x Miniforge3-Linux-x86_64.sh
./Miniforge3-Linux-x86_64.sh

3. Create the RKLLM-Toolkit Conda environment

3.1 Enter the Conda base environment

source ~/miniforge3/bin/activate
# (base) xxx@xxx-pc:~$

3.2 Create a Conda environment named RKLLM-Toolkit with Python 3.8 (the recommended version)

conda create -n RKLLM-Toolkit python=3.8

3.3 Enter the RKLLM-Toolkit Conda environment

conda activate RKLLM-Toolkit
# (RKLLM-Toolkit) xxx@xxx-pc:~$

4. Install RKLLM-Toolkit

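In the RKLLM-Toolkit Conda environment, use pip to install the provided toolchain whl package directly. During installation, pip automatically downloads the dependencies that RKLLM-Toolkit needs.

The whl path points to the file inside the rknn-llm files downloaded earlier.

pip3 install 1.1.4/rkllm-1.1.4/rkllm-toolkit/packages/rkllm_toolkit-1.1.4-cp38-cp38-linux_x86_64.whl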

If the following commands run without errors, the installation succeeded.

(RKLLM-Toolkit) xxx@sys2206:~/temp/SDK$ python
Python 3.8.20 | packaged by conda-forge | (default, Sep 30 2024, 17:52:49)
[GCC 13.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from rkllm.api import RKLLM
INFO: Note: NumExpr detected 64 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO: NumExpr defaulting to 8 threads.
>>>
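
For a non-interactive version of this check, the package can be looked up without importing it. This is a small sketch under the assumption that the toolkit installs a top-level package named `rkllm`, as the import above suggests; `module_available` is a hypothetical helper name.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # "rkllm" is the package used in the session above (from rkllm.api import RKLLM).
    if module_available("rkllm"):
        print("rkllm found; RKLLM-Toolkit appears to be installed")
    else:
        print("rkllm not found; check that the RKLLM-Toolkit environment is active")
```

Finding the package only shows that it is installed in the active environment; the interactive import above remains the more thorough test.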

5. Convert the DeepSeek-R1-1.5B Hugging Face model to an RKLLM model

Write the conversion script transform.py and save it in the DeepSeek-R1-Distill-Qwen-1.5B directory.

from rkllm.api import RKLLM
from datasets import load_dataset
from transformers import AutoTokenizer
from tqdm import tqdm
import torch
from torch import nn
import os
# os.environ['CUDA_VISIBLE_DEVICES']='1'

modelpath = '.'
llm = RKLLM()

# Load model
# Use 'export CUDA_VISIBLE_DEVICES=2' to specify GPU device
# options ['cpu', 'cuda']
ret = llm.load_huggingface(model=modelpath, model_lora = None, device='cpu')
# ret = llm.load_gguf(model = modelpath)
if ret != 0:
    print('Load model failed!')
    exit(ret)

# Build model
dataset = "./data_quant.json"
# JSON file format; remember to include the prompt in the input, like this:
# [{"input":"Human: 你好！\nAssistant: ", "target": "你好！我是人工智能助手KK！"},...]

qparams = None
# qparams = 'gdq.qparams' # Use extra_qparams
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w8a8',
                quantized_algorithm='normal', target_platform='rk3588', num_npu_core=3, extra_qparams=qparams, dataset=dataset)

#ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w8a8',
#                quantized_algorithm='normal', target_platform='rk3576', num_npu_core=2, extra_qparams=qparams, dataset=dataset)

if ret != 0:
    print('Build model failed!')
    exit(ret)

# Evaluate Accuracy
def eval_wikitext(llm):
    seqlen = 512
    tokenizer = AutoTokenizer.from_pretrained(
        modelpath, trust_remote_code=True)
    # Dataset download link:
    # https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1
    testenc = load_dataset(
        "parquet", data_files='./wikitext/wikitext-2-raw-1/test-00000-of-00001.parquet', split='train')
    testenc = tokenizer("\n\n".join(
        testenc['text']), return_tensors="pt").input_ids
    nsamples = testenc.numel() // seqlen
    nlls = []
    for i in tqdm(range(nsamples), desc="eval_wikitext: "):
        batch = testenc[:, (i * seqlen): ((i + 1) * seqlen)]
        inputs = {"input_ids": batch}
        lm_logits = llm.get_logits(inputs)
        if lm_logits is None:
            print("get logits failed!")
            return
        shift_logits = lm_logits[:, :-1, :]
        shift_labels = batch[:, 1:].to(lm_logits.device)
        loss_fct = nn.CrossEntropyLoss().to(lm_logits.device)
        loss = loss_fct(
            shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
        neg_log_likelihood = loss.float() * seqlen
        nlls.append(neg_log_likelihood)
    ppl = torch.exp(torch.stack(nlls).sum() / (nsamples * seqlen))
    print(f'wikitext-2-raw-1-test ppl: {round(ppl.item(), 2)}')

# eval_wikitext(llm)


# Chat with model
messages = "<|im_start|>system You are a helpful assistant.<|im_end|><|im_start|>user你好！\n<|im_end|><|im_start|>assistant"
kwargs = {"max_length": 128, "top_k": 1, "top_p": 0.8,
          "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
# print(llm.chat_model(messages, kwargs))


# Export rkllm model
ret = llm.export_rkllm("./deepseek-r1.rkllm")
if ret != 0:
    print('Export model failed!')
    exit(ret)
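
The optional eval_wikitext function in the script reports perplexity, which is roughly the exponential of the average negative log-likelihood per token. The sketch below illustrates that relationship in plain Python, without torch; the function name `perplexity` and the sample values are illustrative only and do not come from the toolkit.

```python
import math
from typing import Sequence


def perplexity(token_nlls: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-likelihood per token)."""
    if not token_nlls:
        raise ValueError("need at least one token")
    return math.exp(sum(token_nlls) / len(token_nlls))


# A model that assigns probability 0.25 to every token has perplexity 4.
uniform = [-math.log(0.25)] * 8
print(round(perplexity(uniform), 6))
```

A lower value means the quantized model predicts the test text better, so comparing the figure before and after quantization gives a rough sense of accuracy loss.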

Write the quantization calibration dataset data_quant.json and save it in the DeepSeek-R1-Distill-Qwen-1.5B directory.

[{"input":"Human: 你好！\nAssistant: ", "target": "你好！我是人工智能助手！"}]
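
The file above holds a single sample. If you want to add more samples in the same input/target layout, generating the file avoids hand-editing JSON. This is a minimal sketch; `make_sample` and `write_calibration_file` are hypothetical helper names, and the second question/answer pair is a placeholder rather than data from the original article.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple


def make_sample(question: str, answer: str) -> Dict[str, str]:
    """Wrap one question/answer pair in the input/target layout shown above."""
    return {"input": f"Human: {question}\nAssistant: ", "target": answer}


def write_calibration_file(pairs: List[Tuple[str, str]], path: str = "data_quant.json") -> None:
    """Write all pairs to a JSON list that transform.py can use as its dataset."""
    samples = [make_sample(q, a) for q, a in pairs]
    # ensure_ascii=False keeps Chinese text readable in the file.
    Path(path).write_text(json.dumps(samples, ensure_ascii=False, indent=2), encoding="utf-8")


if __name__ == "__main__":
    # Placeholder pairs for illustration only; replace them with your own prompts.
    write_calibration_file([
        ("你好！", "你好！我是人工智能助手！"),
        ("What is the RK3588?", "The RK3588 is a Rockchip SoC with a built-in NPU."),
    ])
```

Run it from the DeepSeek-R1-Distill-Qwen-1.5B directory so the file lands next to transform.py, which loads ./data_quant.json.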

Run the conversion script transform.py

(RKLLM-Toolkit) chris@bestom-Precision-Tower-7910:~/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo$ python transform.py 
INFO: rkllm-toolkit version: 1.1.4 
The argument `trust_remote_code` is to be used with Auto classes. It has no effect here and is ignored. 
Downloading data files: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 7157.52it/s] 
Extracting data files: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 57.08it/s] 
Generating train split: 1 examples [00:00,  2.34 examples/s] 
Optimizing model: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 28/28 [00:40<00:00,  1.44s/it] 
Building model: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 399/399 [00:13<00:00, 30.41it/s] 
WARNING: The bos token has two ids: 151646 and 151643, please ensure that the bos token ids in config.json and tokenizer_config.json are consistent! 
INFO: The token_id of bos is set to 151646 
INFO: The token_id of eos is set to 151643 
INFO: The token_id of pad is set to 151643 
Converting model: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 339/339 [00:00<00:00, 584169.70it/s] 
INFO: Exporting the model, please wait .... 
[=================================================>] 597/597 (100%) 
INFO: Model has been saved to ./deepseek-r1.rkllm! 
(RKLLM-Toolkit) chris@bestom-Precision-Tower-7910:~/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo$

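6. Run the demo on the RK3588

Use DeepSeek-R1-Distill-Qwen-1.5B_Demo for testing and verification.
• DeepSeek-R1-Distill-Qwen-1.5B_Demo code path
cd examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy
• Build DeepSeek-R1-Distill-Qwen-1.5B_Demo

This example builds the Linux version. Download and install the cross-compilation toolchain that the build requires: gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu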

https://developer.arm.com/downloads/-/gnu-a/10-2-2020-11

tar -xf gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu.tar.xz

Edit the build script so that it points to the cross-compilation toolchain path

vi build-linux.sh
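
Before building, it can help to confirm that the toolchain path you enter in the script actually contains the compilers. The sketch below is an assumption-laden check, not part of the original tutorial: `missing_compilers` is a hypothetical helper, and the example prefix is modelled on the compiler path that appears in the author's build log, so adjust it to wherever you extracted the archive.

```python
import os
from typing import List


def missing_compilers(prefix: str) -> List[str]:
    """Return the compiler binaries that do not exist for a toolchain prefix.

    `prefix` is the path up to, but not including, the "-gcc"/"-g++" suffix,
    for example ".../bin/aarch64-none-linux-gnu".
    """
    prefix = os.path.expanduser(prefix)
    candidates = [prefix + "-gcc", prefix + "-g++"]
    return [c for c in candidates if not (os.path.isfile(c) and os.access(c, os.X_OK))]


if __name__ == "__main__":
    # Assumed location based on the build log in this tutorial; change it for your system.
    prefix = "~/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu"
    missing = missing_compilers(prefix)
    print("toolchain OK" if not missing else f"missing: {missing}")
```

An empty result means both the C and C++ cross compilers were found and are executable.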

Run build-linux.sh to start the build

(RKLLM-Toolkit) chris@bestom-Precision-Tower-7910:~/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy$ ./build-linux.sh 
-- The C compiler identification is GNU 10.2.1 
-- The CXX compiler identification is GNU 10.2.1 
-- Check for working C compiler: /home/chris/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gcc 
-- Check for working C compiler: /home/chris/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-gcc -- works 
-- Detecting C compiler ABI info 
-- Detecting C compiler ABI info - done 
-- Detecting C compile features 
-- Detecting C compile features - done 
-- Check for working CXX compiler: /home/chris/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-g++ 
-- Check for working CXX compiler: /home/chris/opts/gcc-arm-10.2-2020.11-x86_64-aarch64-none-linux-gnu/bin/aarch64-none-linux-gnu-g++ -- works 
-- Detecting CXX compiler ABI info 
-- Detecting CXX compiler ABI info - done 
-- Detecting CXX compile features 
-- Detecting CXX compile features - done 
-- Configuring done 
-- Generating done 
-- Build files have been written to: /home/chris/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/build/build_linux_aarch64_Release 
Scanning dependencies of target llm_demo 
[ 50%] Building CXX object CMakeFiles/llm_demo.dir/src/llm_demo.cpp.o 
[100%] Linking CXX executable llm_demo 
[100%] Built target llm_demo 
[100%] Built target llm_demo 
Install the project... 
-- Install configuration: "Release" 
-- Installing: /home/chris/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/install/demo_Linux_aarch64/./llm_demo 
-- Set runtime path of "/home/chris/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/install/demo_Linux_aarch64/./llm_demo" to "" 
-- Installing: /home/chris/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy/install/demo_Linux_aarch64/lib/librkllmrt.so

Package the build output so that it is easy to push to the device

(RKLLM-Toolkit) chris@bestom-Precision-Tower-7910:~/Projects/DeepSeekDemo/examples/DeepSeek-R1-Distill-Qwen-1.5B_Demo/deploy$ tar -zcvf install/demo_Linux_aarch64.tar.gz install/demo_Linux_aarch64/

• Run llm_demo

# push deepseek-r1.rkllm to device 
C:\Users\king>adb push E:\lhj_files\deepSeekDemo\deepseek-r1.rkllm data/ 
# push install dir to device 
C:\Users\king>adb push E:\lhj_files\deepSeekDemo\demo_Linux_aarch64.tar.gz data/ 
# Unzip the demo 
C:\Users\king>adb shell 
root@linaro-alip:/# cd data 
root@linaro-alip:/data# tar -zxvf demo_Linux_aarch64.tar.gz 
root@linaro-alip:/data# cd install/demo_Linux_aarch64/ 
# Run Demo 
root@linaro-alip:/data/install/demo_Linux_aarch64# export LD_LIBRARY_PATH=./lib 
root@linaro-alip:/data/install/demo_Linux_aarch64# taskset f0 ./llm_demo /data/deepseek-r1.rkllm  2048 4096 
 
# Running result                                                           
rkllm init start 
rkllm init success
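
A note on the run command: `taskset f0` restricts llm_demo to the CPUs selected by the hexadecimal mask f0. The sketch below decodes such a mask; `cpus_from_mask` is a hypothetical helper name. On the RK3588, CPUs 4 to 7 are normally the Cortex-A76 big cores, but treat that numbering as an assumption and confirm it on your board. The trailing arguments 2048 and 4096 appear to be the maximum number of new tokens and the maximum context length, which is also an assumption worth checking against the usage message in llm_demo.cpp.

```python
from typing import List


def cpus_from_mask(hex_mask: str) -> List[int]:
    """Return the CPU indices selected by a taskset-style hexadecimal mask."""
    value = int(hex_mask, 16)
    # Bit N of the mask corresponds to CPU N.
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


print(cpus_from_mask("f0"))  # [4, 5, 6, 7]
```

The same helper shows, for example, that a mask of 0f would select CPUs 0 to 3 instead.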