Serving a Local Qwen LLM Through an API Built with FastAPI

Contents

- Introduction
- Hands-on
  - Building the API
  - Calling the API
    - curl
    - The requests library
- Results
- References

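Introduction

Hands-on

Download the Qwen-7B chat model with modelscope and deploy it as an online API using FastAPI.
Multi-turn conversation works by sending the history of earlier question-and-answer turns along with each new request.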
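
To make the role of history concrete, here is a minimal sketch of how earlier turns can be folded into a single ChatML-style prompt. The helper `build_chatml_prompt` is hypothetical and written only for illustration; it is not Qwen's own code, which also handles tokenization and truncation, so the real prompt may differ in detail.

```python
from typing import Sequence


def build_chatml_prompt(system: str, history: Sequence[Sequence[str]], query: str) -> str:
    """Illustrative ChatML-style prompt builder (hypothetical helper, not Qwen's code)."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>"]
    # Each history entry is one finished turn: (user query, assistant response).
    for user_text, assistant_text in history:
        parts.append(f"<|im_start|>user\n{user_text}<|im_end|>")
        parts.append(f"<|im_start|>assistant\n{assistant_text}<|im_end|>")
    # The new query goes last, followed by an open assistant turn for the model to complete.
    parts.append(f"<|im_start|>user\n{query}<|im_end|>")
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)


history = [("Hello, I am Xiao Ming and I am 18 years old.", "Hello, I am Qwen!")]
prompt = build_chatml_prompt("You are a helpful assistant.", history,
                             "Do you know my name and age?")
print(prompt)
```

Because the earlier turn is part of the prompt, the model can answer the follow-up question from context; the server itself keeps no state between requests.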

Building the API

import os

# Pin the process to GPU 0; set this before any CUDA library initializes.
os.environ['CUDA_VISIBLE_DEVICES'] = "0"

import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel
from modelscope import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

app = FastAPI()

class Query(BaseModel):
    text: str
    history: list = []

model_name = "qwen/Qwen-7B-Chat"


@app.post("/chat/")
async def chat(query: Query):
    global model, tokenizer  # Module-level model and tokenizer, loaded in the __main__ block

    # Qwen's model.chat() assembles the full prompt itself (system prompt, history,
    # and the new query), so the raw user text is passed in directly. Running it
    # through tokenizer.apply_chat_template() first would wrap it in a second template.
    response, history = model.chat(
        tokenizer,
        query.text,
        history=query.history,
        system="You are a helpful assistant.",
        max_length=2048,  # generation length limit, hard-coded to 2048
        top_p=0.7,  # nucleus sampling threshold, hard-coded to 0.7
        temperature=0.95  # sampling temperature, hard-coded to 0.95
    )

    return {
        "result": response,
        "history": history
    }


# Entry point
if __name__ == '__main__':
    # Load the pretrained tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_name,
                                     device_map="auto",
                                     trust_remote_code=True)

    model.generation_config = GenerationConfig.from_pretrained(model_name, trust_remote_code=True)

    model.eval()  # Put the model in evaluation mode
    # Start the FastAPI application
    uvicorn.run(app, host='0.0.0.0', port=6006, workers=1)

Calling the API

class Query(BaseModel):
    text: str
    history: list = []

The Query class defines the names of the parameters a request must send: text and history.
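
As a small sketch of what that request body looks like on the wire, the snippet below serializes a payload with the two fields and decodes it again. The variable names are illustrative only. One detail worth noticing: JSON has no tuple type, so (query, response) pairs come back as two-element lists.

```python
import json

# One earlier turn, stored as a (query, response) pair.
history = [("Hello, I am Xiao Ming and I am 18 years old.", "Hello, I am Qwen!")]

# The field names must match the Query model: text and history.
payload = {"text": "Do you know my name and age?", "history": history}
body = json.dumps(payload, ensure_ascii=False)

# Decoding shows what the server receives: tuples have become lists.
decoded = json.loads(body)
print(decoded["history"])
```

This is why the history returned by the endpoint can be sent straight back in the next request without any conversion.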

curl

curl -X POST "http://127.0.0.1:6006/chat/" \
     -H 'Content-Type: application/json' \
     -d '{"text": "Do you know my name and age?", "history": [["Hello, I am Xiao Ming and I am 18 years old.", "Hello, I am Qwen!"]]}'


The requests library

Send the parameters in a POST request with requests:

import requests


def get_completion(prompt, history=None):
    # Send an empty list when there is no history: the server's Query model
    # declares history as a list, so a JSON null would fail validation.
    data = {
        "text": prompt,
        "history": history or []
    }
    # json= serializes the body and sets the Content-Type: application/json header
    response = requests.post(
        url='http://127.0.0.1:6006/chat/',
        json=data)
    response.raise_for_status()
    d = response.json()
    result, history = d['result'], d['history']
    return result, history


history = []
while True:
    key = input('>')
    if key == 'q':  # type q to quit
        break
    result, history = get_completion(key, history)
    print(result)
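
The interactive loop above threads the returned history into each following call. As a sketch of the same pattern without keyboard input, the hypothetical helper `run_conversation` below takes a list of prompts and any function with the same shape as get_completion. The `fake_send` backend is a stand-in so the example runs without a server or a model.

```python
from typing import Callable, List, Tuple

History = List[List[str]]
SendFn = Callable[[str, History], Tuple[str, History]]


def run_conversation(prompts: List[str], send: SendFn) -> List[str]:
    """Send prompts in order, feeding each returned history into the next call."""
    history: History = []
    replies = []
    for prompt in prompts:
        reply, history = send(prompt, history)
        replies.append(reply)
    return replies


def fake_send(prompt: str, history: History) -> Tuple[str, History]:
    """Stand-in backend (an assumption for this sketch): numbers each turn it receives."""
    reply = f"turn {len(history) + 1}: {prompt}"
    return reply, history + [[prompt, reply]]


replies = run_conversation(["hi", "what did I just say?"], fake_send)
print(replies)
```

With the API server running, passing get_completion in place of fake_send should drive the real model the same way.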

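Results

In the multi-turn conversation, Qwen-7B remembered the name and age I had given earlier. It also showed basic logical reasoning, inferring my birth year from my age.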

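References

Building an LLM API service with FastAPI (利用FastAPI构建大模型接口服务)
From zero: deploying the Qwen LLM in the cloud with API access, a hands-on AutoDL deployment tutorial (0基础！在云上部署Qwen大模型，实现API调用！AutoDL部署大模型实操教程！)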


