Open-Source Models in Practice: Trying Out the CodeQwen Model, a First Taste (Part 1)

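1. Preface

Code expert models are AI systems that analyze and learn from large code bases, picking up common coding patterns and best practices. Such a model can offer accurate, efficient code suggestions that help developers avoid common mistakes and pitfalls as they write code.

By adopting a code expert model, developers gain efficient, accurate, and personalized coding support. This improves productivity and streamlines the software development workflow across different technical environments. It also frees developers to focus on more creative programming tasks, which in turn drives innovation and progress in software development.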

2. Terminology

2.1. CodeQwen1.5

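CodeQwen1.5 is a 7B-parameter model initialized from the Qwen language model. It uses the GQA (grouped-query attention) architecture, was pretrained on roughly 3T tokens of code-related data, supports 92 programming languages, and accepts context inputs of up to 64K tokens. In terms of results, CodeQwen1.5 shows strong code generation, long-sequence modeling, code modification, and SQL capabilities. It can considerably improve developer productivity and streamline software development workflows across different technical environments.

CodeQwen is a foundational coder

Code generation is one of the key capabilities of a large language model: the model is expected to turn natural-language instructions into precise, executable code. With only 7 billion parameters, CodeQwen1.5 already surpasses larger models in basic code generation, further narrowing the gap in coding ability between open-source code LLMs and GPT-4.

CodeQwen is a long-sequence coder

Long-sequence capability is crucial for a code model: it is the core ability needed to understand repository-level code and to act as a code agent. Current code models still offer very limited support for long inputs, which holds back their practical potential. CodeQwen1.5 aims to advance long-sequence modeling in open-source code models: the Qwen team collected and constructed repository-level long-sequence code data for pretraining and, through careful data mixing and organization, enabled the model to support inputs of up to 64K tokens.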
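
To make the 64K limit concrete, the following minimal sketch checks whether a prompt plus the requested generation budget fits in the context window. It assumes that "64K" means 65,536 tokens and that the window is shared by the prompt and the generated tokens; `fits_context` and `max_prompt_tokens` are hypothetical helpers, not part of any library.

```python
# Assumption: "64K" is taken to mean 65,536 tokens, shared by prompt and output.
CONTEXT_LIMIT = 64 * 1024


def fits_context(prompt_tokens: int, max_new_tokens: int, limit: int = CONTEXT_LIMIT) -> bool:
    """Return True if the prompt plus the generation budget fits in the window."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= limit


def max_prompt_tokens(max_new_tokens: int, limit: int = CONTEXT_LIMIT) -> int:
    """Largest prompt length that still leaves room for max_new_tokens."""
    return max(limit - max_new_tokens, 0)


print(fits_context(60000, 8192))
print(max_prompt_tokens(8192))
```

Under these assumptions, reserving 8,192 tokens for generation leaves 57,344 tokens for the prompt.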

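CodeQwen is an excellent code modifier

A good code assistant not only generates code from instructions, but can also modify existing code or fix bugs in response to new requirements.

CodeQwen is an outstanding SQL expert

CodeQwen1.5 can act as an intelligent SQL expert, bridging the gap between non-programmers and efficient data interaction. It lets users without programming expertise query databases in natural language, easing the steep learning curve associated with SQL.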
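
As an illustration of the text-to-SQL use case, a prompt can pair a table schema with a natural-language question. The sketch below only builds the chat messages in the role/content format that chat templates consume; the `orders` schema, the question, and the helper `build_sql_messages` are hypothetical, and no model answer is shown.

```python
from typing import Dict, List


def build_sql_messages(schema: str, question: str) -> List[Dict[str, str]]:
    """Build chat messages asking the model to translate a question into SQL."""
    user_content = (
        "Given the following table schema:\n"
        f"{schema.strip()}\n\n"
        f"Write a SQL query that answers: {question.strip()}"
    )
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_content},
    ]


# Hypothetical schema and question, for illustration only.
schema = """
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer TEXT,
    amount REAL,
    created_at TEXT
);
"""
messages = build_sql_messages(schema, "What is the total order amount per customer?")
print(messages[1]["content"])
```

The resulting `messages` list can be handed to `tokenizer.apply_chat_template` in the same way as the prompts in the usage section.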

2.2. CodeQwen1.5-7B-Chat

CodeQwen1.5 is the code-specific version of Qwen1.5. It is a transformer-based, decoder-only language model pretrained on a large amount of code data.

Strong code generation capabilities and competitive performance across a series of benchmarks;
Supporting long context understanding and generation with a context length of 64K tokens;
Supporting 92 coding languages;
Excellent performance in text-to-SQL, bug fixing, etc.

3. Prerequisites

3.1. Base environment

Operating system: CentOS 7

GPU: Tesla V100-SXM2-32GB, CUDA Version: 12.2

3.2. Download the model

Hugging Face:

https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat/tree/main

ModelScope:

git clone https://www.modelscope.cn/qwen/CodeQwen1.5-7B-Chat.git

Note:

1. Choose the model variant that suits your situation

3.3. Update the transformers library

pip install --upgrade transformers==4.38.1
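
The Qwen1.5 model cards advise `transformers>=4.37.0`, since older releases fail with `KeyError: 'qwen2'`. A minimal sketch of a version check, assuming plain `major.minor.patch` version strings; `version_tuple` and `is_supported` are hypothetical helpers:

```python
import re

MIN_VERSION = "4.37.0"  # minimum advised by the Qwen1.5 model cards


def version_tuple(version: str) -> tuple:
    """Parse the leading 'major.minor[.patch]' digits, ignoring suffixes like '.dev0'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_supported(installed: str, minimum: str = MIN_VERSION) -> bool:
    """Return True if the installed version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


print(is_supported("4.38.1"))  # the version pinned in the pip command
```

In a real script, the installed version string would come from `transformers.__version__`.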

4. Usage

4.1. Code generation

# -*- coding: utf-8 -*-
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

device = "cuda" 

modelPath='/model/CodeQwen1.5-7B-Chat'

def loadTokenizer():
    # print("loadTokenizer: ", modelPath)
    tokenizer = AutoTokenizer.from_pretrained(modelPath)
    return tokenizer

def loadModel(config):
    print("loadModel: ",modelPath)
    model = AutoModelForCausalLM.from_pretrained(
        modelPath,
        torch_dtype="auto",
        device_map="auto"
    )
    model.generation_config = config
    return model


if __name__ == '__main__':
    prompt = "Write an example of the bubble sort algorithm in Python"
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]

    config = GenerationConfig.from_pretrained(modelPath, top_p=0.9, temperature=0.7, repetition_penalty=1.1,
                                              do_sample=True, max_new_tokens=8192)
    tokenizer = loadTokenizer()
    model = loadModel(config)

    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    # Unpack the tokenizer output so attention_mask is passed along with input_ids
    generated_ids = model.generate(
        **model_inputs
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)
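
For a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, which is why the script slices each output at `len(input_ids)` before decoding. The sketch below reproduces just that slicing step on plain Python lists with made-up token ids, so it runs without a model; the ids and the name `strip_prompt` are illustrative only.

```python
def strip_prompt(input_ids_batch, output_ids_batch):
    """Drop the echoed prompt tokens from the front of each generated sequence."""
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_ids_batch, output_ids_batch)
    ]


# Made-up token ids: the first three entries of the output repeat the prompt.
prompt_batch = [[101, 7, 42]]
output_batch = [[101, 7, 42, 900, 901, 2]]
print(strip_prompt(prompt_batch, output_batch))  # [[900, 901, 2]]
```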

Output:

Running the model-generated code in IDEA

Conclusion:

The model can generate runnable code from a stated requirement
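
The model's actual output is not reproduced here. For comparison, a correct answer to the prompt could look like the following hand-written sketch (not model output), which sorts in ascending order and stops early once a pass makes no swaps:

```python
def bubble_sort(nums):
    """Sort a list in ascending order in place and return it."""
    n = len(nums)
    for i in range(n):
        swapped = False
        # After i passes, the last i elements are already in their final place.
        for j in range(0, n - i - 1):
            if nums[j] > nums[j + 1]:
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass means the list is sorted
    return nums


print(bubble_sort([64, 34, 25, 12, 22, 11, 90]))  # [11, 12, 22, 25, 34, 64, 90]
```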

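4.2. Code modification

Example description:

A correct bubble sort is deliberately broken before being handed to the model. The stated exception is `UnboundLocalError: local variable 'j' referenced before assignment`, although the snippet embedded in the prompt below only flips the comparison (`<` instead of `>`), which raises no exception and instead sorts the list in descending order.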

# -*- coding: utf-8 -*-
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

device = "cuda" 

modelPath='/model/CodeQwen1.5-7B-Chat'

def loadTokenizer():
    # print("loadTokenizer: ", modelPath)
    tokenizer = AutoTokenizer.from_pretrained(modelPath)
    return tokenizer

def loadModel(config):
    # print("loadModel: ",modelPath)
    model = AutoModelForCausalLM.from_pretrained(
        modelPath,
        torch_dtype="auto",
        device_map="auto"
    )
    model.generation_config = config
    return model


if __name__ == '__main__':
    prompt = '''
I wrote a bubble sort example in Python, but the result is not what I expected. Please fix it. The code is as follows:
def bubble_sort(nums):
    n = len(nums)
    for i in range(n):
        for j in range(0, n-i-1):
            if nums[j] < nums[j+1]:
                nums[j], nums[j+1] = nums[j+1], nums[j]
    return nums

if __name__ == "__main__":
    unsorted_list = [64, 34, 25, 12, 22, 11, 90]
    print("Original list:", unsorted_list)
    sorted_list = bubble_sort(unsorted_list)
    print("Sorted list:", sorted_list)
'''
    
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt}
    ]

    config = GenerationConfig.from_pretrained(modelPath, top_p=0.9, temperature=0.7, repetition_penalty=1.1,
                                              do_sample=True, max_new_tokens=8192)
    tokenizer = loadTokenizer()
    model = loadModel(config)

    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(device)

    # Unpack the tokenizer output so attention_mask is passed along with input_ids
    generated_ids = model.generate(
        **model_inputs
    )
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    print(response)
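
Independently of what the model returns, the defect in the snippet embedded in the prompt can be checked by hand: as written, the comparison `nums[j] < nums[j+1]` makes the function sort in descending order. A minimal sketch contrasting the two comparisons (the `descending` flag is added here purely for illustration and is not part of the original snippet):

```python
def bubble_sort(nums, descending=False):
    """Bubble sort on a copy; descending=True reproduces the flipped comparison."""
    nums = list(nums)
    n = len(nums)
    for i in range(n):
        for j in range(0, n - i - 1):
            if descending:
                out_of_order = nums[j] < nums[j + 1]  # comparison used in the prompt
            else:
                out_of_order = nums[j] > nums[j + 1]  # comparison for ascending order
            if out_of_order:
                nums[j], nums[j + 1] = nums[j + 1], nums[j]
    return nums


data = [64, 34, 25, 12, 22, 11, 90]
print(bubble_sort(data, descending=True))  # [90, 64, 34, 25, 22, 12, 11]
print(bubble_sort(data))                   # [11, 12, 22, 25, 34, 64, 90]
```

A fix returned by the model can be judged against the second result.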

Output: