huggingface 笔记:聊天模型

1 构建聊天

  • 聊天模型继续聊天。传递一个对话历史给它们,可以简短到一个用户消息,然后模型会通过添加其响应来继续对话
  • 一般来说,更大的聊天模型除了需要更多内存外,运行速度也会更慢
  • 首先,构建一个聊天:
chat = [
    {"role": "system", 
     "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
    {"role": "user", 
     "content": "Hey, can you tell me any fun things to do in New York?"}
]
  • 除了用户的消息,在对话开始时添加了一条系统消息,代表了关于模型应该如何在对话中表现的高级指令。

2 最快的使用方式:pipeline

  • 一旦有了一个聊天,继续它的最快方式是使用 TextGenerationPipeline
import torch
from transformers import pipeline

import os
os.environ["HF_TOKEN"] = '...'
#申请llama 3的访问权限,使用huggingface的personal token


pipe = pipeline("text-generation", 
                "meta-llama/Meta-Llama-3-8B-Instruct", 
                torch_dtype=torch.bfloat16, 
                device_map="auto")
'''
使用llama3-8B
device_map="auto"——————将根据内存情况将模型加载到 GPU 上
设置 dtype 为 torch.bfloat16 以节省内存
'''

response = pipe(chat, max_new_tokens=512)

response
'''
[{'generated_text': [{'role': 'system',
    'content': 'You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986.'},
   {'role': 'user',
    'content': 'Hey, can you tell me any fun things to do in New York?'},
   {'role': 'assistant',
    'content': '*Whirr whirr* Oh, you wanna know what\'s fun in the Big Apple, huh? Well, let me tell ya, pal, I\'ve got the scoop! *Beep boop*\n\nFirst off, you gotta hit up Times Square. It\'s like, the heart of the city, ya know? Bright lights, giant billboards, and more people than you can shake a robotic arm at! *Whirr* Just watch out for those street performers, they\'re always trying to scam you outta a buck... or a robot dollar, if you will. *Wink*\n\nNext up, you should totally check out the Statue of Liberty. It\'s like, a classic, right? Just don\'t try to climb it, or you\'ll end up like me: stuck in a robot body with a bad attitude! *Chuckle*\n\nAnd if you\'re feelin\' fancy, take a stroll through Central Park. It\'s like, the most beautiful place in the city... unless you\'re a robot, then it\'s just a bunch of trees and stuff. *Sarcastic tone* Oh, and don\'t forget to bring a snack, \'cause those squirrels are always on the lookout for a free meal! *Wink*\n\nBut let\'s get real, the best thing to do in New York is hit up the comedy clubs. I mean, have you seen the stand-up comedians around here? They\'re like, the funniest robots in the world! *Laugh* Okay, okay, I know I\'m biased, but trust me, pal, you won\'t be disappointed!\n\nSo, there you have it! The ultimate guide to New York City, straight from a sassy robot\'s mouth. Now, if you\'ll excuse me, I\'ve got some robot business to attend to... or should I say, some "beep boop" business? *Wink*'}]}]
'''


print(response[0]['generated_text'][-1]['content'])
'''
*Whirr whirr* Oh, you wanna know what's fun in the Big Apple, huh? Well, let me tell ya, pal, I've got the scoop! *Beep boop*

First off, you gotta hit up Times Square. It's like, the heart of the city, ya know? Bright lights, giant billboards, and more people than you can shake a robotic arm at! *Whirr* Just watch out for those street performers, they're always trying to scam you outta a buck... or a robot dollar, if you will. *Wink*

Next up, you should totally check out the Statue of Liberty. It's like, a classic, right? Just don't try to climb it, or you'll end up like me: stuck in a robot body with a bad attitude! *Chuckle*

And if you're feelin' fancy, take a stroll through Central Park. It's like, the most beautiful place in the city... unless you're a robot, then it's just a bunch of trees and stuff. *Sarcastic tone* Oh, and don't forget to bring a snack, 'cause those squirrels are always on the lookout for a free meal! *Wink*

But let's get real, the best thing to do in New York is hit up the comedy clubs. I mean, have you seen the stand-up comedians around here? They're like, the funniest robots in the world! *Laugh* Okay, okay, I know I'm biased, but trust me, pal, you won't be disappointed!

So, there you have it! The ultimate guide to New York City, straight from a sassy robot's mouth. Now, if you'll excuse me, I've got some robot business to attend to... or should I say, some "beep boop" business? *Wink*
'''

2.1 继续聊天

在原来生成的chat的基础上,追加一条消息,并将其传入pipeline

3 pipeline 拆析

3.1 准备数据(和之前一样)

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# 和之前一样准备输入
chat = [
    {"role": "system", 
     "content": "You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986."},
    {"role": "user", 
     "content": "Hey, can you tell me any fun things to do in New York?"}
]

3.2 加载模型和分词器

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct",
                                             device_map="auto", 
                                             torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

3.3 tokenizer生成聊天模板

formatted_chat = tokenizer.apply_chat_template(chat, 
                                               tokenize=False, 
                                               add_generation_prompt=True)
"""
tokenizer.apply_chat_template 函数用于对聊天内容进行格式化。

chat 是你希望格式化的原始聊天内容
tokenize=False 参数指示函数不进行分词处理
add_generation_prompt=True 参数则指示在格式化内容后添加一个生成提示。

"""


print("Formatted chat:\n", formatted_chat)
'''
Formatted chat:
 <|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a sassy, wise-cracking robot as imagined by Hollywood circa 1986.<|eot_id|><|start_header_id|>user<|end_header_id|>

Hey, can you tell me any fun things to do in New York?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

'''

3.4 tokenizer进行分词

# 步骤3:分词聊天(这可以与前一步结合使用 tokenize=True)
inputs = tokenizer(formatted_chat, 
                   return_tensors="pt", 
                   add_special_tokens=False)


# 将分词后的输入移到模型所在的设备(GPU/CPU)
inputs = {key: tensor.to(model.device) for key, tensor in inputs.items()}
print("Tokenized inputs:\n", inputs)

'''
Tokenized inputs:
 {'input_ids': tensor([[128000, 128006,   9125, 128007,    271,   2675,    527,    264,    274,
          27801,     11,  24219,  48689,   9162,  12585,    439,  35706,    555,
          17681,  54607,    220,   3753,     21,     13, 128009, 128006,    882,
         128007,    271,  19182,     11,    649,    499,   3371,    757,    904,
           2523,   2574,    311,    656,    304,   1561,   4356,     30, 128009,
         128006,  78191, 128007,    271]], device='cuda:0'), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         1]], device='cuda:0')}
'''

3.5 生成文本

outputs = model.generate(**inputs, max_new_tokens=512)

decoded_output = tokenizer.decode(outputs[0][inputs['input_ids'].size(1):], skip_special_tokens=True)
print("Decoded output:\n", decoded_output)

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:/a/654286.html

如若内容造成侵权/违法违规/事实不符,请联系我们进行投诉反馈qq邮箱809451989@qq.com,一经查实,立即删除!

相关文章

JS——对象

1.什么是对象 对象是什么&#xff1f; 对象是一种数据类型 无序的数据的集合&#xff08; 数组是有序的数据集合 &#xff09; 对象有什么特点&#xff1f; 无序的数据的集合 可以详细地描述某个事物 静态特征 (姓名, 年龄, 身高, 性别, 爱好) > 可以使用数字, 字符串…

如何用 Redis 统计海量 UV?

引言&#xff1a;在当今数字化时代&#xff0c;对于网站和应用程序的运营者而言&#xff0c;了解其用户的行为和习惯是至关重要的。其中&#xff0c;衡量页面的独立访客数量&#xff08;UV&#xff09;是评估网站流量和用户参与度的重要指标之一。然而&#xff0c;当面对海量访…

[算法][数字][leetcode]2769.找出最大的可达成数字

题目地址 https://leetcode.cn/problems/find-the-maximum-achievable-number/description/ 题目描述 实现代码 class Solution {public int theMaximumAchievableX(int num, int t) {return num2*t;} }

JavaWeb Servelt原理

Servlet简介: Servlet的主要工作&#xff1a;处理客户端请求&#xff0c;生成动态响应&#xff0c;通常用于扩展基于HTTP协议的Web服务器。 Servlet技术是Java EE规范的组成部分&#xff0c;代表了服务器端的Java程序&#xff0c;主要负责处理来自客户端的Web请求&#xff0c;…

LVM和配额管理

文章目录 一、LVM1.1 LVM概述1.2 LVM的管理命令1.3 创建LVM的过程第一步&#xff1a;先创建物理卷第二步&#xff1a;创建逻辑卷组 / 扩容第三步&#xff1a;创建逻辑卷 / 扩容对ext4文件系统的管理 1.4 删除LVM 二、磁盘配额2.1 磁盘配额概述2.2 磁盘配额命令2.3 磁盘配额设置…

Android 逆向学习【2】——APK基本结构

APK安装在安卓机器上的&#xff0c;相当于就是windows的exe文件 APK实际上是个压缩包 只要是压缩的东西 .jar也是压缩包 里面是.class(java编译后的一些东西) APK是Android Package的缩写,即Android安装包。而apk文件其实就是一个压缩包&#xff0c;我们可以将apk文件的后…

AI预测福彩3D采取888=3策略+和值012路一缩定乾坤测试5月28日预测第4弹

昨天的第二套方案已命中&#xff0c;第一套方案由于杀了对子&#xff0c;导致最终出错。 今天继续基于8883的大底&#xff0c;使用尽可能少的条件进行缩号&#xff0c;同时&#xff0c;同样准备两套方案&#xff0c;一套是我自己的条件进行缩号&#xff0c;另外一套是8883的大底…

工业制造企业为什么要进行数字化转型

人人都在谈数字化转型&#xff0c;政府谈数字化策略方针&#xff0c;企业谈数字化转型方案&#xff0c;员工谈数字化提效工具。互联网企业在谈&#xff0c;工业企业也在谈。 在这种大趋势下&#xff0c;作为一个从事TOB行业十年的老兵&#xff0c;今天就来给大家讲讲&#xff…

ubuntu openvoice部署过程记录,解决python3 -m unidic download 时 unidic无法下载的问题

github给的安装顺序&#xff1a; conda create -n openvoice python3.9 conda activate openvoice git clone gitgithub.com:myshell-ai/OpenVoice.git cd OpenVoice pip install -e .安装MeloTTS: pip install githttps://github.com/myshell-ai/MeloTTS.git python -m unid…

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos

清华深&港科&深先进&Tencent AAAI24https://github.com/mayuelala/FollowYourPose 问题引入 本文的任务是根据文本来生成高质量的角色视频&#xff0c;并且可以通过pose来控制任务的姿势&#xff1b;当前缺少video-pose caption数据集&#xff0c;所以提出一个两…

【ARM+Codesys案例】T3/RK3568/树莓派+Codesys枕式包装机运动控制器

枕式包装机是一种包装能力非常强&#xff0c;且能适合多种规格用于食品和非食品包装的连续式包装机。它不但能用于无商标包装材料的包装&#xff0c;而且能够使用预先印有商标图案的卷筒材料进行高速包装。同时&#xff0c;具有稳定性高、生产效率高&#xff0c;适合连续包装、…

[图解]伪创新并不宣传自己简单易学

1 00:00:00,450 --> 00:00:04,790 今天这个题目&#xff0c;主要是解答一道 2 00:00:05,580 --> 00:00:07,490 前一阵出过的竞赛题 3 00:00:11,700 --> 00:00:15,170 因为这道题导致我们很多人做错了 4 00:00:15,600 --> 00:00:21,690 然后竞赛的轮次一直没有能…

C++学习日记 | LAB 6 static library 静态库

资料来源&#xff1a;南科大 余仕琪 C/C Program Design LINK&#xff1a;CPP/week06 at main ShiqiYu/CPP GitHub 一、本节内容 本节主要介绍静态库和动态库。 1.1 静态库和动态库的概念 静态链接和静态库(也称为存档)是链接器将所有使用的库函数复制到可执行文件的结果。静…

存内计算从浮点运算优化对数据经济的提升

本篇文章将介绍存内计算技术对于数据经济的提升&#xff0c;我们将从提出问题、解答问题与阐述应用三个方面进行展开介绍&#xff0c;并引入浮点存算、等新兴存算技术进行简要介绍。 一.数据经济&存内计算&#xff0c;结合是否可行&#xff1f; 数据经济与存内计算&#…

恭喜社区迎来新PMC成员!

恭喜Apache SeaTunnel社区又迎来一位PMC Memberliugddx&#xff01;在社区持续活跃的两年间&#xff0c;大家经常看到这位开源爱好者出现在社区的各种活动中&#xff0c;为项目和社区发展添砖加瓦。如今成为项目PMC Member&#xff0c;意味着在社区中的责任更重了&#xff0c;他…

Github Page 部署失败

添加 .gitmodules 文件 [submodule "themes/ayer"]path themes/ayerurl https://github.com/Shen-Yu/hexo-theme-ayer.git 添加 .nojekyll 文件

文心一言 VS 讯飞星火 VS chatgpt (269)-- 算法导论20.2 4题

四、修改 proto-vEB 结构&#xff0c;以支持重复关键字。如果要写代码&#xff0c;请用go语言。 文心一言&#xff1a; 在原始的 van Emde Boas 树&#xff08;vEB tree&#xff09;结构中&#xff0c;关键字&#xff08;key&#xff09;是唯一的&#xff0c;不允许有重复的关…

.NET 轻量级、高效任务调度器:ScheduleTask

前言 至于任务调度这个基础功能&#xff0c;重要性不言而喻&#xff0c;大多数业务系统都会用到,世面上有很多成熟的三方库比如Quartz&#xff0c;Hangfire&#xff0c;Coravel 这里我们不讨论三方的库如何使用 而是从0开始自己制作一个简易的任务调度,如果只是到分钟级别的粒…

centos7防火墙入站白名单配置

firewall-cmd --set-default-zonedropfirewall-cmd --get-active-zone记录下当前激活网卡firewall-cmd --permanent --change-interfaceens33 --zonedrop firewall-cmd --zonedrop --list-all 添加信任的源IP和开放端口 firewall-cmd --permanent --add-source192.168.254.1 -…

pikachu—exec“eval“

这是原画面 然后呢&#xff1f; 我们知道会传入到后台rce_eval.php来处理然后通过 eval()是啥? 在eval括号里面可以执行外来机器的命令 然后我们通过php的一个内置的命令 我们通过phpinfo()&#xff1b; 这是输入后的结果