Building a RAG system with Elastic and Apple's OpenELM models

Author: Gustavo Llermaly, Elastic

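How to deploy and test Apple's new models and build a RAG system with Elastic.

In this article, we will learn how to deploy and test the new Apple models and build a RAG system that emulates Apple Intelligence, using Elastic as the vector database and OpenELM as the model provider.

A notebook containing the complete exercise accompanies this article.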

Introduction

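In April 2024, Apple released its Open Efficient Language Models (OpenELM) in four sizes: 270 million, 450 million, 1.1 billion, and 3 billion parameters, with chat and instruct versions. Larger models are generally better at complex tasks but are slower and consume more resources, while smaller models are faster and less demanding. The right choice depends on the problem we want to solve.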
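To make the size trade-off concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy for each size. It assumes the nominal parameter counts and 2 bytes per parameter (16-bit weights); both are assumptions for illustration, and real usage is higher once activations and runtime overhead are included.

```python
# Nominal parameter counts for the four OpenELM sizes (assumed round figures).
OPENELM_PARAMS = {
    "OpenELM-270M": 270_000_000,
    "OpenELM-450M": 450_000_000,
    "OpenELM-1_1B": 1_100_000_000,
    "OpenELM-3B": 3_000_000_000,
}


def weight_memory_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights only.

    Assumption: 2 bytes per parameter (16-bit weights) unless overridden.
    """
    return num_params * bytes_per_param / 1024**3


for name, params in OPENELM_PARAMS.items():
    print(f"{name}: ~{weight_memory_gib(params):.2f} GiB of weights")
```

Passing `bytes_per_param=4` gives the corresponding estimate for 32-bit weights.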

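The creators emphasize the models' relevance from a research perspective: they provide everything needed to train the models, and in some cases they show that their models achieve higher performance than competitors with fewer parameters.

What makes these models notable is their transparency. Everything needed to reproduce them is open, unlike models that provide only the weights and inference code and are pre-trained on private datasets.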

Source: https://arxiv.org/abs/2404.14619

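The framework used to build and train these models (CoreNet) is also available.

One advantage of the OpenELM models is that they can be converted to MLX, a deep learning framework optimized for devices with Apple Silicon processors, so they can benefit from this technology by training local models for those devices.

Apple recently released new iPhones, and one of the new features is Apple Intelligence, which uses AI to help with tasks such as notification classification, context-aware recommendations, and email writing.

Let's build an application with Elastic and OpenELM that pursues the same goal!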

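The application flow is as follows: the user's question is used to retrieve relevant documents from Elastic, and the question and documents are then sent to the OpenELM model in a single prompt to generate the answer.

Steps

  • Deploy the model
  • Index the data
  • Test the model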

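Deploying the model

The first step is to deploy the model. You can find full information about the models here: https://huggingface.co/collections/apple/openelm-instruct-models-6619ad295d7ae9f868b759ca

We will use the instruct models because we want our model to follow instructions rather than hold a conversation. Instruct models are trained for one-off requests rather than for dialogue.

First, we need to clone the repository: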

git clone https://huggingface.co/apple/OpenELM

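Next, you need a HuggingFace access token.

You also need to request access to the Llama-2-7b model on HuggingFace in order to use the OpenELM tokenizer.

Then run the following command from the repository folder you just cloned: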

python generate_openelm.py --model apple/OpenELM-270M-Instruct --hf_access_token [HF_ACCESS_TOKEN] --prompt 'Once upon a time there was' --generate_kwargs repetition_penalty=1.2 prompt_lookup_num_tokens=10

You should get a response similar to this:

Once upon a time there was a man named John Smith. He had been born in the small town of Pine Bluff, Arkansas, and raised by his single mother, Mary Ann. John's family moved to California when he was young, settling in San Francisco where he attended high school. After graduating from high school, John enlisted in the U.S. Army as a machine gunner. John's first assignment took him to Germany, serving with the 1st Battalion, 12th Infantry Regiment. During this time, John learned German and quickly became fluent in the language. In fact, it took him only two months to learn all 3,000 words of the alphabet. John's love for learning led him to attend college at Stanford University, majoring in history. While attending school, John also served as a rifleman in the 1st Armored Division. After completing his undergraduate education, John returned to California to join the U.S. Navy. Upon his return to California, John married Mary Lou, a local homemaker. They raised three children: John Jr., Kathy, and Sharon. John enjoyed spending time with

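Done! We can now send instructions from the command line, but we want the model to work with our own information.

Indexing data

Now we will index some documents in Elastic to use with the model.

To take full advantage of semantic search, make sure the ELSER model is deployed through an inference endpoint: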

PUT _inference/sparse_embedding/my-elser-model 
{
  "service": "elser", 
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  }
}

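If this is your first time using ELSER, you may need to wait a while. You can check the deployment progress in Kibana > Machine Learning > Trained Models.

Tip: if you have not deployed an ELSER model yet, read the article "Elasticsearch: Deploying ELSER - Elastic Learned Sparse Encoder" for details.

Now we will create the index, which represents the data on the phone that the assistant can access.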

PUT mobile-assistant
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "english"
      },
      "description": {
        "type": "text",
        "analyzer": "english", 
        "copy_to": "semantic_field"
      },
      "semantic_field": {
        "type": "semantic_text",
        "inference_id": "my-elser-model"
      }
    }
  }
}

We use copy_to so that the content of the description field is available for both full-text search and semantic search. Now let's add the documents:

POST _bulk
{ "index" : { "_index" : "mobile-assistant", "_id": "email1"} }
{ "title": "Team Meeting Agenda", "description": "Hello team, Let's discuss our project progress in tomorrow's meeting. Please prepare your updates. Best regards, Manager" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email2"} }
{ "title": "Client Proposal Draft", "description": "Hi, I've attached the draft of our client proposal. Could you review it and provide feedback? Thanks, Colleague" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email3"} }
{ "title": "Weekly Newsletter", "description": "This week in tech: AI advancements, new smartphone releases, and cybersecurity updates. Read more on our website!" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email4"} }
{ "title": "Urgent: Project Deadline Update", "description": "Dear team, Due to recent developments, we need to move up our project deadline. The new submission date is next Friday. Please adjust your schedules accordingly and let me know if you foresee any issues. We'll discuss this in detail during our next team meeting. Best regards, Project Manager" }
{ "index" : { "_index" : "mobile-assistant", "_id": "email5"} }
{ "title": "Invitation: Company Summer Picnic", "description": "Hello everyone, We're excited to announce our annual company summer picnic! It will be held on Saturday, July 15th, at Sunny Park. There will be food, games, and activities for all ages. Please RSVP by replying to this email with the number of guests you'll be bringing. We look forward to seeing you there! Best, HR Team" }
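If you would rather generate the bulk payload from Python than type it by hand, the newline-delimited format shown above can be produced with a small helper. This is a minimal sketch: `build_bulk_body` is a hypothetical helper name, not part of any Elastic library, and it only builds the request body as a string.

```python
import json
from typing import Dict


def build_bulk_body(index_name: str, docs: Dict[str, dict]) -> str:
    """Build an NDJSON body for the _bulk API.

    Each document becomes two lines: an "index" action line carrying the
    index name and document id, followed by the document source itself.
    """
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


emails = {
    "email1": {
        "title": "Team Meeting Agenda",
        "description": "Hello team, Let's discuss our project progress in tomorrow's meeting.",
    },
}
print(build_bulk_body("mobile-assistant", emails))
```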

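Testing the model

Now that we have both the data and the model, we only need to connect the two so the model can do the work we need.

We start by creating a function that builds our system prompt. Since this is an instruct model, it does not need a conversation; it receives an instruction and returns a result.

We will use a chat template to format the prompt.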

def build_prompt(question, elasticsearch_documents):
    docs_text = "\n".join([
        f"Title: {doc['title']}\nDescription: {doc['description']}"
        for doc in elasticsearch_documents
    ])
    
    prompt = f"""<|system|>
    You are Elastic Intelligence (EI), a virtual assistant on a cell phone. Answer questions about emails concisely and accurately. 
    You can only answer based on the context provided by the user.</s>
    <|user|>
    CONTEXT:
    {docs_text}
    
    QUESTION: 
    {question} </s>
    <|assistant|>"""
    
    return prompt

Now let's add a function that uses semantic search to fetch the relevant documents from Elastic based on the user's question:

def retrieve_documents(question):
    search_body = {
        "query": {
            "semantic": {
                "query": question,
                "field": "semantic_field"
            }
        }
    }
    response = client.search(index=index_name, body=search_body)
    return [hit["_source"] for hit in response["hits"]["hits"]]
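Because the description field is indexed as text and also copied into semantic_field, the same index could serve a hybrid query that combines both kinds of search. The following is a sketch of such a request body, not something the article's notebook is known to use; `build_hybrid_query` is a hypothetical helper name.

```python
def build_hybrid_query(question: str) -> dict:
    """Build a search body that combines lexical and semantic matching.

    The bool/should clause lets a document score on either the full-text
    match against "description" or the semantic match against
    "semantic_field" (field names taken from the mapping above).
    """
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"description": {"query": question}}},
                    {"semantic": {"field": "semantic_field", "query": question}},
                ]
            }
        }
    }


print(build_hybrid_query("When is the project deadline?"))
```

The returned dictionary can be passed as the search body in place of search_body in retrieve_documents.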

Now let's try the request "Summarize my emails". To make sending the prompt easier, we will call the generate function from the generate_openelm.py file instead of using the CLI.

from OpenELM.generate_openelm import generate

output_text, generation_time = generate(
    prompt=prompt,
    model=MODEL,
    hf_access_token=HUGGINGFACE_TOKEN,
)

print("-----GENERATION TIME-----")
print(f'\033[92m {round(generation_time, 2)} \033[0m')
print("-----RESPONSE-----")
print(output_text)
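The retrieval, prompt-building, and generation steps can be wired into a single call. The sketch below takes the three steps as arguments so it stays independent of any particular client or model; `answer_question` is a hypothetical name, and the stand-in functions in the example exist only to show the data flow.

```python
from typing import Callable, Dict, List, Tuple


def answer_question(
    question: str,
    retrieve: Callable[[str], List[Dict[str, str]]],
    build_prompt: Callable[[str, List[Dict[str, str]]], str],
    generate: Callable[[str], Tuple[str, float]],
) -> Tuple[str, float]:
    """Run one RAG round trip: retrieve documents, build the prompt, generate."""
    documents = retrieve(question)
    prompt = build_prompt(question, documents)
    return generate(prompt)


# Stand-ins for illustration only. In the article's setup you would pass
# retrieve_documents, build_prompt, and a small wrapper such as
# lambda p: generate(prompt=p, model=MODEL, hf_access_token=HUGGINGFACE_TOKEN).
def fake_retrieve(question):
    return [{"title": "Weekly Newsletter", "description": "This week in tech."}]


def fake_build_prompt(question, documents):
    return f"{len(documents)} document(s) | {question}"


def fake_generate(prompt):
    return f"echo: {prompt}", 0.0


print(answer_question("Summarize my emails", fake_retrieve, fake_build_prompt, fake_generate))
```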

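The first answers varied and were not very good. In some cases we got the correct answer, but in others we did not. The model returned details of its reasoning, HTML code, or people who were not mentioned in the context.

If we restrict the questions to yes/no answers, the model performs better. This makes sense: it is a small model with limited reasoning ability.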
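When questions are restricted to yes/no, the model's free-text reply still has to be reduced to a boolean. A minimal sketch of that step follows; `parse_yes_no` is a hypothetical helper, and it assumes it receives only the model's completion (if your generation function echoes the prompt, strip it first).

```python
import re
from typing import Optional


def parse_yes_no(output_text: str) -> Optional[bool]:
    """Return True for "yes", False for "no", None if neither word appears.

    Only the first standalone "yes" or "no" counts, so words such as
    "not" or "know" are not mistaken for an answer.
    """
    match = re.search(r"\b(yes|no)\b", output_text, flags=re.IGNORECASE)
    if match is None:
        return None
    return match.group(1).lower() == "yes"


print(parse_yes_no("Yes, the deadline was moved to next Friday."))
print(parse_yes_no("I do not know."))
```

Returning None for unclear replies lets the caller retry or fall back instead of guessing.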

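Now let's try a classification task:

We can see that it needed a small loop, but the model was able to classify the emails correctly. This makes the model attractive for tasks such as classifying emails or notifications by topic or relevance. Another important thing to note is how sensitive this kind of model is to prompt variations. Small details, such as how the task is described, can make the answers differ widely.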

Try different prompts until you get the desired result.
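The classification code itself is not reproduced in this text, so the following is a hypothetical sketch of what such a loop could look like, not the author's implementation. The category labels, the prompt wording, and the `generate` callable (assumed to return only the model's completion as a string) are all assumptions for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical labels; choose ones that fit your own data.
CATEGORIES = ["work", "newsletter", "social"]


def build_classification_prompt(email: Dict[str, str], categories: List[str]) -> str:
    """Format one email as a single-instruction prompt using the chat template."""
    lines = [
        "<|system|>",
        "You classify emails. Reply with exactly one category from: "
        + ", ".join(categories) + ".</s>",
        "<|user|>",
        f"Title: {email['title']}",
        f"Description: {email['description']}</s>",
        "<|assistant|>",
    ]
    return "\n".join(lines)


def classify_emails(
    emails: List[Dict[str, str]],
    generate: Callable[[str], str],
    categories: List[str] = CATEGORIES,
) -> Dict[str, str]:
    """Classify each email with one model call per email."""
    results = {}
    for email in emails:
        output = generate(build_classification_prompt(email, categories)).lower()
        # Pick the first category (in list order) that appears in the reply,
        # or "unknown" when the reply names none of them.
        results[email["title"]] = next((c for c in categories if c in output), "unknown")
    return results


# Stand-in model for illustration: always answers "newsletter".
print(classify_emails(
    [{"title": "Weekly Newsletter", "description": "This week in tech."}],
    generate=lambda prompt: "Newsletter",
))
```

Since the model is sensitive to wording, the system line in build_classification_prompt is the first thing to vary when results are poor.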

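Conclusion

Although the OpenELM models do not try to compete at the commercial level, they offer an interesting alternative for experimental scenarios, because they openly provide the complete training pipeline and a highly customizable framework you can use with your own data. They are a good fit for developers who need offline, customized, and efficient models.

The results may not be as impressive as those of other models, but the option to train this model from scratch is very attractive. In addition, the possibility of migrating it to Apple Silicon using CoreNet opens the door to creating optimized local models for Apple devices. If you are interested in how to migrate OpenELM to Apple Silicon processors, check out this repo.

Elasticsearch includes many new features to help you build the best search solution for your use case. Dive into our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now.

Original article: Using Elastic and Apple's OpenELM models for RAG systems - Search Labs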
