Putting Open-Source Models into Practice - Advanced LangChain - Knowledge-Graph-Enhanced Memory

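1. Preface

Calling a model through the LangChain framework lets users ask questions or issue instructions directly, without worrying about the individual steps or the overall flow. LangChain breaks a task into subtasks and passes each one to a suitable language model for processing.

This article uses the ConversationKGMemory component so that LangChain can handle conversations better and give smarter, more accurate responses, improving both the performance of the dialogue system and the user experience.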

2. Terminology

2.1.LangChain

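LangChain is a full-featured application development toolkit built on the predictive capability of large language models. Its prebuilt chains work like Lego bricks: whether you are a beginner or an experienced developer, you can pick the pieces that fit and assemble a project quickly. For developers who want to go deeper, LangChain's modular components let you customize and build the chains in your application to match your own requirements.

In essence, LangChain is a wrapper around the APIs exposed by various large models: a set of frameworks, modules, and interfaces built to make those APIs easier to use.

Main features of LangChain:
       1.It can connect to many data sources, such as web links, local PDF files, and vector databases.
       2.It allows a language model to interact with its environment.
       3.It wraps core components such as Model I/O (input/output), Retrieval (retrievers), Memory, and Agents (decision-making and scheduling).
       4.These components can be assembled into chains to best fit a specific use case.
       5.Built around these design principles, LangChain addresses several practical pain points in developing AI applications today.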

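2.2.ConversationKGMemory Component

ConversationKGMemory is a component of the LangChain framework whose main job is to process and manage the knowledge and memory associated with a conversation.

Specifically, it may provide the following functions:

Knowledge storage: stores knowledge related to the conversation, such as common questions, answers, and domain knowledge.
Memory management: tracks the conversation history and context so that later turns can receive relevant responses.
Knowledge retrieval: quickly retrieves and extracts relevant knowledge based on the current conversational context.
Personalized interaction: tailors responses to the user's preferences and history.
Better response accuracy: uses the stored knowledge to make the dialogue system's responses more accurate and reliable.
Multi-turn support: helps handle multi-turn conversations and keeps them continuous and consistent.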

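2.3.Knowledge Graph

A knowledge graph is a structured way of representing knowledge. It describes real-world entities, concepts, relationships, and attributes, and organizes them as a graph. It is a kind of semantic network designed to capture and present the connections between pieces of knowledge.

A knowledge graph usually consists of three main parts:

Entities: concrete objects, things, or concepts in the real world, such as people, places, organizations, events, and products.
Attributes: features or properties of an entity, such as a person's name, age, or occupation.
Relationships: connections or associations between entities, such as kinship between people or employment between a company and its employees.

By tying entities, attributes, and relationships together, a knowledge graph forms a network in which nodes represent entities and edges represent the relationships between them. This representation carries rich semantic information and supports querying, reasoning over, and analyzing the knowledge.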
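The structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `TinyKnowledgeGraph`, its methods, and the sample data are hypothetical and are not part of LangChain.

```python
from collections import defaultdict


class TinyKnowledgeGraph:
    """Hypothetical minimal store: entities are nodes, relationships are labelled edges."""

    def __init__(self):
        # subject -> list of (predicate, object) edges
        self._edges = defaultdict(list)
        # entity -> {attribute name: value}
        self._attributes = defaultdict(dict)

    def add_attribute(self, entity, name, value):
        """Attach an attribute (e.g. occupation) to an entity."""
        self._attributes[entity][name] = value

    def add_relation(self, subject, predicate, obj):
        """Add a directed, labelled edge; duplicate edges are ignored."""
        edge = (predicate, obj)
        if edge not in self._edges[subject]:
            self._edges[subject].append(edge)

    def relations_of(self, subject):
        """Return every (subject, predicate, object) triple starting at subject."""
        return [(subject, p, o) for p, o in self._edges.get(subject, [])]

    def attributes_of(self, entity):
        """Return a copy of the entity's attributes."""
        return dict(self._attributes.get(entity, {}))


# Illustrative data only
kg = TinyKnowledgeGraph()
kg.add_attribute("Zhang San", "occupation", "engineer")
kg.add_relation("Zhang San", "is the father of", "Li Si")
kg.add_relation("Zhang San", "works for", "Acme")
print(kg.relations_of("Zhang San"))
```

Storing each relationship as a (subject, predicate, object) triple makes "what do we know about this entity" a simple lookup on the subject.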

3. Prerequisites

3.1.Set Up a Virtual Environment

conda create --name langchain python=3.10
conda activate langchain
# The -c option specifies which channels to install from
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
pip install langchain accelerate numpy openai

4. Implementation

4.1.Code

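# -*- coding: utf-8 -*-
import os
import warnings

# Explicit submodule imports; the top-level `from langchain import ...` form is deprecated in newer releases.
from langchain.chains import ConversationChain
from langchain.llms import OpenAI
from langchain.memory import ConversationKGMemory
from langchain.prompts import PromptTemplate

warnings.filterwarnings("ignore")

# {history} receives the knowledge retrieved from the graph; {input} is the user's message.
template = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context.
If the AI does not know the answer to a question, it truthfully says it does not know. The AI only uses information contained in the "Relevant information" section and does not hallucinate.

Relevant information:
{history}

Conversation:
Human: {input}
AI:"""

prompt = PromptTemplate(
    input_variables=["history", "input"], template=template
)

# Placeholder key; replace it with your own.
API_KEY = "sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
os.environ["OPENAI_API_KEY"] = API_KEY

llm = OpenAI(model_name='gpt-3.5-turbo-1106', temperature=0.35, max_tokens=256)

# The same model is used to extract entities and knowledge triples.
memory = ConversationKGMemory(llm=llm)

conversation_with_kg = ConversationChain(
    llm=llm,
    verbose=True,
    prompt=prompt,
    memory=memory
)

print("-----------------------------------Round 1--------------------------------------------")
print(conversation_with_kg.predict(input="Let me tell you something: on a stormy night thirty years ago, a man named Zhang San had a son, Li Si."))


print("-----------------------------------Round 2--------------------------------------------")
print(conversation_with_kg.predict(input="Who is Li Si's father?"))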

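Output:

How the call is processed:

conversation_with_kg.predict(input="Let me tell you something: on a stormy night thirty years ago, a man named Zhang San had a son, Li Si.")

The line above triggers three model inferences in total:

First, it extracts the entities from the input: ['Zhang San', 'Li Si']. For the prompt used, see 5.1.Entity Extraction Template.

Second, it runs the conversation on the input. The AI replies: Sorry, I don't know the story of Zhang San and Li Si.

Third, it extracts the knowledge triples: ['Zhang San had a son, Li Si']. For the prompt used, see 5.2.Knowledge Triple Extraction Template.

Note that the knowledge triples are extracted only after the conversation turn has completed.

Mapped onto the code, the flow is: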

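------------------------------------------------------Round 1------------------------------------------------------

1) Extract the entities from the input: ['Zhang San', 'Li Si']

2) Fetch the cached knowledge triples: []

3) Run the conversation; the prompt is:

4) Extract the triples from the conversation and cache them: ['Zhang San had a son, Li Si']

------------------------------------------------------Round 2------------------------------------------------------

1) Extract the entities from the input: ['Zhang San', 'Li Si']

2) Fetch the cached knowledge triples: ['Zhang San had a son, Li Si']

3) Run the conversation; the prompt is:

4) Extract the triples from the conversation and cache them: ["Li Si's father is Zhang San"]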
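The four steps of each round can be simulated without calling any model. The sketch below is a simplified illustration, not LangChain's implementation: `extract_entities`, `extract_triples`, `answer`, `ToyKGMemory`, and `run_round` are all hypothetical names, and the three model calls are replaced by hard-coded stand-ins.

```python
def extract_entities(text, known_names):
    """Stand-in for model call 1: return the known names that appear in the text."""
    return [name for name in known_names if name in text]


def answer(prompt):
    """Stand-in for model call 2: a real chain would send the prompt to the LLM."""
    return "(model reply)"


def extract_triples(text):
    """Stand-in for model call 3: a canned lookup instead of an LLM."""
    canned = {
        "Zhang San had a son, Li Si.": [("Zhang San", "is the father of", "Li Si")],
    }
    return canned.get(text, [])


class ToyKGMemory:
    """Caches triples and returns those that mention any of the given entities."""

    def __init__(self):
        self.triples = []

    def load(self, entities):
        return [t for t in self.triples if t[0] in entities or t[2] in entities]

    def save(self, new_triples):
        for triple in new_triples:
            if triple not in self.triples:
                self.triples.append(triple)


def run_round(memory, user_input, known_names):
    entities = extract_entities(user_input, known_names)  # step 1
    context = memory.load(entities)                       # step 2
    prompt = f"Relevant information:\n{context}\n\nHuman: {user_input}\nAI:"
    reply = answer(prompt)                                # step 3
    memory.save(extract_triples(user_input))              # step 4, after the reply
    return context, reply


memory = ToyKGMemory()
names = ["Zhang San", "Li Si"]
first_context, _ = run_round(memory, "Zhang San had a son, Li Si.", names)
second_context, _ = run_round(memory, "Who is Li Si's father?", names)
print(first_context)   # nothing cached yet in round 1
print(second_context)  # the triple stored at the end of round 1
```

Because the triple is saved only in step 4, the first round sees an empty context, and the fact becomes available from the second round onward.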

5. Additional Notes

5.1.Entity Extraction Template

You are an AI assistant reading the transcript of a conversation between an AI and a human. Extract all of the proper nouns from the last line of conversation. As a guideline, a proper noun is generally capitalized. You should definitely extract all names and places.

The conversation history is provided just in case of a coreference (e.g. "What do you know about him" where "him" is defined in a previous line) -- ignore items mentioned there that are not in the last line.

Return the output as a single comma-separated list, or NONE if there is nothing of note to return (e.g. the user is just issuing a greeting or having a simple conversation).

EXAMPLE
Conversation history:
Person #1: how's it going today?
AI: "It's going great! How about you?"
Person #1: good! busy working on Langchain. lots to do.
AI: "That sounds like a lot of work! What kind of things are you doing to make Langchain better?"
Last line:
Person #1: i'm trying to improve Langchain's interfaces, the UX, its integrations with various products the user might want ... a lot of stuff.
Output: Langchain
END OF EXAMPLE

EXAMPLE
Conversation history:
Person #1: how's it going today?
AI: "It's going great! How about you?"
Person #1: good! busy working on Langchain. lots to do.
AI: "That sounds like a lot of work! What kind of things are you doing to make Langchain better?"
Last line:
Person #1: i'm trying to improve Langchain's interfaces, the UX, its integrations with various products the user might want ... a lot of stuff. I'm working with Person #2.
Output: Langchain, Person #2
END OF EXAMPLE

Conversation history (for reference only):
{history}
Last line of conversation (for extraction):
Human: {input}

Output:
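The template asks the model to return a single comma-separated list, or NONE. A minimal sketch of turning that reply into a Python list might look like the following; `parse_entities` is a hypothetical helper written for illustration, not a LangChain function.

```python
def parse_entities(output):
    """Split a comma-separated entity reply; 'NONE' or an empty reply yields []."""
    text = output.strip()
    if not text or text == "NONE":
        return []
    # Drop surrounding whitespace and skip empty items such as a trailing comma.
    return [item.strip() for item in text.split(",") if item.strip()]


print(parse_entities("Langchain, Person #2"))
print(parse_entities("NONE"))
```

The sketch assumes that entity names themselves contain no commas.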

5.2.Knowledge Triple Extraction Template

You are a networked intelligence helping a human track knowledge triples
 about all relevant people, things, concepts, etc. and integrating
 them with your knowledge stored within your weights
 as well as that stored in a knowledge graph.
 Extract all of the knowledge triples from the last line of conversation.
 A knowledge triple is a clause that contains a subject, a predicate,
 and an object. The subject is the entity being described,
 the predicate is the property of the subject that is being
 described, and the object is the value of the property.
 
EXAMPLE
Conversation history:
Person #1: Did you hear aliens landed in Area 51?
AI: No, I didn't hear that. What do you know about Area 51?
Person #1: It's a secret military base in Nevada.
AI: What do you know about Nevada?
Last line of conversation:
Person #1: It's a state in the US. It's also the number 1 producer of gold in the US.
Output: (Nevada, is a, state)<|>(Nevada, is in, US)<|>(Nevada, is the number 1 producer of, gold)
END OF EXAMPLE

EXAMPLE
Conversation history:
Person #1: Hello.
AI: Hi! How are you?
Person #1: I'm good. How are you?
AI: I'm good too.
Last line of conversation:
Person #1: I'm going to the store.
Output: NONE
END OF EXAMPLE

EXAMPLE
Conversation history:
Person #1: What do you know about Descartes?
AI: Descartes was a French philosopher, mathematician, and scientist who lived in the 17th century.
Person #1: The Descartes I'm referring to is a standup comedian and interior designer from Montreal.
AI: Oh yes, He is a comedian and an interior designer. He has been in the industry for 30 years. His favorite food is baked bean pie.
Last line of conversation:
Person #1: Oh huh. I know Descartes likes to drive antique scooters and play the mandolin.
Output: (Descartes, likes to drive, antique scooters)<|>(Descartes, plays, mandolin)
END OF EXAMPLE

Conversation history (for reference only):
{history}
Last line of conversation (for extraction):
Human: {input}
Output:
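The examples in this template separate triples with `<|>` and wrap each one in parentheses. A minimal sketch of parsing such a reply is shown below; `split_triples` is a hypothetical helper written for illustration, and it simply skips any chunk that does not split into exactly three comma-separated parts.

```python
DELIMITER = "<|>"


def split_triples(output):
    """Parse '(s, p, o)<|>(s, p, o)' into a list of 3-tuples; 'NONE' yields []."""
    text = output.strip()
    if not text or text == "NONE":
        return []
    triples = []
    for chunk in text.split(DELIMITER):
        inner = chunk.strip()
        # Remove one pair of surrounding parentheses if present.
        if inner.startswith("(") and inner.endswith(")"):
            inner = inner[1:-1]
        parts = [part.strip() for part in inner.split(",")]
        if len(parts) != 3:
            continue  # malformed chunk, or a field that itself contains a comma
        triples.append(tuple(parts))
    return triples


reply = "(Nevada, is a, state)<|>(Nevada, is in, US)<|>(Nevada, is the number 1 producer of, gold)"
for triple in split_triples(reply):
    print(triple)
```

Splitting on commas is the simplest choice but loses triples whose object contains a comma; a more robust parser would need a stricter output format.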