LangChain Deployment Component: LangServe

Source: 🦜️🏓 LangServe | 🦜️🔗 LangChain

LangServe 

Overview​

LangServe helps developers deploy LangChain runnables and chains as a REST API.

This library is integrated with FastAPI and uses pydantic for data validation.

In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChain.js.

Features​

  • Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
  • API docs page with JSONSchema and Swagger (insert example link)
  • Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
  • /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
  • Playground page at /playground with streaming output and intermediate steps
  • Built-in (optional) tracing to LangSmith, just add your API key (see Instructions)
  • All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio.
  • Use the client SDK to call a LangServe server as if it was a Runnable running locally (or call the HTTP API directly)
  • LangServe Hub

Limitations​

  • Client callbacks are not yet supported for events that originate on the server
  • OpenAPI docs will not be generated when using Pydantic V2. FastAPI does not support mixing Pydantic v1 and v2 namespaces. See the Pydantic section below for more details.

Hosted LangServe​

We will be releasing a hosted version of LangServe for one-click deployments of LangChain applications. Sign up here to get on the waitlist.

Security​

  • Vulnerability in versions 0.0.13 - 0.0.15: the playground endpoint allowed access to arbitrary files on the server. Resolved in 0.0.16.

Installation​

For both client and server:

pip install "langserve[all]"

or pip install "langserve[client]" for client code, and pip install "langserve[server]" for server code.

LangChain CLI 🛠️​

Use the LangChain CLI to bootstrap a LangServe project quickly.

To use the langchain CLI, make sure that you have a recent version of langchain-cli installed. You can install it with pip install -U langchain-cli.

langchain app new ../path/to/directory

Examples​

Get your LangServe instance started quickly with LangChain Templates.

For more examples, see the templates index or the examples directory.

Server​

Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.

#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes


app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(),
    path="/anthropic",
)

model = ChatAnthropic()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/joke",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)

Docs​

If you've deployed the server above, you can view the generated OpenAPI docs using:

⚠️ If using pydantic v2, docs will not be generated for invoke/batch/stream/stream_log. See Pydantic section below for more details.

curl localhost:8000/docs

Make sure to add the /docs suffix.

⚠️ The index page / is not defined by design, so curl localhost:8000 or visiting the URL will return a 404. If you want content at /, define an endpoint with @app.get("/").

Client​

Python SDK

from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")

joke_chain.invoke({"topic": "parrots"})

# or async
await joke_chain.ainvoke({"topic": "parrots"})

prompt = [
    SystemMessage(content='Act like either a cat or a parrot.'),
    HumanMessage(content='Hello!')
]

# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)

prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)

# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{ "topic": "parrots" }, { "topic": "cats" }])

In TypeScript (requires LangChain.js version 0.0.166 or later):

import { RemoteRunnable } from "langchain/runnables/remote";

const chain = new RemoteRunnable({
  url: `http://localhost:8000/joke/`,
});
const result = await chain.invoke({
  topic: "cats",
});

Python using requests:

import requests
response = requests.post(
    "http://localhost:8000/joke/invoke/",
    json={'input': {'topic': 'cats'}}
)
response.json()

You can also use curl:

curl --location --request POST 'http://localhost:8000/joke/invoke/' \
    --header 'Content-Type: application/json' \
    --data-raw '{
        "input": {
            "topic": "cats"
        }
    }'

Endpoints​

The following code:

...
add_routes(
  app,
  runnable,
  path="/my_runnable",
)

adds these endpoints to the server:

  • POST /my_runnable/invoke - invoke the runnable on a single input
  • POST /my_runnable/batch - invoke the runnable on a batch of inputs
  • POST /my_runnable/stream - invoke on a single input and stream the output
  • POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
  • GET /my_runnable/input_schema - json schema for input to the runnable
  • GET /my_runnable/output_schema - json schema for output of the runnable
  • GET /my_runnable/config_schema - json schema for config of the runnable

These endpoints match the LangChain Expression Language interface -- please reference this documentation for more details.
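
As a quick sanity check, these endpoints can be exercised directly. Here's a minimal sketch against the /joke chain from the server example above (the exact response shape may vary by LangServe version):

import requests

# Fetch the JSON schema inferred for the chain's input.
print(requests.get("http://localhost:8000/joke/input_schema").json())

# Stream a single invocation; the endpoint emits server-sent events.
with requests.post(
    "http://localhost:8000/joke/stream",
    json={"input": {"topic": "cats"}},
    stream=True,
) as response:
    for line in response.iter_lines():
        if line:
            print(line.decode("utf-8"))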

Playground​

You can find a playground page for your runnable at /my_runnable/playground. This exposes a simple UI to configure and invoke your runnable with streaming output and intermediate steps.

Widgets​

The playground supports widgets and can be used to test your runnable with different inputs.

In addition, for configurable runnables, the playground will allow you to configure the runnable and share a link with the configuration.

Legacy Chains​

LangServe works with both Runnables (constructed via the LangChain Expression Language) and legacy chains (inheriting from Chain). However, some of the input schemas for legacy chains may be incomplete or incorrect, leading to errors. This can be fixed by updating the input_schema property of those chains in LangChain. If you encounter any errors, please open an issue on this repo, and we will work to address it.
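
For instance, a legacy LLMChain can be passed to add_routes just like a runnable. This is a hedged sketch (the chain and path below are illustrative, not from the original docs); checking the generated /input_schema endpoint against what the chain actually expects is advisable:

from fastapi import FastAPI
from langchain.chains import LLMChain
from langchain.chat_models import ChatAnthropic
from langchain.prompts import ChatPromptTemplate
from langserve import add_routes

app = FastAPI()

# A legacy chain (inherits from Chain) rather than an LCEL runnable.
legacy_chain = LLMChain(
    llm=ChatAnthropic(),
    prompt=ChatPromptTemplate.from_template("tell me a joke about {topic}"),
)

add_routes(app, legacy_chain, path="/legacy_joke")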

Deployment​

Deploy to GCP​

You can deploy to GCP Cloud Run using the following command:

gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key

Pydantic​

LangServe provides support for Pydantic 2 with some limitations.

  1. OpenAPI docs will not be generated for invoke/batch/stream/stream_log when using Pydantic V2. FastAPI does not support mixing Pydantic v1 and v2 namespaces.
  2. LangChain uses the v1 namespace in Pydantic v2. Please read the following guidelines to ensure compatibility with LangChain.

Except for these limitations, we expect the API endpoints, the playground and any other features to work as expected.
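
In practice, the second point means importing pydantic symbols through the v1 compatibility namespace when defining models that LangChain or LangServe will consume. A minimal sketch of that pattern (the model itself is hypothetical):

# Resolves to the same v1 API whether pydantic v1 or v2 is installed.
try:
    from pydantic.v1 import BaseModel, Field  # pydantic v2 environment
except ImportError:
    from pydantic import BaseModel, Field  # pydantic v1 environment


class JokeInput(BaseModel):
    """Hypothetical input model defined against the v1 namespace."""

    topic: str = Field(..., description="Subject of the joke")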

Advanced​

Handling Authentication​

If you need to add authentication to your server, please reference FastAPI's security documentation and middleware documentation.
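
LangServe defers authentication to FastAPI, so one minimal sketch uses only standard FastAPI features (the header name and token check below are placeholders, not part of LangServe):

from fastapi import Depends, FastAPI, Header, HTTPException


async def verify_token(x_token: str = Header(...)) -> None:
    """Placeholder check: reject requests missing the expected header."""
    if x_token != "expected-secret-token":
        raise HTTPException(status_code=401, detail="Invalid X-Token header")


# App-level dependencies apply to every route that add_routes registers.
app = FastAPI(dependencies=[Depends(verify_token)])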

Files​

LLM applications often deal with files. There are different architectures that can be used to implement file processing; at a high level:

  1. The file may be uploaded to the server via a dedicated endpoint and processed using a separate endpoint
  2. The file may be uploaded by either value (bytes of file) or reference (e.g., s3 url to file content)
  3. The processing endpoint may be blocking or non-blocking
  4. If significant processing is required, the processing may be offloaded to a dedicated process pool

You should determine the appropriate architecture for your application.

Currently, to upload files by value to a runnable, use base64 encoding for the file (multipart/form-data is not supported yet).

Here's an example that shows how to use base64 encoding to send a file to a remote runnable.

Remember, you can always upload files by reference (e.g., s3 url) or upload them as multipart/form-data to a dedicated endpoint.
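
A minimal client-side sketch of uploading by value (this assumes a runnable accepting a {"file": <base64 string>} input is mounted at /process_file, which is not part of the server example above):

import base64

import requests

# Base64-encode the file bytes, since multipart/form-data is not supported.
with open("my_document.pdf", "rb") as f:
    encoded_file = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://localhost:8000/process_file/invoke",
    json={"input": {"file": encoded_file}},
)
print(response.json())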

Custom Input and Output Types​

Input and Output types are defined on all runnables.

You can access them via the input_schema and output_schema properties.

LangServe uses these types for validation and documentation.

If you want to override the default inferred types, you can use the with_types method.

Here's a toy example to illustrate the idea:

from typing import Any

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes

app = FastAPI()


def func(x: Any) -> int:
    """Mistyped function that should accept an int but accepts anything."""
    return x + 1


runnable = RunnableLambda(func).with_types(
    input_type=int,
)

add_routes(app, runnable)

Custom User Types​

Inherit from CustomUserType if you want the data to deserialize into a pydantic model rather than the equivalent dict representation.

At the moment, this type only works server-side and is used to specify desired decoding behavior. If you inherit from this type, the server will keep the decoded type as a pydantic model instead of converting it into a dict.

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes
from langserve.schema import CustomUserType

app = FastAPI()


class Foo(CustomUserType):
    bar: int


def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar

# Note that the input and output type are automatically inferred!
# You do not need to specify them.
# runnable = RunnableLambda(func).with_types( # <-- Not needed in this case
#     input_type=Foo,
#     output_type=int,
# )
add_routes(app, RunnableLambda(func), path="/foo")

Playground Widgets​

The playground allows you to define custom widgets for your runnable from the backend.

  • A widget is specified at the field level and shipped as part of the JSON schema of the input type
  • A widget must contain a key called type with the value being one of a well-known list of widgets
  • Other widget keys will be associated with values that describe paths in a JSON object

General schema:

type JsonPath = number | string | (number | string)[];
type NameSpacedPath = { title: string; path: JsonPath }; // Using title to mimic JSON schema, but can use namespace
type OneOfPath = { oneOf: JsonPath[] };

type Widget = {
    type: string; // Some well-known type (e.g., base64file, chat, etc.)
    [key: string]: JsonPath | NameSpacedPath | OneOfPath;
};

File Upload Widget​

Allows creation of a file upload input in the UI playground for files that are uploaded as base64-encoded strings. Here's the full example.

Snippet:

try:
    from pydantic.v1 import Field
except ImportError:
    from pydantic import Field

from langserve import CustomUserType


# ATTENTION: Inherit from CustomUserType instead of BaseModel otherwise
#            the server will decode it into a dict instead of a pydantic model.
class FileProcessingRequest(CustomUserType):
    """Request including a base64 encoded file."""

    # The extra field is used to specify a widget for the playground UI.
    file: str = Field(..., extra={"widget": {"type": "base64file"}})
    num_chars: int = 100
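
To serve this type, wire it into a runnable like any other custom user type. A hedged sketch building on the FileProcessingRequest snippet above (the handler and path are illustrative):

import base64

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes

app = FastAPI()


def process_file(request: FileProcessingRequest) -> int:
    """Illustrative handler: decode the payload and report its size."""
    content = base64.b64decode(request.file.encode("utf-8"))
    return min(len(content), request.num_chars)


# The input type is inferred from the annotation, so the playground can
# render the base64file widget declared on FileProcessingRequest.
add_routes(app, RunnableLambda(process_file), path="/pdf")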
