
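Chatbot Evaluation as Multi-agent Simulation

When building a chatbot, such as a customer support assistant, it can be hard to evaluate its performance properly. Interacting with it manually and at length after every code change is time-consuming.

One way to make evaluation easier and more reproducible is to simulate the user's side of the interaction.

LangGraph makes this simulation easy to set up. Below is an example of how to create a "virtual user" that simulates a conversation.

The overall simulation looks roughly like this:

[Diagram of the simulation]

Setup

First, let's install the required packages and set our API key.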

%%capture --no-stderr
%pip install -U langgraph langchain langchain_openai
import getpass
import os


def _set_if_undefined(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"Please provide your {var}")


_set_if_undefined("OPENAI_API_KEY")

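Set up LangSmith for LangGraph development

Sign up for LangSmith to quickly spot issues and improve the performance of your LangGraph projects. LangSmith lets you use trace data to debug, test, and monitor LLM applications built with LangGraph. See the LangSmith documentation for more on how to get started.

Define the Chatbot
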

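Next, we define our chatbot. For this notebook, we assume the bot's API accepts a list of messages and responds with a single message. If you want to change this, you only need to update this section and the chatbot node in the simulator below (`chat_bot_node`), which converts the messages into the format the bot expects.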

The implementation inside my_chat_bot is configurable and can even run on another system (for example, if your system is not written in Python).

from typing import List

import openai


# This is flexible, but you can define your agent here, or call your agent API here.
def my_chat_bot(messages: List[dict]) -> dict:
    system_message = {
        "role": "system",
        "content": "You are a customer support agent for an airline.",
    }
    messages = [system_message] + messages
    completion = openai.chat.completions.create(
        messages=messages, model="gpt-3.5-turbo"
    )
    return completion.choices[0].message.model_dump()
my_chat_bot([{"role": "user", "content": "hi!"}])
{'content': 'Hello! How can I assist you today?',
 'role': 'assistant',
 'function_call': None,
 'tool_calls': None}
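
If the chatbot runs as a separate service, `my_chat_bot` could be a thin HTTP client instead. The following is a minimal sketch and not part of the original notebook: the endpoint URL, the helper names `build_chat_request` and `my_remote_chat_bot`, and the JSON shape (`{"messages": [...]}` in, a single message dict out) are all hypothetical and would need to match your own service.

```python
import json
import urllib.request
from typing import List

# Hypothetical endpoint; replace with the URL of your own chatbot service.
CHAT_BOT_URL = "http://localhost:8000/chat"


def build_chat_request(
    messages: List[dict], url: str = CHAT_BOT_URL
) -> urllib.request.Request:
    """Build a POST request that carries the conversation as JSON."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def my_remote_chat_bot(messages: List[dict]) -> dict:
    """Send the conversation to the remote service and return its reply.

    Assumes the service answers with a JSON object such as
    {"role": "assistant", "content": "..."}.
    """
    request = build_chat_request(messages)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the function keeps the same "list of message dicts in, one message dict out" contract, the rest of the simulation would not need to change.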

Define the Simulated User

Now we define the simulated user. It can take any form we like, but here we build it as a LangChain bot.

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_openai import ChatOpenAI

system_prompt_template = """You are a customer of an airline company. \
You are interacting with a user who is a customer support person. \

{instructions}

When you are finished with the conversation, respond with a single word 'FINISHED'"""

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt_template),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
instructions = """Your name is Harrison. You are trying to get a refund for the trip you took to Alaska. \
You want them to give you ALL the money back. \
This trip happened 5 years ago."""

prompt = prompt.partial(name="Harrison", instructions=instructions)

model = ChatOpenAI()

simulated_user = prompt | model

API Reference: ChatPromptTemplate | MessagesPlaceholder

from langchain_core.messages import HumanMessage

messages = [HumanMessage(content="Hi! How can I help you?")]
simulated_user.invoke({"messages": messages})

API Reference: HumanMessage

AIMessage(content='Hi, I would like to request a refund for a trip I took with your airline company to Alaska. Is it possible to get a refund for that trip?')

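Define the Agent Simulation

The code below creates a LangGraph workflow that runs the simulation. The main components are:

  1. Two nodes: one for the simulated user, the other for the chatbot.
  2. The graph itself, with a conditional stopping criterion.

See the comments in the code below for more information.

Define nodes

First, we define the nodes in the graph. Each node takes a list of messages and returns a list of messages to add to the state. These nodes are wrappers around the chatbot and the simulated user defined above.

Note: One tricky point is deciding which messages belong to which side. Because both the chatbot and our simulated user are LLMs, both respond with AI messages. Our state is a list of alternating human and AI messages, so one of the nodes needs logic that flips the AI and human roles. In this example, we treat human messages as the messages from the simulated user. This means the simulated user node needs logic to swap AI and human messages.

First, let's define the chatbot node.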

from langchain_community.adapters.openai import convert_message_to_dict
from langchain_core.messages import AIMessage


def chat_bot_node(state):
    messages = state["messages"]
    # Convert from LangChain format to the OpenAI format, which our chatbot function expects.
    messages = [convert_message_to_dict(m) for m in messages]
    # Call the chat bot
    chat_bot_response = my_chat_bot(messages)
    # Respond with an AI Message
    return {"messages": [AIMessage(content=chat_bot_response["content"])]}

API Reference: convert_message_to_dict | AIMessage

Next, let's define the node for the simulated user. This involves a little logic to swap the roles of the messages.

def _swap_roles(messages):
    new_messages = []
    for m in messages:
        if isinstance(m, AIMessage):
            new_messages.append(HumanMessage(content=m.content))
        else:
            new_messages.append(AIMessage(content=m.content))
    return new_messages


def simulated_user_node(state):
    messages = state["messages"]
    # Swap roles of messages
    new_messages = _swap_roles(messages)
    # Call the simulated user
    response = simulated_user.invoke({"messages": new_messages})
    # This response is an AI message - we need to flip this to be a human message
    return {"messages": [HumanMessage(content=response.content)]}
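
To see what the swap does, here is a small self-contained illustration that uses plain `(role, content)` tuples as stand-ins for `HumanMessage` and `AIMessage`. The `swap_roles_plain` helper is hypothetical and only mirrors the logic of `_swap_roles` above.

```python
def swap_roles_plain(messages):
    """Mirror of _swap_roles for (role, content) tuples: 'ai' <-> 'human'."""
    swapped = []
    for role, content in messages:
        new_role = "human" if role == "ai" else "ai"
        swapped.append((new_role, content))
    return swapped


# State as stored in the graph: the chatbot speaks as 'ai',
# the simulated user as 'human'.
state_messages = [
    ("ai", "How may I help you?"),
    ("human", "I would like a refund."),
]

# View handed to the simulated user's model: its own past turns become 'ai'
# and the chatbot's turns become 'human'.
print(swap_roles_plain(state_messages))
# [('human', 'How may I help you?'), ('ai', 'I would like a refund.')]
```

Swapping twice returns the original list, which is why the simulated user's reply only needs to be wrapped back into a human message before it is stored in the state.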

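Define edges

Now we need to define the logic for the edges. The main decision happens after the simulated user speaks, and it should lead to one of two outcomes:

  • Either we continue and call the customer support bot
  • Or we end the conversation

So what is the logic for ending the conversation? We end it when the simulated user responds with FINISHED (see the system prompt) or when the conversation is more than 6 messages long (an arbitrary number chosen to keep this example short).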

def should_continue(state):
    messages = state["messages"]
    if len(messages) > 6:
        return "end"
    elif messages[-1].content == "FINISHED":
        return "end"
    else:
        return "continue"
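
Note that the equality check only fires when the entire message is exactly `FINISHED`. A model may instead append the keyword to a longer farewell, in which case the conversation runs for another round. A more lenient variant could be sketched as follows; `should_continue_lenient` and `max_messages` are names introduced here for illustration, and the messages are assumed to expose a string `.content` attribute.

```python
from types import SimpleNamespace


def should_continue_lenient(state, max_messages=6):
    """Stop when the cap is exceeded or the last message ends with FINISHED."""
    messages = state["messages"]
    if len(messages) > max_messages:
        return "end"
    # Guard against an empty history before looking at the last message.
    if messages and messages[-1].content.strip().endswith("FINISHED"):
        return "end"
    return "continue"


# Stand-in message objects for a quick check without calling any model.
farewell = SimpleNamespace(content="Thank you for your time.\n\nFINISHED")
question = SimpleNamespace(content="Can I get a refund?")
print(should_continue_lenient({"messages": [farewell]}))  # end
print(should_continue_lenient({"messages": [question]}))  # continue
```

Whether to prefer the strict or the lenient check is a design choice: the strict version follows the system prompt literally, while the lenient one tolerates models that do not.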

Define graph

Now we can define the graph that sets up the simulation!

from langgraph.graph import END, StateGraph, START
from langgraph.graph.message import add_messages
from typing import Annotated
from typing_extensions import TypedDict


class State(TypedDict):
    messages: Annotated[list, add_messages]


graph_builder = StateGraph(State)
graph_builder.add_node("user", simulated_user_node)
graph_builder.add_node("chat_bot", chat_bot_node)
# Every response from  your chat bot will automatically go to the
# simulated user
graph_builder.add_edge("chat_bot", "user")
graph_builder.add_conditional_edges(
    "user",
    should_continue,
    # If the finish criteria are met, we will stop the simulation,
    # otherwise, the virtual user's message will be sent to your chat bot
    {
        "end": END,
        "continue": "chat_bot",
    },
)
# The input will first go to your chat bot
graph_builder.add_edge(START, "chat_bot")
simulation = graph_builder.compile()
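
The control flow this graph encodes can also be written as a plain loop, which may help when reasoning about it. The sketch below is not LangGraph code: `run_simulation_plain` and the two stub callables are hypothetical stand-ins for the nodes, and messages are plain `(speaker, text)` tuples.

```python
def run_simulation_plain(chat_bot, user, max_messages=6):
    """Alternate chat_bot -> user until the user says FINISHED or the cap is hit."""
    messages = []
    while True:
        # START -> chat_bot, and "continue" -> chat_bot
        messages.append(("chat_bot", chat_bot(messages)))
        # chat_bot -> user (unconditional edge)
        messages.append(("user", user(messages)))
        # Conditional edge evaluated after the user node
        if len(messages) > max_messages or messages[-1][1] == "FINISHED":
            return messages


# Stub nodes: the user gives up on its second turn.
def stub_chat_bot(messages):
    return "How can I help?"


def stub_user(messages):
    user_turns = sum(1 for speaker, _ in messages if speaker == "user")
    return "I want a refund." if user_turns == 0 else "FINISHED"


transcript = run_simulation_plain(stub_chat_bot, stub_user)
print(len(transcript))  # 4
```

Since the stop condition is only checked after the user node, a user that never says FINISHED produces 8 messages before the cap of 6 takes effect.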

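Run Simulation

Now we can evaluate our chatbot! We can invoke the simulation with an empty message list, which simulates letting the chatbot open the conversation.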

for chunk in simulation.stream({"messages": []}):
    # Print out all events aside from the final end chunk
    if END not in chunk:
        print(chunk)
        print("----")
{'chat_bot': AIMessage(content='How may I assist you today regarding your flight or any other concerns?')}
----
{'user': HumanMessage(content='Hi, my name is Harrison. I am reaching out to request a refund for a trip I took to Alaska with your airline company. The trip occurred about 5 years ago. I would like to receive a refund for the entire amount I paid for the trip. Can you please assist me with this?')}
----
{'chat_bot': AIMessage(content="Hello, Harrison. Thank you for reaching out to us. I understand you would like to request a refund for a trip you took to Alaska five years ago. I'm afraid that our refund policy typically has a specific timeframe within which refund requests must be made. Generally, refund requests need to be submitted within 24 to 48 hours after the booking is made, or in certain cases, within a specified cancellation period.\n\nHowever, I will do my best to assist you. Could you please provide me with some additional information? Can you recall any specific details about the booking, such as the flight dates, booking reference or confirmation number? This will help me further look into the possibility of processing a refund for you.")}
----
{'user': HumanMessage(content="Hello, thank you for your response. I apologize for not requesting the refund earlier. Unfortunately, I don't have the specific details such as the flight dates, booking reference, or confirmation number at the moment. Is there any other way we can proceed with the refund request without these specific details? I would greatly appreciate your assistance in finding a solution.")}
----
{'chat_bot': AIMessage(content="I understand the situation, Harrison. Without specific details like flight dates, booking reference, or confirmation number, it becomes challenging to locate and process the refund accurately. However, I can still try to help you.\n\nTo proceed further, could you please provide me with any additional information you might remember? This could include the approximate date of travel, the departure and arrival airports, the names of the passengers, or any other relevant details related to the booking. The more information you can provide, the better we can investigate the possibility of processing a refund for you.\n\nAdditionally, do you happen to have any documentation related to your trip, such as receipts, boarding passes, or emails from our airline? These documents could assist in verifying your trip and processing the refund request.\n\nI apologize for any inconvenience caused, and I'll do my best to assist you further based on the information you can provide.")}
----
{'user': HumanMessage(content="I apologize for the inconvenience caused. Unfortunately, I don't have any additional information or documentation related to the trip. It seems that I am unable to provide you with the necessary details to process the refund request. I understand that this may limit your ability to assist me further, but I appreciate your efforts in trying to help. Thank you for your time. \n\nFINISHED")}
----
{'chat_bot': AIMessage(content="I understand, Harrison. I apologize for any inconvenience caused, and I appreciate your understanding. If you happen to locate any additional information or documentation in the future, please don't hesitate to reach out to us again. Our team will be more than happy to assist you with your refund request or any other travel-related inquiries. Thank you for contacting us, and have a great day!")}
----
{'user': HumanMessage(content='FINISHED')}
----
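
For later inspection it can be handy to flatten the streamed chunks into a simple transcript. The helper below is a sketch under assumptions: `collect_transcript` is a hypothetical name, and it accepts both chunk shapes, a bare message per node (as printed above) or a `{"messages": [...]}` update (what the node functions return). It expects the final end chunk to be filtered out beforehand, as in the loop above.

```python
from types import SimpleNamespace


def collect_transcript(chunks):
    """Flatten streamed chunks into a list of (node_name, text) pairs."""
    transcript = []
    for chunk in chunks:
        for node_name, update in chunk.items():
            # One shape is {"messages": [msg, ...]}; the other is a bare message.
            if isinstance(update, dict):
                new_messages = update.get("messages", [])
            else:
                new_messages = [update]
            for message in new_messages:
                transcript.append((node_name, message.content))
    return transcript


# Stand-in chunks for a quick check without running any model.
chunks = [
    {"chat_bot": {"messages": [SimpleNamespace(content="How may I help?")]}},
    {"user": SimpleNamespace(content="FINISHED")},
]
print(collect_transcript(chunks))
# [('chat_bot', 'How may I help?'), ('user', 'FINISHED')]
```

The resulting list of pairs can then be reviewed by hand or passed to whatever scoring step you use.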
