如何处理大量工具¶

前提条件

本指南假定您熟悉以下内容：

可调用的工具子集通常由模型自行决定（尽管许多提供者也允许用户指定或限制工具的选择）。随着可用工具数量的增长，您可能需要限制LLM选择的范围，以减少token消耗并帮助管理LLM推理中的错误来源。

在这里，我们将演示如何动态调整模型可用的工具。直接说重点：与RAG和类似方法一样，我们在调用模型之前通过检索可用工具来调整。虽然我们展示了一个基于工具描述进行搜索的实现方式，但工具选择的具体细节可以根据需要进行自定义。

设置¶

首先，让我们安装所需的包并设置我们的API密钥

pip install --quiet -U langgraph langchain_openai numpy

import getpass
import os


def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("OPENAI_API_KEY")

为 LangGraph 开发设置 LangSmith

注册 LangSmith 以快速发现并改进您的 LangGraph 项目的问题。LangSmith 允许您使用追踪数据来调试、测试和监控使用 LangGraph 构建的 LLM 应用 —— 有关如何入门的更多信息，请阅读此处。

定义工具¶

让我们考虑一个玩具示例，其中我们为S&P 500指数中每一家上市公司都提供一个工具。每个工具会根据作为参数提供的年份来获取特定公司的信息。

我们首先构建一个注册表，将每个工具的唯一标识符与它的模式相关联。我们将使用JSON schema来表示这些工具，它可以直接绑定到支持工具调用的聊天模型。

^{API Reference: StructuredTool}

import re
import uuid

from langchain_core.tools import StructuredTool


def create_tool(company: str) -> dict:
    """Create schema for a placeholder tool."""
    # Remove non-alphanumeric characters and replace spaces with underscores for the tool name
    formatted_company = re.sub(r"[^\w\s]", "", company).replace(" ", "_")

    def company_tool(year: int) -> str:
        # Placeholder function returning static revenue information for the company and year
        return f"{company} had revenues of $100 in {year}."

    return StructuredTool.from_function(
        company_tool,
        name=formatted_company,
        description=f"Information about {company}",
    )


# Abbreviated list of S&P 500 companies for demonstration
s_and_p_500_companies = [
    "3M",
    "A.O. Smith",
    "Abbott",
    "Accenture",
    "Advanced Micro Devices",
    "Yum! Brands",
    "Zebra Technologies",
    "Zimmer Biomet",
    "Zoetis",
]

# Create a tool for each company and store it in a registry with a unique UUID as the key
tool_registry = {
    str(uuid.uuid4()): create_tool(company) for company in s_and_p_500_companies
}

定义图¶

工具选择¶

我们将构建一个节点，根据状态中的信息（例如最近的用户消息）检索可用工具的子集。一般来说，此步骤可以使用完整的检索解决方案。作为一个简单的解决方案，我们在向量存储中对工具描述进行嵌入索引，并通过语义搜索将用户查询关联到工具。

^{API Reference: Document | InMemoryVectorStore | OpenAIEmbeddings}

from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

tool_documents = [
    Document(
        page_content=tool.description,
        id=id,
        metadata={"tool_name": tool.name},
    )
    for id, tool in tool_registry.items()
]

vector_store = InMemoryVectorStore(embedding=OpenAIEmbeddings())
document_ids = vector_store.add_documents(tool_documents)

与代理集成¶

我们将使用一个典型的 React 代理图（例如，在快速入门中使用的），并进行一些修改：

我们在状态中添加了一个 selected_tools 键，用于存储我们选择的工具子集；
我们将图的入口点设置为一个 select_tools 节点，该节点填充此状态元素；
我们在 agent 节点内将所选工具子集绑定到聊天模型。

^{API Reference: ChatOpenAI | StateGraph | START | add_messages | ToolNode | tools_condition}

from typing import Annotated

from langchain_openai import ChatOpenAI
from typing_extensions import TypedDict

from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition


# Define the state structure using TypedDict.
# It includes a list of messages (processed by add_messages)
# and a list of selected tool IDs.
class State(TypedDict):
    messages: Annotated[list, add_messages]
    selected_tools: list[str]


builder = StateGraph(State)

# Retrieve all available tools from the tool registry.
tools = list(tool_registry.values())
llm = ChatOpenAI()


# The agent function processes the current state
# by binding selected tools to the LLM.
def agent(state: State):
    # Map tool IDs to actual tools
    # based on the state's selected_tools list.
    selected_tools = [tool_registry[id] for id in state["selected_tools"]]
    # Bind the selected tools to the LLM for the current interaction.
    llm_with_tools = llm.bind_tools(selected_tools)
    # Invoke the LLM with the current messages and return the updated message list.
    return {"messages": [llm_with_tools.invoke(state["messages"])]}


# The select_tools function selects tools based on the user's last message content.
def select_tools(state: State):
    last_user_message = state["messages"][-1]
    query = last_user_message.content
    tool_documents = vector_store.similarity_search(query)
    return {"selected_tools": [document.id for document in tool_documents]}


builder.add_node("agent", agent)
builder.add_node("select_tools", select_tools)

tool_node = ToolNode(tools=tools)
builder.add_node("tools", tool_node)

builder.add_conditional_edges("agent", tools_condition, path_map=["tools", "__end__"])
builder.add_edge("tools", "agent")
builder.add_edge("select_tools", "agent")
builder.add_edge(START, "select_tools")
graph = builder.compile()

from IPython.display import Image, display

try:
    display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
    # This requires some extra dependencies and is optional
    pass

user_input = "Can you give me some information about AMD in 2022?"

result = graph.invoke({"messages": [("user", user_input)]})

print(result["selected_tools"])

['ab9c0d59-3d16-448d-910c-73cf10a26020', 'f5eff8f6-7fb9-47b6-b54f-19872a52db84', '2962e168-9ef4-48dc-8b7c-9227e7956d39', '24a9fb82-19fe-4a88-944e-47bc4032e94a']

for message in result["messages"]:
    message.pretty_print()

================================ Human Message =================================

Can you give me some information about AMD in 2022?
================================== Ai Message ==================================
Tool Calls:
  Advanced_Micro_Devices (call_CRxQ0oT7NY7lqf35DaRNTJ35)
 Call ID: call_CRxQ0oT7NY7lqf35DaRNTJ35
  Args:
    year: 2022
================================= Tool Message =================================
Name: Advanced_Micro_Devices

Advanced Micro Devices had revenues of $100 in 2022.
================================== Ai Message ==================================

In 2022, Advanced Micro Devices (AMD) had revenues of $100.

重复选择工具¶

为了处理由于工具选择错误导致的错误，我们可以重新审视 select_tools 节点。实现此功能的一种方法是修改 select_tools，使其使用状态中的所有消息生成向量存储查询（例如，使用聊天模型），并从 tools 添加一条边路由到 select_tools。

我们在下面实现了这一更改。为了演示目的，我们通过在 select_tools 节点中添加一个 hack_remove_tool_condition 来模拟初始工具选择中的错误，该条件会在节点第一次迭代时移除正确的工具。请注意，在第二次迭代时，代理可以访问正确的工具，因此会完成运行。

使用 Pydantic 与 LangChain

此笔记本使用 Pydantic v2 的 BaseModel，需要 langchain-core >= 0.3。使用 langchain-core < 0.3 会导致错误，因为混合了 Pydantic v1 和 v2 的 BaseModels。

^{API Reference: HumanMessage | SystemMessage | ToolMessage}

from langchain_core.messages import HumanMessage, SystemMessage, ToolMessage
from langgraph.pregel.retry import RetryPolicy

from pydantic import BaseModel, Field


class QueryForTools(BaseModel):
    """Generate a query for additional tools."""

    query: str = Field(..., description="Query for additional tools.")


def select_tools(state: State):
    """Selects tools based on the last message in the conversation state.

    If the last message is from a human, directly uses the content of the message
    as the query. Otherwise, constructs a query using a system message and invokes
    the LLM to generate tool suggestions.
    """
    last_message = state["messages"][-1]
    hack_remove_tool_condition = False  # Simulate an error in the first tool selection

    if isinstance(last_message, HumanMessage):
        query = last_message.content
        hack_remove_tool_condition = True  # Simulate wrong tool selection
    else:
        assert isinstance(last_message, ToolMessage)
        system = SystemMessage(
            "Given this conversation, generate a query for additional tools. "
            "The query should be a short string containing what type of information "
            "is needed. If no further information is needed, "
            "set more_information_needed False and populate a blank string for the query."
        )
        input_messages = [system] + state["messages"]
        response = llm.bind_tools([QueryForTools], tool_choice=True).invoke(
            input_messages
        )
        query = response.tool_calls[0]["args"]["query"]

    # Search the tool vector store using the generated query
    tool_documents = vector_store.similarity_search(query)
    if hack_remove_tool_condition:
        # Simulate error by removing the correct tool from the selection
        selected_tools = [
            document.id
            for document in tool_documents
            if document.metadata["tool_name"] != "Advanced_Micro_Devices"
        ]
    else:
        selected_tools = [document.id for document in tool_documents]
    return {"selected_tools": selected_tools}


graph_builder = StateGraph(State)
graph_builder.add_node("agent", agent)
graph_builder.add_node(
    "select_tools", select_tools, retry_policy=RetryPolicy(max_attempts=3)
)

tool_node = ToolNode(tools=tools)
graph_builder.add_node("tools", tool_node)

graph_builder.add_conditional_edges(
    "agent",
    tools_condition,
)
graph_builder.add_edge("tools", "select_tools")
graph_builder.add_edge("select_tools", "agent")
graph_builder.add_edge(START, "select_tools")
graph = graph_builder.compile()

from IPython.display import Image, display

try:
    display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
    # This requires some extra dependencies and is optional
    pass

user_input = "Can you give me some information about AMD in 2022?"

result = graph.invoke({"messages": [("user", user_input)]})

for message in result["messages"]:
    message.pretty_print()

================================ Human Message =================================

Can you give me some information about AMD in 2022?
================================== Ai Message ==================================
Tool Calls:
  Accenture (call_qGmwFnENwwzHOYJXiCAaY5Mx)
 Call ID: call_qGmwFnENwwzHOYJXiCAaY5Mx
  Args:
    year: 2022
================================= Tool Message =================================
Name: Accenture

Accenture had revenues of $100 in 2022.
================================== Ai Message ==================================
Tool Calls:
  Advanced_Micro_Devices (call_u9e5UIJtiieXVYi7Y9GgyDpn)
 Call ID: call_u9e5UIJtiieXVYi7Y9GgyDpn
  Args:
    year: 2022
================================= Tool Message =================================
Name: Advanced_Micro_Devices

Advanced Micro Devices had revenues of $100 in 2022.
================================== Ai Message ==================================

In 2022, AMD had revenues of $100.

下一步¶

本指南提供了一个用于动态选择工具的最小实现。还有许多可能的改进和优化方式：

重复选择工具：在这里，我们通过修改 select_tools 节点来实现重复选择工具。另一种方法是为代理配备一个 reselect_tools 工具，使其可以自行决定重新选择工具。
优化工具选择：一般来说，完整的检索解决方案可用于工具选择。其他选项包括：
将工具分组，并在分组中进行检索；
使用聊天模型来选择工具或工具组。