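How to handle large numbers of tools
The subset of tools available to call is generally left to the model's discretion (although many providers also let the user specify or constrain the choice of tool). As the number of available tools grows, you may want to limit the scope of the LLM's selection, both to reduce token consumption and to help manage sources of error in LLM reasoning.
Here we demonstrate how to dynamically adjust the tools made available to a model. In short: as with RAG and similar methods, we precede the model invocation with a retrieval step over the available tools. We demonstrate one implementation that searches over tool descriptions, but the details of tool selection can be customized as needed.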
Setup
First, let's install the required packages and set our API keys.
import getpass
import os


def _set_env(var: str):
    # Prompt for the variable only if it is not already set in the environment
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")


_set_env("OPENAI_API_KEY")
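Set up LangSmith for LangGraph development
Sign up for LangSmith to quickly spot and fix issues in your LangGraph projects and improve their performance. LangSmith lets you use trace data to debug, test, and monitor LLM applications built with LangGraph. To learn how to get started, see the LangSmith documentation.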
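Define the tools
Let's consider a toy example in which we have one tool for each publicly traded company in the S&P 500 index. Each tool fetches company-specific information based on the year provided as a parameter.
We first construct a registry that associates a unique identifier with each tool. Each tool is a StructuredTool whose schema is derived from the function signature, so it can be bound directly to chat models that support tool calling.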
import re
import uuid

from langchain_core.tools import StructuredTool


def create_tool(company: str) -> StructuredTool:
    """Create a placeholder tool for a company."""
    # Remove non-alphanumeric characters and replace spaces with underscores for the tool name
    formatted_company = re.sub(r"[^\w\s]", "", company).replace(" ", "_")

    def company_tool(year: int) -> str:
        # Placeholder function returning static revenue information for the company and year
        return f"{company} had revenues of $100 in {year}."

    return StructuredTool.from_function(
        company_tool,
        name=formatted_company,
        description=f"Information about {company}",
    )


# Abbreviated list of S&P 500 companies for demonstration
s_and_p_500_companies = [
    "3M",
    "A.O. Smith",
    "Abbott",
    "Accenture",
    "Advanced Micro Devices",
    "Yum! Brands",
    "Zebra Technologies",
    "Zimmer Biomet",
    "Zoetis",
]

# Create a tool for each company and store it in a registry with a unique UUID as the key
tool_registry = {
    str(uuid.uuid4()): create_tool(company) for company in s_and_p_500_companies
}
API Reference: StructuredTool
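To make the naming step in create_tool concrete, here is a small standalone sketch that applies the same regular expression to a few of the company names. The helper name format_tool_name is hypothetical and exists only for this illustration. The uniqueness check is there because tools are typically looked up by name when a call is executed, so two companies that collapse to the same name would be a problem.

```python
import re


def format_tool_name(company: str) -> str:
    """Hypothetical helper mirroring create_tool: drop punctuation, then underscore the spaces."""
    return re.sub(r"[^\w\s]", "", company).replace(" ", "_")


examples = ["3M", "A.O. Smith", "Yum! Brands", "Advanced Micro Devices"]
tool_names = {company: format_tool_name(company) for company in examples}

for company, name in tool_names.items():
    print(f"{company!r} -> {name!r}")
# '3M' -> '3M'
# 'A.O. Smith' -> 'AO_Smith'
# 'Yum! Brands' -> 'Yum_Brands'
# 'Advanced Micro Devices' -> 'Advanced_Micro_Devices'

# Sanity check: every company maps to a distinct tool name
names_are_unique = len(set(tool_names.values())) == len(tool_names)
print(names_are_unique)  # True
```

Note that in Python 3, \w also matches non-ASCII letters, so a company name containing accented characters would keep them. If your model provider restricts tool names to ASCII, you may need a stricter pattern.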
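Define the graph
Tool selection
We will construct a node that retrieves a subset of the available tools given the information in the state, such as the most recent user message. In general, the full range of retrieval solutions is available for this step. As a simple solution, we index embeddings of the tool descriptions in a vector store and associate user queries with tools via semantic search.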
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

# One document per tool: the description is embedded, the registry key is the document ID
tool_documents = [
    Document(
        page_content=tool.description,
        id=id,
        metadata={"tool_name": tool.name},
    )
    for id, tool in tool_registry.items()
]

vector_store = InMemoryVectorStore(embedding=OpenAIEmbeddings())
document_ids = vector_store.add_documents(tool_documents)
API Reference: Document | InMemoryVectorStore | OpenAIEmbeddings
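The retrieval step can be illustrated without any external service. The sketch below is a toy stand-in, not the implementation used in this guide: it replaces the embedding model with lowercase word counts and ranks tool descriptions by cosine similarity. The names embed, cosine, search, and the "tool-N" IDs are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real embedding model captures meaning; this does not."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical tool IDs mapped to descriptions in the same style as the registry
descriptions = {
    "tool-1": "Information about Accenture",
    "tool-2": "Information about Advanced Micro Devices",
    "tool-3": "Information about Zoetis",
}


def search(query: str, k: int = 2) -> list:
    """Return the IDs of the k descriptions most similar to the query."""
    q = embed(query)
    ranked = sorted(
        descriptions,
        key=lambda tool_id: cosine(q, embed(descriptions[tool_id])),
        reverse=True,
    )
    return ranked[:k]


top = search("revenue of advanced micro devices", k=1)
print(top)  # ['tool-2']

# A purely lexical match has no overlap with the abbreviation "AMD"
amd_score = cosine(embed("AMD"), embed(descriptions["tool-2"]))
print(amd_score)  # 0.0
```

The last line shows why a learned embedding model is the more useful choice here: word overlap scores "AMD" at zero against "Advanced Micro Devices", whereas semantic search can relate the two.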
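Incorporating with an agent
We will use a typical ReAct agent graph (e.g., as used in the quickstart), with some modifications:
- We add a selected_tools key to the state, which stores our selected subset of tools;
- We set the entry point of the graph to be a select_tools node, which populates this element of the state;
- We bind the selected subset of tools to the chat model within the agent node.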
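The two state keys behave differently when a node returns an update: messages is annotated with the add_messages reducer, so updates are appended, while selected_tools has no reducer, so each update replaces the previous value. The following is a simplified plain-Python model of that behavior, written only as an illustration. The names append_messages, REDUCERS, and apply_update are hypothetical, and the real add_messages does more (for example, it handles message IDs and type coercion).

```python
from typing import Any, Callable, Dict


def append_messages(existing: list, new: list) -> list:
    """Simplified stand-in for a message reducer: concatenate old and new."""
    return existing + new


# Keys with a reducer are merged; keys without one are overwritten
REDUCERS: Dict[str, Callable[[Any, Any], Any]] = {"messages": append_messages}


def apply_update(state: dict, update: dict) -> dict:
    """Return a new state with a node's partial update applied."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        merged[key] = reducer(merged.get(key, []), value) if reducer else value
    return merged


state = {"messages": ["user: info about AMD?"], "selected_tools": []}
state = apply_update(state, {"selected_tools": ["id-1", "id-2"]})  # select_tools
state = apply_update(state, {"messages": ["ai: calling a tool"]})  # agent
state = apply_update(state, {"selected_tools": ["id-3"]})  # a later selection

print(state["messages"])  # ['user: info about AMD?', 'ai: calling a tool']
print(state["selected_tools"])  # ['id-3']
```

Overwriting is the behavior we want for selected_tools: each selection pass should define the full set of tools offered to the model, not accumulate on top of earlier passes.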
from typing import Annotated

from langchain_openai import ChatOpenAI
from typing_extensions import TypedDict

from langgraph.graph import StateGraph, START
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition


# Define the state structure using TypedDict.
# It includes a list of messages (processed by add_messages)
# and a list of selected tool IDs.
class State(TypedDict):
    messages: Annotated[list, add_messages]
    selected_tools: list[str]


builder = StateGraph(State)

# Retrieve all available tools from the tool registry.
tools = list(tool_registry.values())
llm = ChatOpenAI()


# The agent function processes the current state
# by binding selected tools to the LLM.
def agent(state: State):
    # Map tool IDs to actual tools
    # based on the state's selected_tools list.
    selected_tools = [tool_registry[id] for id in state["selected_tools"]]
    # Bind the selected tools to the LLM for the current interaction.
    llm_with_tools = llm.bind_tools(selected_tools)
    # Invoke the LLM with the current messages and return the new message,
    # which add_messages appends to the message list.
    return {"messages": [llm_with_tools.invoke(state["messages"])]}


# The select_tools function selects tools based on the user's last message content.
def select_tools(state: State):
    last_user_message = state["messages"][-1]
    query = last_user_message.content
    # similarity_search returns the 4 closest documents by default (k=4)
    tool_documents = vector_store.similarity_search(query)
    return {"selected_tools": [document.id for document in tool_documents]}


builder.add_node("agent", agent)
builder.add_node("select_tools", select_tools)

# The tool node can execute any tool in the registry, not only the selected subset
tool_node = ToolNode(tools=tools)
builder.add_node("tools", tool_node)

builder.add_conditional_edges("agent", tools_condition, path_map=["tools", "__end__"])
builder.add_edge("tools", "agent")
builder.add_edge("select_tools", "agent")
builder.add_edge(START, "select_tools")
graph = builder.compile()
API Reference: ChatOpenAI | StateGraph | START | add_messages | ToolNode | tools_condition
from IPython.display import Image, display

try:
    display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
    # This requires some extra dependencies and is optional
    pass
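user_input = "Can you give me some information about AMD in 2022?"

result = graph.invoke({"messages": [("user", user_input)]})

# Show which tool IDs were selected, then the full conversation
print(result["selected_tools"])
for message in result["messages"]:
    message.pretty_print()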
['ab9c0d59-3d16-448d-910c-73cf10a26020', 'f5eff8f6-7fb9-47b6-b54f-19872a52db84', '2962e168-9ef4-48dc-8b7c-9227e7956d39', '24a9fb82-19fe-4a88-944e-47bc4032e94a']
================================ Human Message =================================
Can you give me some information about AMD in 2022?
================================== Ai Message ==================================
Tool Calls:
  Advanced_Micro_Devices (call_CRxQ0oT7NY7lqf35DaRNTJ35)
  Call ID: call_CRxQ0oT7NY7lqf35DaRNTJ35
  Args:
    year: 2022
================================= Tool Message =================================
Name: Advanced_Micro_Devices
Advanced Micro Devices had revenues of $100 in 2022.
================================== Ai Message ==================================
In 2022, Advanced Micro Devices (AMD) had revenues of $100.
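Repeating tool selection
To manage errors caused by incorrect tool selection, we can revisit the select_tools node. One way to do this is to modify select_tools so that it generates the vector store query from all messages in the state (e.g., with a chat model), and to add an edge from tools to select_tools.
We implement this change below. For demonstration purposes, we add a hack_remove_tool_condition flag to the select_tools node, which removes the correct tool on the first iteration and thereby simulates an error in the initial tool selection. Note that on the second iteration the agent finishes the run, because it then has access to the correct tool.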
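The control flow of this loop can be previewed with plain functions before wiring it into a graph. The sketch below is a simulation under stated assumptions: REGISTRY, the "id-..." keys, and both stand-in functions are hypothetical, and the stand-in agent simply calls the wanted tool if it is offered and otherwise falls back to the first offered tool.

```python
# Hypothetical registry mapping tool IDs to tool names
REGISTRY = {
    "id-acn": "Accenture",
    "id-amd": "Advanced_Micro_Devices",
    "id-zts": "Zoetis",
}


def select_tools(iteration: int) -> list:
    """Stand-in for retrieval: on the first pass, drop the correct tool to simulate a miss."""
    candidates = ["id-acn", "id-amd"]
    if iteration == 0:
        return [t for t in candidates if REGISTRY[t] != "Advanced_Micro_Devices"]
    return candidates


def agent(selected: list, wanted: str) -> str:
    """Stand-in for the model: call the wanted tool if offered, else the first offered tool."""
    names = [REGISTRY[t] for t in selected]
    return wanted if wanted in names else names[0]


calls = []
for iteration in range(3):  # bound the number of selection passes
    selected = select_tools(iteration)
    called = agent(selected, wanted="Advanced_Micro_Devices")
    calls.append(called)
    if called == "Advanced_Micro_Devices":
        break  # the correct tool was available, so the run can finish

print(calls)  # ['Accenture', 'Advanced_Micro_Devices']
```

The first pass yields a call to the wrong tool because the right one was never offered; routing back through selection gives the agent a second chance with a corrected tool set.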
Using Pydantic with LangChain
This notebook uses Pydantic v2 BaseModel, which requires langchain-core >= 0.3. Using langchain-core < 0.3 will result in errors due to mixing of Pydantic v1 and v2 BaseModels.
from langchain_core.messages import HumanMessage, SystemMessage, ToolMessage
from langgraph.pregel.retry import RetryPolicy

from pydantic import BaseModel, Field


class QueryForTools(BaseModel):
    """Generate a query for additional tools."""

    query: str = Field(..., description="Query for additional tools.")


def select_tools(state: State):
    """Selects tools based on the last message in the conversation state.

    If the last message is from a human, directly uses the content of the message
    as the query. Otherwise, constructs a query using a system message and invokes
    the LLM to generate a retrieval query for additional tools.
    """
    last_message = state["messages"][-1]
    hack_remove_tool_condition = False  # Simulate an error in the first tool selection

    if isinstance(last_message, HumanMessage):
        query = last_message.content
        hack_remove_tool_condition = True  # Simulate wrong tool selection
    else:
        assert isinstance(last_message, ToolMessage)
        system = SystemMessage(
            "Given this conversation, generate a query for additional tools. "
            "The query should be a short string containing what type of information "
            "is needed. If no further information is needed, "
            "populate a blank string for the query."
        )
        input_messages = [system] + state["messages"]
        # Force a structured tool call so the query can be read from its arguments
        response = llm.bind_tools([QueryForTools], tool_choice=True).invoke(
            input_messages
        )
        query = response.tool_calls[0]["args"]["query"]

    # Search the tool vector store using the generated query
    tool_documents = vector_store.similarity_search(query)
    if hack_remove_tool_condition:
        # Simulate error by removing the correct tool from the selection
        selected_tools = [
            document.id
            for document in tool_documents
            if document.metadata["tool_name"] != "Advanced_Micro_Devices"
        ]
    else:
        selected_tools = [document.id for document in tool_documents]
    return {"selected_tools": selected_tools}


graph_builder = StateGraph(State)
graph_builder.add_node("agent", agent)
graph_builder.add_node("select_tools", select_tools, retry=RetryPolicy(max_attempts=3))

tool_node = ToolNode(tools=tools)
graph_builder.add_node("tools", tool_node)

graph_builder.add_conditional_edges(
    "agent",
    tools_condition,
)
# After each tool call, route back through tool selection instead of directly to the agent
graph_builder.add_edge("tools", "select_tools")
graph_builder.add_edge("select_tools", "agent")
graph_builder.add_edge(START, "select_tools")
graph = graph_builder.compile()
API Reference: HumanMessage | SystemMessage | ToolMessage
from IPython.display import Image, display

try:
    display(Image(graph.get_graph().draw_mermaid_png()))
except Exception:
    # This requires some extra dependencies and is optional
    pass
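user_input = "Can you give me some information about AMD in 2022?"

result = graph.invoke({"messages": [("user", user_input)]})

# Print the full conversation, including both tool calls
for message in result["messages"]:
    message.pretty_print()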
================================ Human Message =================================
Can you give me some information about AMD in 2022?
================================== Ai Message ==================================
Tool Calls:
  Accenture (call_qGmwFnENwwzHOYJXiCAaY5Mx)
  Call ID: call_qGmwFnENwwzHOYJXiCAaY5Mx
  Args:
    year: 2022
================================= Tool Message =================================
Name: Accenture
Accenture had revenues of $100 in 2022.
================================== Ai Message ==================================
Tool Calls:
  Advanced_Micro_Devices (call_u9e5UIJtiieXVYi7Y9GgyDpn)
  Call ID: call_u9e5UIJtiieXVYi7Y9GgyDpn
  Args:
    year: 2022
================================= Tool Message =================================
Name: Advanced_Micro_Devices
Advanced Micro Devices had revenues of $100 in 2022.
================================== Ai Message ==================================
In 2022, AMD had revenues of $100.
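Next steps
This guide provides a minimal implementation for dynamically selecting tools. There are many possible improvements and optimizations:
- Repeating tool selection: Here, we repeated tool selection by modifying the select_tools node. Another option is to equip the agent with a reselect_tools tool, allowing it to re-select tools at its own discretion.
- Optimizing tool selection: In general, the full range of retrieval solutions is available for tool selection. Additional options include:
  - Grouping tools and retrieving over groups;
  - Using a chat model to select tools or groups of tools.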
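As one illustration of the "group tools and retrieve over groups" idea, here is a minimal sketch that first picks a group and then returns every tool in it. Everything in it is an assumption made for the example: the group names, the keyword sets, the assignment of tools to groups, and the helper names select_group and select_tools_by_group. Keyword overlap stands in for whatever retrieval method you would use in practice.

```python
import re
from typing import Optional

# Hypothetical grouping of tool names, with hand-picked keywords per group
TOOL_GROUPS = {
    "semiconductors": {
        "keywords": {"chip", "chips", "semiconductor", "gpu", "cpu"},
        "tools": ["Advanced_Micro_Devices"],
    },
    "healthcare": {
        "keywords": {"health", "healthcare", "medical", "pharma"},
        "tools": ["Abbott", "Zimmer_Biomet", "Zoetis"],
    },
}


def select_group(query: str) -> Optional[str]:
    """Pick the group whose keywords overlap most with the query, or None if nothing matches."""
    tokens = set(re.findall(r"\w+", query.lower()))
    scores = {
        name: len(group["keywords"] & tokens) for name, group in TOOL_GROUPS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def select_tools_by_group(query: str) -> list:
    """Return all tool names in the best-matching group (empty list if no group matches)."""
    group = select_group(query)
    return list(TOOL_GROUPS[group]["tools"]) if group else []


chip_tools = select_tools_by_group("Which chip makers grew in 2022?")
no_tools = select_tools_by_group("What is the weather today?")
print(chip_tools)  # ['Advanced_Micro_Devices']
print(no_tools)  # []
```

Retrieving at the group level keeps the index small and tends to return related tools together, at the cost of sometimes binding tools the query does not need.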