Agent - News Center - Human Resources Outsourcing Management System - 智仁HRO

LangChain Agent: Creating Your Own Multi-action Agent

2024.03.10

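1. Recap

1.1 What is a LangChain Agent?

An autonomous agent is a system that makes decisions and takes actions without human intervention, but it can behave unpredictably. To address this, it is important that the agent knows what it can do (its actions). In LangChain's ReAct-style agents, the actions available to the agent are written into the prompt, and the agent decides and acts based on that prompt. This gives you a degree of control over the agent's behavior.

1.2 Task generation

User questions are not always simple; they are often complex and need to be split into several tasks. In the previous article, we explained how to pass the user's question to an LLM, interpret it, and convert it into specific tasks. During task generation, it is important that the agent knows which actions it can take while it generates the tasks.

2. Multi-action Agent

Once the tasks are generated, they (one or several) are handed to the agent. However, the commonly used LLMSingleActionAgent in LangChain cannot handle multiple tasks. You have to create your own Multi-action Agent that processes multiple tasks and combines the results into a single answer. The rest of this article explains how to create one with LangChain.

2.1 Creating a Multi-action Agent

Let's look at how to create a Multi-action Agent. This time, we build a custom agent based on LangChain's BaseMultiActionAgent class.

```python
from typing import List, Tuple, Any, Union, Dict, Optional

from langchain.chat_models import ChatOpenAI
from langchain.schema import AgentAction, AgentFinish
from langchain.agents import Tool, BaseMultiActionAgent, AgentOutputParser
from langchain.chains import LLMChain
from langchain.callbacks.manager import Callbacks

from TaskListCreation import TaskListCreation  # the task-generation class from the previous article


class LLMMultiActionAgent(BaseMultiActionAgent):
    """Base class for multi action agents using LLMChain."""

    llm_chain: LLMChain
    output_parser: AgentOutputParser
    stop: List[str]
    # Tool names the agent may call (passed as allowed_tools=... in section 2.4).
    allowed_tools: List[str] = []
    # Declared as a field so that it can be assigned in __init__.
    task_list_creation: Any = None

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.task_list_creation = TaskListCreation()

    @property
    def input_keys(self) -> List[str]:
        """Return the input keys."""
        return list(set(self.llm_chain.input_keys) - {"intermediate_steps"})

    def dict(self, **kwargs: Any) -> Dict:
        """Return dictionary representation of agent."""
        _dict = super().dict()
        del _dict["output_parser"]
        return _dict

    def plan(
        self,
        intermediate_steps: List[Tuple[AgentAction, str]],
        callbacks: Callbacks = None,
        **kwargs: Any
    ) -> Union[List[AgentAction], AgentFinish]:
        """Given input, decide what to do.

        Args:
            intermediate_steps: Steps taken to date, along with observations
            callbacks: Callbacks to run.
            **kwargs: User inputs.

        Returns:
            Action specifying what tool to use or finish action.
        """
        latest_user_message = kwargs.get("input", "")
        # Generate tasks only on the first pass of the loop.
        if len(intermediate_steps) == 0:
            task_list = self.task_list_creation.create_task_list(latest_user_message)
        else:
            task_list = []

        actions = []
        final_outputs = [obs for _, obs in intermediate_steps]
        kwargs.pop("input", None)

        for task in task_list:
            output = self.llm_chain.run(
                intermediate_steps=intermediate_steps, stop=self.stop, input=task, **kwargs
            )
            response = self.output_parser.parse(output)
            if isinstance(response, AgentAction):
                # Normalize the parsed tool name to one of the allowed tools.
                tool_name = response.tool
                for tool in self.allowed_tools:
                    if tool in response.tool:
                        tool_name = tool
                        break
                action = AgentAction(tool=tool_name, tool_input=response.tool_input, log=response.log)
                actions.append(action)
            elif not isinstance(response, AgentFinish):
                return AgentFinish(return_values={"output": "Unexpected response type."}, log="")

        if actions:
            return actions
        return AgentFinish(return_values={"output": final_outputs}, log="\n\nAll questions answered.")

    # aplan, the async counterpart of plan, is discussed in section 2.3.

    def tool_run_logging_kwargs(self) -> Dict:
        return {
            "llm_prefix": "",
            "observation_prefix": "" if len(self.stop) == 0 else self.stop[0],
        }
```

Focus on the plan method. It defines the actual execution flow of the agent, and its return value is either a list of AgentAction or an AgentFinish. Step by step:

1. Receive the user's question via kwargs.
2. When len(intermediate_steps) == 0, that is, on the first pass of the ReAct reasoning loop, generate the tasks to execute. Here the class that generates tasks is provisionally named TaskListCreation, and we call it.
3. For each generated task, run llm_chain and pass the output to output_parser to get a response.
4. Based on that response, build and return AgentAction or AgentFinish.

That is the whole flow. The response contains the series of reasoning results [Question/Thought/Action/Action Input]. A LangChain agent essentially keeps looping until it reaches a final answer, and intermediate_steps grows on every iteration. This time, however, I tried stopping the reasoning loop after the first round.

2.2 AgentExecutor

Briefly, the role of AgentExecutor is this: inside its _call method, _take_next_step is called, each AgentAction is executed, and the answer to the task is obtained as an Observation. A multi-action agent needs extra care because there are several tasks. If all generated tasks can be handled independently there is no problem, but what if the tasks depend on each other? For example, if the user asks "Compare the features of Company A's product X and Company B's product Y", the following tasks are generated:

1. Investigate the features of Company A's product X
2. Investigate the features of Company B's product Y
3. Compare the features of product X and product Y

If we treat these independently, what happens to the third task? You cannot compare product X and product Y without knowing anything about them. In other words, the third task only becomes meaningful after the answers to the first two tasks have been received. AgentExecutor was not originally designed with this in mind, so in such cases the answers have to be passed correctly from one task to the next. You need to add this kind of "memory handover" to your AgentExecutor. For example, you could handle it as follows:

```python
# Fragment of a custom AgentExecutor; self.local_memory is the author's memory attribute.
for agent_action in actions:
    current_memory = self.local_memory.load_memory_variables({})
    if current_memory['history']:
        memory_msg = (
            "When executing the Action, refer to the conversation history below.\n"
            "Conversation history:\n" + current_memory['history']
        )
        agent_action.tool_input += memory_msg
```

2.3 The aplan method

If the tasks do not depend on each other, we recommend processing them in parallel to speed things up. In that case you need to define aplan on the LLMMultiActionAgent introduced above. For example, it could contain a snippet like this:

```python
import asyncio

# The lines below belong inside the body of
# `async def aplan(self, intermediate_steps, callbacks=None, **kwargs)`,
# after task_list has been built and "input" popped from kwargs as in plan().
async def run_task(task):
    output = await self.llm_chain.arun(
        intermediate_steps=intermediate_steps, stop=self.stop, input=task, **kwargs
    )
    return self.output_parser.parse(output)

tasks = [run_task(task) for task in task_list]
results = await asyncio.gather(*tasks)
```

Use arun instead of run to obtain the reasoning results for all tasks, and use asyncio.gather to run the asynchronous tasks concurrently. With this kind of parallel processing, the total latency approaches that of the slowest single task rather than the sum of all tasks, so even several tasks can be answered almost as quickly as one.

2.4 Agent instance

Finally, define an instance of the agent as shown below and run it with the run method. When agent-specific parsing errors occur, I think it is best to handle them appropriately with try/except.

```python
llm_chain = LLMChain(llm=self.LLM, prompt=self.prompt_agent)
tool_names = [tool.name for tool in tools]
agent = LLMMultiActionAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names
)
self.agent_executor = CustomAgentExecutor.from_agent_and_tools(
    agent=agent,
    tools=tools,
    verbose=True,
    memory=self.memory,
    handle_parsing_errors=False
)
result = self.agent_executor.run(query)
```

3. Summary

In this article, we explained in detail how to create a Multi-action Agent with LangChain. By developing your own agent, you can interpret complex user queries, break them down into appropriate tasks, and process them efficiently.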
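The dependency problem from section 2.2 and the parallel execution from section 2.3 can be combined: run the independent tasks concurrently, then feed their answers into the task that depends on them. The following is a minimal, self-contained sketch of that idea and not the author's implementation. `fake_tool`, `run_tasks`, and the split into `independent` and `dependent` lists are hypothetical names introduced for illustration; in a real agent the stub would be replaced by an asynchronous LLM or tool call.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fake_tool(task: str) -> str:
    """Hypothetical stand-in for an async LLM/tool call."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"result of [{task}]"


async def run_tasks(
    independent: List[str],
    dependent: List[str],
    call: Callable[[str], Awaitable[str]] = fake_tool,
) -> List[str]:
    """Run independent tasks concurrently, then dependent tasks in order."""
    # Independent tasks are awaited together; gather preserves input order.
    results = list(await asyncio.gather(*(call(t) for t in independent)))
    for task in dependent:
        # "Memory handover": append all earlier answers to the task input.
        history = "\n".join(results)
        results.append(await call(f"{task}\nHistory:\n{history}"))
    return results


if __name__ == "__main__":
    answers = asyncio.run(run_tasks(
        ["Investigate the features of Company A's product X",
         "Investigate the features of Company B's product Y"],
        ["Compare the features of product X and product Y"],
    ))
    for answer in answers:
        print(answer)
```

Because `asyncio.gather` returns results in the order of its inputs, the history handed to the dependent task is deterministic even when the independent tasks finish in a different order.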

Tags: LLM, ReAct, 智仁HRO, LangChain Agent

LangChain Agent: Generating Tasks from a Query

2024.01.06

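1. Introduction

Autonomous agents have evolved significantly in recent years. An autonomous agent is a system that makes decisions and takes actions independently, without human intervention. On the other hand, because such agents operate on complex algorithms, their behavior is hard to predict, which can lead to problems such as failing to behave as intended. So what causes an agent to malfunction? One reason is that the agent itself does not know what it can do. In a LangChain Agent, which follows ReAct-style reasoning, the actions the agent can take (in other words, the tools it can use) are written into the prompt in advance, and the agent decides and acts based on that prompt. By making sure the agent "knows in advance what it can do", you can control its behavior to a certain extent. In this article we discuss how to generate tasks from a query.

2. Generating tasks from a query

For an agent, "knowing in advance what it can do" is important. The agent itself generates tasks sequentially as part of the ReAct reasoning process, but preparing a separate LLM for task generation, in addition to the LLM used by the agent, is very useful for controlling the agent's behavior. For example, suppose you are asked: "Compare the features of Company A's product X and Company B's product Y." If the available tools offer a way to research these products (for example by searching the web or using a dedicated database), the question can be decomposed into the following three tasks:

1. Investigate the features of Company A's product X
2. Investigate the features of Company B's product Y
3. Compare the features of product X and product Y

However, if the LLM does not know which actions it can take (or which tools are available), it may create tasks that cannot be carried out, such as "visit Company A and ask the person in charge" or "call the person in charge at Company A". Writing a good prompt is the key, but how do you actually write one? Here is an example prompt.

```python
TEMPLATE_CREATE_TASK_LIST = """For the user question "{user_question}", generate a task list according to the following rules:
1. If the question can be clearly decomposed into multiple independent topics or elements, split it into separate questions.
2. If the question contains unclear referring words such as "those" or "that", infer what they refer to from the conversation history.
3. If the question was decomposed, return the decomposed questions as a task list.
4. If the question cannot be decomposed, return the question itself in task-list form.

Conversation history:
{history}

You can take four actions to carry out the tasks:
1. Retrieval_A: used to answer questions about Company A's products
2. Retrieval_B: used to answer questions about Company B's products
3. Comparator: used to compare different products
4. Calculator: used for numerical calculations such as costs

Here are some examples:
Example 1: Tell me about product X and product Y.
→
1. Learn about product X
2. Learn about product Y
Example 2: Estimate the price of product X if I buy 1000 units.
→
1. Find out the price of product X per unit.
2. Calculate the price of buying 1000 units
"""
```

3. Execution

3.1 Setting up the LLM and API key

Let's try running it. First, set the API key and the LLM. This time we also define a ConversationBufferMemory, on the assumption that tasks are generated with the previous questions taken into account.

```python
import os
import openai
from langchain.memory import ConversationBufferMemory
from langchain.chat_models import ChatOpenAI

openai.api_key = os.environ['OPENAI_API_KEY']
llm = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)
memory = ConversationBufferMemory()
```

3.2 Generating tasks

Next, let's look at create_task_list, which actually generates the tasks. The function uses the LLM to parse the user's question and turn it into concrete tasks. The final result is the generated task list.

```python
from langchain.schema import AIMessage, HumanMessage, SystemMessage
from typing import List
import re


def create_task_list(user_question: str) -> List[str]:
    """Use LLM to decompose a question and convert into tasks.

    Args:
        user_question: The user's question.

    Returns:
        A list of tasks derived from the user's question.
    """
    content_message = TEMPLATE_CREATE_TASK_LIST.format(
        user_question=user_question,
        history=memory.load_memory_variables({})['history'],
    )
    output = llm([HumanMessage(content=content_message)])
    all_responses = output.content.splitlines()

    task_list = []
    for line in all_responses:
        # Keep only lines that look like numbered list items ("1. ...").
        if re.match(r"^\d+\.", line):
            # Drop the "N." prefix; strip() handles an optional space after the dot.
            task = line[line.index('.') + 1:].strip()
            if task != user_question:
                task_list.append(task)

    # Fall back to the original question if nothing was extracted.
    if len(task_list) == 0:
        task_list.append(user_question)
    return task_list
```

The TEMPLATE_CREATE_TASK_LIST template introduced above is filled in and sent to the LLM. The regular expression re.match(r"^\d+\.", line) checks whether each line of the output has the format of a numbered list item. If it does, the text of that line is added to the task list (a line that merely repeats the original question is skipped). If the task list ends up empty, the original question is returned as the only task.

4. Trying out task generation

Let's actually run the function above. This time the question is: "Compare the costs of Company A's product X and Company B's product Y, and generate a quote for buying 1000 units of the cheaper product and 100 units of the more expensive one."

```python
def main():
    question = (
        "Compare the costs of Company A's product X and Company B's product Y, "
        "and generate a quote for buying 1000 units of the cheaper product "
        "and 100 units of the more expensive one."
    )
    task_list = create_task_list(question)
    print(task_list)


if __name__ == "__main__":
    main()
```

The result is shown below.

```
['Retrieval_A: look up Company A's product X',
 'Retrieval_B: look up Company B's product Y',
 'Comparator: compare the prices of product X and product Y',
 'Calculator: calculate the quote for buying 1000 units of the cheaper product',
 'Calculator: calculate the quote for buying 100 units of the more expensive product']
```

During task generation the LLM "understood well what it can do", so it told us clearly which tool to use for which task. Multiple tasks like these are executed in order by the agent, but they cannot be handled by the commonly used LLMSingleActionAgent. You need to develop your own agent, such as an LLMMultiActionAgent. Next time, we will explain how to hand the generated tasks to such a Multi-action Agent.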

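The parsing step of create_task_list can be exercised without calling an LLM. The sketch below isolates that step under the assumption that the model replies with a numbered list; `parse_task_list`, the regular expression, and the sample reply are hypothetical and only illustrate the technique. Unlike the article's function, it also accepts "1)" style numbering.

```python
import re
from typing import List

# Matches "1. task", "2.task", and "3) task"; the task text is captured in group 1.
_NUMBERED = re.compile(r"^\s*\d+[.)]\s*(.+?)\s*$")


def parse_task_list(llm_output: str, user_question: str) -> List[str]:
    """Extract numbered list items; fall back to the original question."""
    tasks = []
    for line in llm_output.splitlines():
        match = _NUMBERED.match(line)
        # Skip lines that are not list items or that just repeat the question.
        if match and match.group(1) != user_question:
            tasks.append(match.group(1))
    return tasks or [user_question]


if __name__ == "__main__":
    reply = "Here are the tasks:\n1. Learn about product X\n2.Learn about product Y\n"
    print(parse_task_list(reply, "Tell me about product X and product Y."))
```

Keeping the parser separate from the LLM call makes it straightforward to unit-test the fallback behavior, which matters because model output does not always follow the requested format.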
Tags: LLM, ReAct, 智仁HRO, LangChain Agent

Implementing ReAct-Style Reasoning with LangChain Agent

2023.10.27

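1. Introduction

This time we explain LangChain Agent and its basic operation. LangChain's Agent implements ReAct-style reasoning, so I will start by explaining ReAct and then look at actual code.

2. About ReAct

2.1 What is ReAct?

Before explaining ReAct, let's take a moment to think about how humans think. When we think, we skillfully combine verbal reasoning with action, and we have an innate ability to self-regulate, strategize, and remember. For example, while cooking in the kitchen you might think, "I've cut all the vegetables, so next I'll boil the water," or change your plan: "I don't have salt, so I'll use soy sauce and pepper instead." You might also ask yourself, "What can I do right now?" and act to find the answer. By using "acting" and "thinking" together in this way, humans can learn new things quickly and make decisions and reason in unfamiliar situations.

What about LLMs (large language models)? With recent advances, LLMs are increasingly able to reason in a human-like way and make complex decisions. One approach that has drawn particular attention is ReAct. ReAct is a method that combines "reasoning" and "acting", aiming to let an LLM think and act the way a human does. The ReAct framework lets the model think, plan, and adjust while incorporating information from its environment, which makes it more flexible and adaptive in its decisions. Inspired by the way humans integrate thought and action, ReAct is expected to help large language models solve difficult tasks.

2.2 The prompt that implements ReAct

Now let's see how the prompt template that implements ReAct-style reasoning is written.

```python
PREFIX = """Answer the following questions as best you can. You have access to the following tools:"""
FORMAT_INSTRUCTIONS = """Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question"""
SUFFIX = """Begin!

Question: {input}
Thought:{agent_scratchpad}"""
```

The agent goes through Question, Thought, Action, Action Input, Observation, and Thought, and finally outputs the Final Answer. In some cases the Thought/Action/Action Input/Observation cycle is repeated several times. This is the basic structure of LangChain's Agent prompt template. Starting from this form, you can freely customize it to create your own Agent.

3. About Agent

3.1 How to use an Agent

The simplest way to use a LangChain Agent is to call initialize_agent. Its return value is an AgentExecutor, so you can get an answer without constructing an AgentExecutor yourself.

```python
from langchain.agents import initialize_agent

# tools and LLM are assumed to be defined beforehand.
chat_agent = initialize_agent(
    tools,
    llm=LLM,
    agent="zero-shot-react-description",
    verbose=True,
    system_message="You are a kind assistant. Please answer in Chinese!",
)
question = 'Please tell me about Vegeta'
result = chat_agent.run(question)
print(result)
```

Here, the agent argument specifies the agent type; several standard agent types are available. Examples: ZERO_SHOT_REACT_DESCRIPTION, CONVERSATIONAL_REACT_DESCRIPTION, OPENAI_MULTI_FUNCTIONS, ...

3.2 Customizing the Agent

When you actually develop applications such as chatbots, there are always detailed requirements, so in practice some customization is needed. The parts that typically need customizing are PromptTemplate, OutputParser, Agent, and AgentExecutor. Below is sample code showing how the custom modules are called.

```python
# Custom PromptTemplate
self.prompt_agent = CustomPromptTemplate(
    template=TEMPLATE_AGENT,
    tools=tools,
    input_variables=["input", "intermediate_steps", "history"]
)

# Custom Output Parser
output_parser = CustomOutputParser()

# Custom Agent
llm_chain = LLMChain(llm=self.LLM, prompt=self.prompt_agent)
tool_names = [tool.name for tool in tools]
agent = CustomAgent(
    llm_chain=llm_chain,
    output_parser=output_parser,
    stop=["\nObservation:"],
    allowed_tools=tool_names
)

# Custom AgentExecutor
self.agent_executor = CustomAgentExecutor.from_agent_and_tools(
    agent=agent,
    tools=tools,
    verbose=True,
    memory=self.memory,
    handle_parsing_errors=False
)
```

3.3 LLMSingleActionAgent

Next, let's use LLMSingleActionAgent to understand what a custom Agent does.

```python
class LLMSingleActionAgent(BaseSingleActionAgent):
    """Base class for single action agents."""

    llm_chain: LLMChain
    """LLMChain to use for agent."""
    output_parser: AgentOutputParser
    """Output parser to use for agent."""
    stop: List[str]
    """List of strings to stop on."""

    def plan(
        self,
        intermediate_steps: List[Tuple[AgentAction, str]],
        callbacks: Callbacks = None,
        **kwargs: Any,
    ) -> Union[AgentAction, AgentFinish]:
        """Given input, decided what to do.

        Args:
            intermediate_steps: Steps the LLM has taken to date,
                along with the observations.
            callbacks: Callbacks to run.
            **kwargs: User inputs.

        Returns:
            Action specifying what tool to use.
        """
        output = self.llm_chain.run(
            intermediate_steps=intermediate_steps,
            stop=self.stop,
            callbacks=callbacks,
            **kwargs,
        )
        return self.output_parser.parse(output)
```

The part to focus on is the plan method. Its return value is either AgentAction or AgentFinish. LLMSingleActionAgent calls LLMChain internally and decides the actual action while reasoning in the ReAct style. If there is an action to take, the LLMChain output is turned into an AgentAction by output_parser; if the reasoning is complete, an AgentFinish is returned. This is the main role of the Agent, and you can customize it freely from here. For example, to make it multi-action you can change it to return several actions, or you can make the LLMChain run only under certain conditions.

3.4 About AgentExecutor

The role of AgentExecutor is to actually execute the actions and obtain the answers. As described in the template example above, it repeats "Thought", "Action", "Action Input", and "Observation" as many times as needed.

```python
class AgentExecutor(Chain):
    """Agent that is using tools."""

    def _take_next_step(
        self,
        name_to_tool_map: Dict[str, BaseTool],
        color_mapping: Dict[str, str],
        inputs: Dict[str, str],
        intermediate_steps: List[Tuple[AgentAction, str]],
        run_manager: Optional[CallbackManagerForChainRun] = None,
    ) -> Union[AgentFinish, List[Tuple[AgentAction, str]]]:
        """Take a single step in the thought-action-observation loop.

        Override this to take control of how the agent makes and acts on choices.
        """
        # ... (abridged: the agent's plan output becomes `actions`, and
        # `tool`, `color`, and `tool_run_kwargs` are looked up for each action)
        for agent_action in actions:
            # We then call the tool on the tool input to get an observation
            observation = tool.run(
                agent_action.tool_input,
                verbose=self.verbose,
                color=color,
                callbacks=run_manager.get_child() if run_manager else None,
                **tool_run_kwargs,
            )
            result.append((agent_action, observation))
        return result
```

The for agent_action in actions: part is a loop over actions (in other words, AgentExecutor itself supports multiple actions), and in the observation = tool.run(...) part the selected tool is actually executed as an action. You can see that this is where the Observation (the answer) is obtained.

4. Summary

This time we introduced the concept of ReAct and explained how the ReAct framework is implemented in LangChain Agent.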

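To make the Thought/Action/Action Input/Observation cycle concrete, the following sketch runs the loop with a scripted stand-in for the LLM and a toy tool. It is a simplified illustration under stated assumptions, not LangChain's implementation: `scripted_llm`, `TOOLS`, `react_loop`, and the parsing regular expression are all hypothetical names introduced here.

```python
import re
from typing import Callable, Dict

# Extracts the tool name and its input from the model's reply.
ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)", re.DOTALL)


def scripted_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: replies depend on the scratchpad."""
    if "Observation:" not in prompt:
        return "I should compute this.\nAction: Calculator\nAction Input: 1000 * 3"
    return "I now know the final answer\nFinal Answer: 3000"


TOOLS: Dict[str, Callable[[str], str]] = {
    # Toy calculator: handles only "a * b" with integers.
    "Calculator": lambda s: str(int(s.split("*")[0]) * int(s.split("*")[1])),
}


def react_loop(question: str, llm=scripted_llm, tools=TOOLS, max_steps: int = 5) -> str:
    scratchpad = ""  # accumulated Thought/Action/Observation text
    for _ in range(max_steps):
        output = llm(f"Question: {question}\nThought:{scratchpad}")
        if "Final Answer:" in output:
            # Corresponds to AgentFinish: stop and return the answer.
            return output.split("Final Answer:")[-1].strip()
        match = ACTION_RE.search(output)
        if match is None:
            raise ValueError(f"Could not parse LLM output: {output!r}")
        # Corresponds to AgentAction: run the chosen tool to get an observation.
        tool_name, tool_input = match.group(1), match.group(2).strip()
        observation = tools[tool_name](tool_input)
        scratchpad += f" {output}\nObservation: {observation}\nThought:"
    raise RuntimeError("No final answer within max_steps")


if __name__ == "__main__":
    print(react_loop("What do 1000 units cost at 3 each?"))
```

The division of labor mirrors the one described above: deciding the next step is the agent's job (the `llm` call plus parsing), while executing the tool and feeding the observation back is the executor's job (the loop body).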
Tags: LLM, ReAct, 智仁HRO, LangChain Agent