

A Brief Discussion of Tool-Augmented Large Language Model Agents
©️ Copyright 2023 @ Authors
Author:
wangz@dp.tech 📨
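Date: 2023-07-30
License: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Quick start: click the Start Connection (开始连接) button above, select the bohrium-notebook image and any GPU node configuration, and after a short wait the notebook is ready to run.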
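The progress of large language models (LLMs), represented by ChatGPT and GPT-4, has opened up new possibilities for artificial intelligence. New examples appear every day showing how capable and knowledgeable LLMs are, and that they already have some logical reasoning ability on many problems. At the same time, other examples expose their problems, of which hallucination is the best known.
Although LLMs have an enormous number of parameters and are trained on huge datasets, it is reasonable to assume that simply scaling up an LLM still leaves many things hard to do. For example, current LLMs are almost all static: knowledge that appears after training ends cannot currently be stored in the model's weights. In addition, an LLM is a natural language model, and on domain-specific problems it often struggles to beat a tool (Tool) designed specifically for that domain.
One approach that works well today combines the logical reasoning ability of the LLM with the specialized abilities of various Tools. The LLM breaks a complex task into a series of subtasks, each simple enough to be handled by a specific Tool. The Tool's result is returned to the LLM in a structured form, and the LLM adjusts its reasoning according to that result until the problem is solved. A system that solves problems this way is called a large language model agent (Agent). Well-known examples include AutoGPT and BabyAGI. Beyond combining Tools, an Agent usually includes further design elements such as prompt engineering and long-term memory.
To keep things simple, this article picks a numerical problem-solving scenario and compares GPT-3.5, GPT-4, and a tool-augmented GPT-3.5 Agent. The implementation uses langchain.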
Install the required dependencies
pip output (condensed): installed from https://pypi.tuna.tsinghua.edu.cn/simple into the Python 3.8 conda environment. Newly installed: langchain 0.0.247, wolframalpha 5.0.0, openai 0.27.8, dataclasses-json 0.5.13, jaraco.context 4.3.0, langsmith 0.0.15, openapi-schema-pydantic 1.2.4, xmltodict 0.13.0. Already present: transformers 4.27.1.
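Problem definition and LLM baseline
We choose a definite integral problem and additionally ask for the square root of the integral's value: Please integrate f(x) = x^2+sin(x)+1 from 0 to 1 and get the sqrt of integration result.
We use gpt-3.5 as the large model for comparison. For tool calling, gpt-3.5 and gpt-4 both give good results. Readers can replace the corresponding keys below with their own.
In langchain, the behavior of a language model is defined through an LLMChain object. To improve the model's output, we also design a prompt that tells the model in advance which area the upcoming question relates to.
Since this is a math problem, we tell the LLM that it is a math expert and that it can think through the problem step by step.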
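The baseline cell itself is not preserved in this text, so the following is only a sketch of what such a chain might look like. The template wording, the model name `gpt-3.5-turbo`, and `temperature=0` are assumptions; only the "math expert" and "step by step" intent comes from the description above. The langchain imports are kept inside `run_baseline` so that the prompt can be inspected without an API key.

```python
# Sketch of the baseline: a math-expert prompt wrapped in an LLMChain.
# The template wording below is an assumption, not the author's prompt.
MATH_TEMPLATE = (
    "You are a math expert. Think through the problem step by step.\n"
    "Question: {question}\n"
    "Answer:"
)

QUESTION = (
    "Please integrate f(x) = x^2+sin(x)+1 from 0 to 1 "
    "and get the sqrt of integration result."
)


def render_prompt(question: str) -> str:
    """Fill the template locally to inspect what the model would receive."""
    return MATH_TEMPLATE.format(question=question)


def run_baseline(question: str, model_name: str = "gpt-3.5-turbo") -> str:
    """Send the question to a chat model through an LLMChain.

    Needs langchain and openai installed and OPENAI_API_KEY set in the
    environment. Imports are local so the rest of the sketch runs offline.
    """
    from langchain.chains import LLMChain
    from langchain.chat_models import ChatOpenAI
    from langchain.prompts import PromptTemplate

    prompt = PromptTemplate(template=MATH_TEMPLATE, input_variables=["question"])
    llm = ChatOpenAI(model_name=model_name, temperature=0)
    chain = LLMChain(llm=llm, prompt=prompt)
    return chain.run(question)


print(render_prompt(QUESTION))
```

With a valid key, `run_baseline(QUESTION)` and `run_baseline(QUESTION, model_name="gpt-4")` would give the two baselines compared in this section.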
"Sure, I can help you with that!\n\nStep 1: Find the antiderivative of f(x)\nTo integrate f(x) = x^2+sin(x)+1, we need to find its antiderivative. The antiderivative of x^2 is (1/3)x^3, the antiderivative of sin(x) is -cos(x), and the antiderivative of 1 is simply x. Therefore, the antiderivative of f(x) is:\n\nF(x) = (1/3)x^3 - cos(x) + x\n\nStep 2: Evaluate the definite integral\nNow that we have the antiderivative, we can evaluate the definite integral from 0 to 1 by plugging in the limits of integration:\n\n∫[0,1] f(x) dx = F(1) - F(0)\n= [(1/3)(1)^3 - cos(1) + 1] - [(1/3)(0)^3 - cos(0) + 0]\n= (1/3) - cos(1) + 1\n\nStep 3: Take the square root of the integration result\nTo get the square root of the integration result, we simply take the square root of the expression we just found:\n\nsqrt[(1/3) - cos(1) + 1] \n\nAnd that's it! We have integrated f(x) from 0 to 1 and found the square root of the integration result."
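From the result above it is easy to see that GPT-3.5's direct reasoning computes the integral incorrectly: when evaluating F(1) - F(0) it drops the -cos(0) = -1 term of F(0), so it obtains 4/3 - cos(1) instead of 7/3 - cos(1). It also never evaluates the square root numerically. We can try GPT-4 as well.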
'To integrate the function f(x) = x^2 + sin(x) + 1 from 0 to 1, we first need to find the antiderivative of the function. \n\nThe antiderivative of x^2 is (1/3)x^3, the antiderivative of sin(x) is -cos(x), and the antiderivative of 1 is x. So, the antiderivative of f(x) is F(x) = (1/3)x^3 - cos(x) + x.\n\nNow, we need to evaluate F(x) at the limits of integration, 0 and 1:\n\nF(1) = (1/3)(1)^3 - cos(1) + 1\nF(0) = (1/3)(0)^3 - cos(0) + 0 = -1\n\nNow, subtract F(0) from F(1) to find the definite integral:\n\nIntegral = F(1) - F(0) = [(1/3) - cos(1) + 1] - (-1) = (1/3) - cos(1) + 2\n\nNow, we need to find the square root of the integration result:\n\nsqrt(Integral) = sqrt((1/3) - cos(1) + 2)\n\nThis expression cannot be simplified further without using a calculator to find an approximate value. Using a calculator, we get:\n\nsqrt(Integral) ≈ 1.395'
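GPT-4 gets the definite integral right. For the square root, GPT-4 appears to simulate a calculator and reports 1.395, whereas sqrt(7/3 - cos(1)) is about 1.3390, so the error is fairly large (roughly 4%).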
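The two model answers can be checked independently with the standard library. The sketch below evaluates the antiderivative F(x) = x^3/3 - cos(x) + x, cross-checks it with a composite Simpson rule (a choice made here for illustration, not something the original notebook uses), and compares the result with the values printed by GPT-3.5 and GPT-4.

```python
import math


def f(x: float) -> float:
    """Integrand f(x) = x^2 + sin(x) + 1."""
    return x * x + math.sin(x) + 1.0


def antiderivative(x: float) -> float:
    """F(x) = x^3/3 - cos(x) + x."""
    return x ** 3 / 3.0 - math.cos(x) + x


def simpson(func, a: float, b: float, n: int = 1000) -> float:
    """Composite Simpson's rule with an even number of subintervals n."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


exact = antiderivative(1.0) - antiderivative(0.0)  # 7/3 - cos(1)
numeric = simpson(f, 0.0, 1.0)                     # independent cross-check
root = math.sqrt(exact)

gpt35_integral = 1.0 / 3.0 - math.cos(1.0) + 1.0   # expression given by GPT-3.5
gpt4_root = 1.395                                  # value reported by GPT-4

print(f"exact integral    = {exact:.6f}")
print(f"Simpson estimate  = {numeric:.6f}")
print(f"sqrt of integral  = {root:.6f}")
print(f"GPT-3.5 integral off by {exact - gpt35_integral:.6f}")
print(f"GPT-4 sqrt relative error = {abs(gpt4_root - root) / root:.2%}")
```

The GPT-3.5 expression differs from the exact integral by exactly the missing cos(0) = 1 term.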
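Tool definition and integration
We again use GPT-3.5, this time integrating Tools with it to build our agent. For numerical problems such as definite integrals we use Wolfram Alpha, which is built on Mathematica and can handle fairly complex numerical computation. Individuals can apply for a trial version of Wolfram Alpha.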
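A minimal sketch of calling Wolfram Alpha through langchain is given below, assuming the `wolframalpha` package is installed and an App ID is available in the `WOLFRAM_ALPHA_APPID` environment variable. The helper name `query_wolfram` is introduced here for illustration; the live query is skipped when no App ID is set.

```python
import os


def query_wolfram(query: str) -> str:
    """Send a math query to Wolfram Alpha through langchain's wrapper.

    Requires the wolframalpha package and WOLFRAM_ALPHA_APPID in the
    environment; the import is local so this file loads without langchain.
    """
    if not os.environ.get("WOLFRAM_ALPHA_APPID"):
        raise RuntimeError("Set WOLFRAM_ALPHA_APPID before calling Wolfram Alpha.")
    from langchain.utilities.wolfram_alpha import WolframAlphaAPIWrapper

    return WolframAlphaAPIWrapper().run(query)


if os.environ.get("WOLFRAM_ALPHA_APPID"):
    print(query_wolfram("integrate x^2+sin(x)+1 from 0 to 1"))
else:
    print("WOLFRAM_ALPHA_APPID is not set; skipping the live query.")
```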
'Assumption: integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930 \nAnswer: integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930'
Although taking a square root is trivial for Wolfram Alpha, we deliberately hand-write a function for it here, in order to show how to define a custom Tool in langchain.
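The author's function is not preserved in this text; the following is one possible version. It assumes the tool receives its input as a string (as agent tools do), accepts both decimals and simple fractions such as "1/4", and rounds to five decimals to match the values seen in the agent traces. Parsing with `fractions.Fraction` is a choice made here to avoid `eval`.

```python
import math
from fractions import Fraction


def find_sqrt(value: str) -> str:
    """Return the square root of a real number given as text, to 5 decimals.

    Accepts decimals ("1.7930") and simple fractions ("1/4").
    """
    number = float(Fraction(value.strip()))
    if number < 0:
        return "Error: input must be a non-negative real number."
    return str(round(math.sqrt(number), 5))


print(find_sqrt("1.7930"))  # 1.33903
print(find_sqrt("1/4"))     # 0.5
```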
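In langchain, defining a tool mainly means specifying three parts: the tool's name, its entry point (the function to call), and a description of what it does. Take particular care that the description states both what the tool does and what input it expects. The tools are then collected into a list.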
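As a sketch of those three parts, the snippet below keeps each tool as a plain dictionary and converts the list to langchain `Tool` objects on demand. The names and descriptions are the ones that appear in the agent prompt in this section; the helper names `tool_specs` and `to_langchain_tools` and the stub callables are illustrative.

```python
from typing import Callable, Dict, List


def tool_specs(wolfram_run: Callable[[str], str],
               sqrt_func: Callable[[str], str]) -> List[Dict[str, object]]:
    """Describe each tool by name, entry point, and description."""
    return [
        {
            "name": "wolfram cal",
            "func": wolfram_run,
            "description": (
                "Useful for when you need to answer questions about Math and "
                "Science. You should input some algebra, analysis, and "
                "function optimization problems."
            ),
        },
        {
            "name": "find_sqrt",
            "func": sqrt_func,
            "description": (
                "Useful for when you need to get the sqrt of some value. "
                "Input should be a real number"
            ),
        },
    ]


def to_langchain_tools(specs: List[Dict[str, object]]) -> list:
    """Convert the specs to langchain Tool objects (requires langchain)."""
    from langchain.agents import Tool

    return [
        Tool(name=s["name"], func=s["func"], description=s["description"])
        for s in specs
    ]


# Stub callables keep the example runnable without network access.
specs = tool_specs(lambda query: "stub", lambda value: "stub")
for spec in specs:
    print(f"{spec['name']}: {spec['description']}")
```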
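With the tools defined, the next step is to tell the LLM how to use them. langchain does this through an agent object.
We can look at how this agent's prompt is constructed, which essentially reveals the logic by which the agent is built.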
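The sketch below shows one way the agent might be created and its prompt inspected, assuming langchain 0.0.x with `initialize_agent` and an OpenAI key in the environment. Because that part needs network access, the snippet also assembles the same prompt offline from its prefix, tool list, format instructions, and suffix, using the text of the template printed in this section. `render_react_prompt`, `build_agent`, and `prompt_of` are illustrative names.

```python
from typing import List, Tuple

# Prompt pieces copied from the template printed in this section.
PREFIX = (
    "Answer the following questions as best you can. "
    "You have access to the following tools:"
)
FORMAT_INSTRUCTIONS = (
    "Use the following format:\n\n"
    "Question: the input question you must answer\n"
    "Thought: you should always think about what to do\n"
    "Action: the action to take, should be one of [{tool_names}]\n"
    "Action Input: the input to the action\n"
    "Observation: the result of the action\n"
    "... (this Thought/Action/Action Input/Observation can repeat N times)\n"
    "Thought: I now know the final answer\n"
    "Final Answer: the final answer to the original input question"
)
SUFFIX = "Begin!\n\nQuestion: {input}\nThought:{agent_scratchpad}"

TOOL_DESCRIPTIONS: List[Tuple[str, str]] = [
    ("wolfram cal",
     "Useful for when you need to answer questions about Math and Science. "
     "You should input some algebra, analysis, and function optimization problems."),
    ("find_sqrt",
     "Useful for when you need to get the sqrt of some value. "
     "Input should be a real number"),
]


def render_react_prompt(tools: List[Tuple[str, str]]) -> str:
    """Assemble prefix, tool list, format instructions, and suffix."""
    tool_lines = "\n".join(f"{name}: {description}" for name, description in tools)
    tool_names = ", ".join(name for name, _ in tools)
    instructions = FORMAT_INSTRUCTIONS.format(tool_names=tool_names)
    return "\n\n".join([PREFIX, tool_lines, instructions, SUFFIX])


def build_agent(tools: list, model_name: str = "gpt-3.5-turbo"):
    """Create a zero-shot ReAct agent (requires langchain and OPENAI_API_KEY)."""
    from langchain.agents import AgentType, initialize_agent
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(model_name=model_name, temperature=0)
    return initialize_agent(
        tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
    )


def prompt_of(agent_executor) -> str:
    """Return the prompt template of an agent built by build_agent."""
    return agent_executor.agent.llm_chain.prompt.template


print(render_react_prompt(TOOL_DESCRIPTIONS))
```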
'Answer the following questions as best you can. You have access to the following tools:\n\nwolfram cal: Useful for when you need to answer questions about Math and Science. You should input some algebra, analysis, and function optimization problems.\nfind_sqrt: Useful for when you need to get the sqrt of some value. Input should be a real number\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [wolfram cal, find_sqrt]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: {input}\nThought:{agent_scratchpad}'
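The prompt above is typical of the ZERO_SHOT_REACT_DESCRIPTION agent type, so let us walk through it briefly. First, the prompt asks the LLM to solve the problem by calling two tools, wolfram cal and find_sqrt. For a call such as agent.run("Please integrate f(x) = x^2+sin(x)+1 from 0 to 1 and get the sqrt of integration result."), the flow is as follows:
1. The LLM analyzes the current problem and the observations so far, and produces a thought (Thought) about what to do next.
2. It takes an action (Action), which can only be one of the given tools.
3. For the chosen action, it works out the required action input (Action Input).
4. It receives the corresponding observation (Observation), that is, the result of the action just taken.
5. These steps repeat N times until the problem is solved and a final answer (Final Answer) is produced.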
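The loop described above can be illustrated without any model or network call. In the sketch below the LLM is replaced by a scripted stand-in that reads the scratchpad and replies in the Thought/Action/Action Input format, and the Wolfram tool is a stub that returns the observation recorded in this section. This is a simplified simulation, not langchain's actual executor.

```python
import math
import re
from typing import Callable, Dict


def fake_wolfram(query: str) -> str:
    """Stub returning the recorded Wolfram Alpha answer (no network call)."""
    return "integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930"


def find_sqrt(value: str) -> str:
    """Square root of a decimal string, rounded to 5 decimals."""
    return str(round(math.sqrt(float(value)), 5))


TOOLS: Dict[str, Callable[[str], str]] = {
    "wolfram cal": fake_wolfram,
    "find_sqrt": find_sqrt,
}

ACTION_RE = re.compile(r"Action:[ \t]*(.+)\nAction Input:[ \t]*(.+)")


def scripted_llm(question: str, scratchpad: str) -> str:
    """Stand-in for the LLM: decides the next step from past observations."""
    observations = re.findall(r"Observation: (.+)", scratchpad)
    if not observations:
        return ("I should use the wolfram cal tool.\n"
                "Action: wolfram cal\n"
                "Action Input: integrate x^2+sin(x)+1 from 0 to 1")
    if len(observations) == 1:
        value = observations[0].split("≈")[-1].strip()
        return ("Now I need the square root of the result.\n"
                "Action: find_sqrt\n"
                f"Action Input: {value}")
    return f"I now know the final answer.\nFinal Answer: {observations[-1]}"


def run_react(question: str, llm, tools, max_steps: int = 5) -> str:
    """Repeat Thought -> Action -> Observation until a Final Answer appears."""
    scratchpad = ""
    for _ in range(max_steps):
        reply = llm(question, scratchpad)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        match = ACTION_RE.search(reply)
        if match is None:
            raise ValueError(f"Could not parse an action from: {reply!r}")
        name, tool_input = match.group(1).strip(), match.group(2).strip()
        tool = tools.get(name)
        observation = tool(tool_input) if tool else f"{name} is not a valid tool"
        # The observation is appended so the next LLM call can see it.
        scratchpad += f"{reply}\nObservation: {observation}\nThought:"
    raise RuntimeError("No final answer within max_steps")


answer = run_react("Integrate f(x) from 0 to 1, then take the sqrt.",
                   scripted_llm, TOOLS)
print(answer)  # 1.33903
```

The key design point is that the tool outputs, not the model, supply the numbers: the stand-in only chooses which tool to call and with what input.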
> Entering new AgentExecutor chain...
This is a calculus problem that involves integration and finding the square root of the result. I should use the wolfram cal tool.
Action: wolfram cal
Action Input: integrate x^2+sin(x)+1 from 0 to 1
Observation: Assumption: integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930
Answer: integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930
Thought:Now I need to find the square root of the integration result. I should use the find_sqrt tool.
Action: find_sqrt
Action Input: 1.7930
Observation: 1.33903
Thought:I now know the final answer.
Final Answer: The square root of the integration result of f(x) = x^2+sin(x)+1 from 0 to 1 is approximately 1.33903.
> Finished chain.
'The square root of the integration result of f(x) = x^2+sin(x)+1 from 0 to 1 is approximately 1.33903.'
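We can see that the LLM first uses the wolfram cal tool we defined to compute the definite integral, obtaining the approximate value 1.7930. GPT-3.5 then infers that it should call find_sqrt next to compute the square root, and arrives at the final result 1.33903.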
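We can tweak the prompt slightly so that the agent's behavior matches our design more closely. For example, below we add two notes: one makes the LLM use find_sqrt rather than wolfram cal to take square roots, and the other makes it call wolfram cal when solving optimization problems.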
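The modified prompt is not preserved in this text. One way to add such notes, sketched below, is to pass a custom prefix through `agent_kwargs` when creating the agent. The wording of the two notes and the helper names are hypothetical; the prefix still has to end with the sentence that introduces the tool list, so the notes are placed before it.

```python
from typing import List

DEFAULT_PREFIX = (
    "Answer the following questions as best you can. "
    "You have access to the following tools:"
)

# Hypothetical wording of the two notes; the original text is not preserved.
NOTES = [
    "Always use find_sqrt, not wolfram cal, to take a square root.",
    "Always use wolfram cal for function optimization problems.",
]


def prefix_with_notes(prefix: str, notes: List[str]) -> str:
    """Put numbered notes in front of the standard prefix."""
    numbered = "\n".join(f"{i}. {note}" for i, note in enumerate(notes, start=1))
    return f"Note:\n{numbered}\n\n{prefix}"


def build_agent_with_notes(tools: list, model_name: str = "gpt-3.5-turbo"):
    """Create the agent with the extended prefix (requires langchain and a key)."""
    from langchain.agents import AgentType, initialize_agent
    from langchain.chat_models import ChatOpenAI

    llm = ChatOpenAI(model_name=model_name, temperature=0)
    return initialize_agent(
        tools,
        llm,
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
        agent_kwargs={"prefix": prefix_with_notes(DEFAULT_PREFIX, NOTES)},
        verbose=True,
    )


print(prefix_with_notes(DEFAULT_PREFIX, NOTES))
```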
> Entering new AgentExecutor chain...
I need to use calculus to integrate the function and then find the square root of the result.
Action: wolfram cal
Action Input: integrate x^2+sin(x)+1 from 0 to 1
Observation: Assumption: integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930
Answer: integral_0^1 (x^2 + sin(x) + 1) dx = 7/3 - cos(1)≈1.7930
Thought:Now I need to find the square root of the integration result.
Action: find_sqrt
Action Input: 1.7930
Observation: 1.33903
Thought:I now know the final answer.
Final Answer: The square root of the integration result of f(x) = x^2+sin(x)+1 from 0 to 1 is approximately 1.33903.
> Finished chain.
'The square root of the integration result of f(x) = x^2+sin(x)+1 from 0 to 1 is approximately 1.33903.'
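Comparison of results
Putting the above together, once tools are added our Agent obtains the correct result. GPT-3.5's direct reasoning gets the integral value wrong. GPT-4 gets the integral value right, but its square root has low precision.
The time to solve this problem is also shorter with the agent (the author's own measurements: agent 8.6 s, gpt-3.5 13.5 s, gpt-4 22.1 s).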
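How the timings were taken is not stated. A simple way to reproduce such a measurement is a small wall-clock helper, sketched below with a local computation standing in for the model or agent call; `timed` is an illustrative name.

```python
import time
from typing import Any, Callable, Tuple


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call func and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# A local computation stands in for a call such as agent.run(question).
result, seconds = timed(sum, range(1_000_000))
print(f"result={result}, elapsed={seconds:.3f}s")
```

Single measurements of API calls vary with network and server load, so repeating each call several times and reporting the median would give a fairer comparison.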
Another example
Let us test one more example, this time an optimization problem.
> Entering new AgentExecutor chain...
This is a function optimization problem, so I should use wolfram cal.
Action: wolfram cal
Action Input: maximize xy on x+y=1
Observation: Assumption: maximize | function | x y domain | x + y = 1
Answer: max{x y|x + y = 1} = 1/4 at (x, y) = (1/2, 1/2)
Thought:Now I need to get the sqrt of the max value, so I should use find_sqrt.
Action: find_sqrt
Action Input: 1/4
Observation: 0.5
Thought:I now know the final answer.
Final Answer: 0.5
> Finished chain.
'0.5'
'To solve this problem, we need to use the method of Lagrange multipliers. \n\nStep 1: Define the function we want to maximize: \n\nf(x,y) = xy \n\nStep 2: Define the constraint: \n\ng(x,y) = x + y - 1 = 0 \n\nStep 3: Set up the Lagrangian: \n\nL(x,y,λ) = f(x,y) - λg(x,y) \n\nL(x,y,λ) = xy - λ(x + y - 1) \n\nStep 4: Find the partial derivatives of L with respect to x, y, and λ: \n\n∂L/∂x = y - λ \n\n∂L/∂y = x - λ \n\n∂L/∂λ = x + y - 1 \n\nStep 5: Set the partial derivatives equal to zero and solve for x, y, and λ: \n\ny - λ = 0 \n\nx - λ = 0 \n\nx + y - 1 = 0 \n\nSolving these equations simultaneously, we get: \n\nx = y = 1/2 \n\nλ = 1/2 \n\nStep 6: Plug in the values of x and y into the original function to find the maximum value: \n\nz = f(x,y) = xy = (1/2)(1/2) = 1/4 \n\nStep 7: Take the square root of z to get the final answer: \n\nsqrt(z) = sqrt(1/4) = 1/2 \n\nTherefore, the maximum value of xy on x+y=1 is 1/4, and the square root of the maximum value is 1/2.'
'To maximize the function xy subject to the constraint x + y = 1, we can first express y in terms of x using the constraint:\n\ny = 1 - x\n\nNow, we can rewrite the function to be maximized as:\n\nz = x(1 - x)\n\nTo find the maximum value of z, we can take the derivative of z with respect to x and set it equal to 0:\n\ndz/dx = (1 - x) - x = 1 - 2x\n\nSetting the derivative equal to 0:\n\n1 - 2x = 0\n\nSolving for x:\n\nx = 1/2\n\nNow, we can find the corresponding value of y using the constraint:\n\ny = 1 - x = 1 - 1/2 = 1/2\n\nSo, the maximum value of z occurs when x = 1/2 and y = 1/2. Now, we can find the maximum value of z:\n\nz = x * y = (1/2) * (1/2) = 1/4\n\nFinally, we can find the square root of the maximum value of z:\n\nsqrt(z) = sqrt(1/4) = 1/2'
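Comparison of results
On this problem, gpt-3.5, gpt-4, and the agent all obtain the correct result.
In terms of time, the agent is fastest (7.2 s), gpt-3.5 is next (10.5 s), and gpt-4 is slowest (16.1 s). Interestingly, gpt-3.5 chose the method of Lagrange multipliers, while gpt-4 chose direct substitution. The agent called Wolfram Alpha to solve the constrained optimization problem, so the LLM did not need to take part in solving it; it only had to recognize that the problem could be handed to Wolfram Alpha.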
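The shared answer can be confirmed with a short standard-library check. Following the substitution route, y = 1 - x turns the objective into x(1 - x); the sketch below scans x on an exact rational grid (the grid range and step are arbitrary choices made here) and then takes the square root of the maximum.

```python
import math
from fractions import Fraction


def objective(x: Fraction) -> Fraction:
    """xy with y eliminated through the constraint x + y = 1."""
    return x * (1 - x)


# Exact rational grid on [-1, 2] with step 1/1000.
grid = [Fraction(k, 1000) for k in range(-1000, 2001)]
best_x = max(grid, key=objective)
best_value = objective(best_x)

print(f"x = {best_x}, y = {1 - best_x}, max xy = {best_value}")
print(f"sqrt(max) = {math.sqrt(float(best_value))}")
```

The grid maximum lands on x = y = 1/2 with xy = 1/4, in agreement with the stationarity condition 1 - 2x = 0 from the substitution approach.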
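Conclusion
This article built an Agent on top of gpt-3.5, augmented with Wolfram Alpha and a custom square-root tool. On one integration problem and one optimization problem, the Agent was more accurate than GPT-3.5 and GPT-4 on the integral, matched them on the optimization problem, and was the fastest in both cases. The tests here are of course far from exhaustive, and the Tools were deliberately kept simple, but they still show the potential of LLM + Tool. We believe that before long, more complex and refined combinations of LLMs and Tools will be able to solve more and more problems.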
References:
1. https://lilianweng.github.io/posts/2023-06-23-agent/
2. https://products.wolframalpha.com/api/
3. https://github.com/langchain-ai/langchain/tree/14aa27b5f4658c10a36c7e1b8e5582d5a8ad4e6a/libs/experimental/langchain_experimental/autonomous_agents/baby_agi
4. https://python.langchain.com/docs/integrations/tools/wolfram_alpha
5. https://python.langchain.com/docs/modules/agents/tools/custom_tools








Siyuan Liu