2026-03-27 13:42:39 · Cybersecurity articles · Source: ZONE.CI

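Article summary: This article introduces three mainstream working paradigms for AI agents, ReAct in particular. It explains how ReAct interleaves thinking and acting, completing tasks through a Thought, Action, Observation loop, analyzes its advantages over pure-thinking or pure-acting modes, and provides a code example of a ReAct agent built with SerpApi.
Overall score: 85
Categories: AI security, technical standards, solutions, secure development, other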


AI Agents from Scratch, Part 4: How Does an Agent Actually "Think" and "Act"?

Original

Claude Code

Crush Sec

March 21, 2026, 10:30, Guangdong

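In the previous installment we worked out how an LLM operates: at its core it is a machine that "predicts the next word." But a brain that can talk is not enough. How does an agent connect that brain to the outside world? And when the task gets complex, how does it organize its own thinking and acting?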

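Prelude: One Task, Three Different Approaches

Give the agent a task: "Find the latest Huawei phone for me and summarize its selling points."

An LLM cannot answer this from its own knowledge (the information is too recent), so it has to rely on tools. How the tools get used, though, differs enormously between agent paradigms:

ReAct:          think while searching → see the result → think about the next step → loop until there is an answer
Plan-and-Solve: list the steps first → split into subtasks → execute step by step → aggregate the results
Reflection:     generate a draft → self-review → find shortcomings → revise → review again

These three paradigms are the design prototypes behind almost every mainstream agent framework today (LangChain, AutoGen, Dify). Once you understand them, you can see what those frameworks are doing under the hood.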

1. ReAct: Think While You Act

Core Idea

The core of ReAct (Reasoning + Acting) fits in one sentence:

Explicitly interleave "thinking" and "acting": after every action, look at the result, think again, then act again.

Every step follows a fixed trajectory:

Thought → Action → Observation
   ↑                     |
   └─────────────────────┘
      (loop until done)

Why does this design work?

Pure thinking (Chain-of-Thought): can reason, but is completely cut off from the outside world and prone to hallucination
Pure acting: calls tools directly, but with no planning, so it falls apart on complex problems
ReAct: reasoning guides action, and the results of action correct the reasoning; the two reinforce each other
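
Before the full version, the loop itself can be sketched in a few lines with no network access. This is only an illustration under stated assumptions: `make_scripted_llm` and `lookup` are hypothetical stand-ins (a canned list of replies instead of a real model, an in-memory dictionary instead of a search API), and the prompt layout is simplified.

```python
import re
from typing import Callable, Dict, List


def make_scripted_llm(replies: List[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real LLM: replays canned replies in order."""
    it = iter(replies)
    return lambda prompt: next(it)


def lookup(query: str) -> str:
    """Hypothetical tool: a tiny in-memory lookup instead of a real search API."""
    facts = {"capital of France": "Paris"}
    return facts.get(query, "no result")


def react(question: str, llm: Callable[[str], str],
          tools: Dict[str, Callable[[str], str]], max_steps: int = 3) -> str:
    history: List[str] = []
    for _ in range(max_steps):
        # Thought + Action come back from the model as plain text.
        prompt = f"Question: {question}\nHistory:\n" + "\n".join(history)
        reply = llm(prompt)
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", reply, re.DOTALL)
        if not match:
            break
        name, arg = match.group(1), match.group(2)
        if name == "Finish":
            return arg
        # Observation: run the tool and feed the result into the next prompt.
        tool = tools.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        history.append(f"Action: {name}[{arg}]")
        history.append(f"Observation: {observation}")
    return ""


llm = make_scripted_llm([
    "Thought: I need to look this up.\nAction: Lookup[capital of France]",
    "Thought: The observation answers it.\nAction: Finish[Paris]",
])
answer = react("What is the capital of France?", llm, {"Lookup": lookup})
print(answer)  # Paris
```

The first scripted reply triggers a tool call, the observation lands in `history`, and the second reply ends the loop with `Finish[...]`.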

Full Implementation

Below is a complete ReAct agent that uses SerpApi for real-time search:

import os
import re
from typing import Any, Callable, Dict, List

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

# ============================================================
# Part 1: LLM client
# ============================================================
class HelloAgentsLLM:
    def __init__(self):
        self.model = os.getenv("LLM_MODEL_ID")
        api_key = os.getenv("LLM_API_KEY")
        base_url = os.getenv("LLM_BASE_URL")
        self.client = OpenAI(api_key=api_key, base_url=base_url)

    def think(self, messages: List[Dict], temperature: float = 0) -> str:
        """Stream a chat completion, echo it to stdout, and return the full text."""
        response = self.client.chat.completions.create(
            model=self.model, messages=messages,
            temperature=temperature, stream=True
        )
        chunks = []
        for chunk in response:
            if not chunk.choices:  # some providers send chunks with no choices
                continue
            content = chunk.choices[0].delta.content or ""
            print(content, end="", flush=True)
            chunks.append(content)
        print()
        return "".join(chunks)

# ============================================================
# Part 2: Tool definition and tool manager
# ============================================================
def search(query: str) -> str:
    """Run a Google search through SerpApi and return the results as text."""
    from serpapi import SerpApiClient
    params = {
        "engine": "google", "q": query,
        "api_key": os.getenv("SERPAPI_API_KEY"),
        "gl": "cn", "hl": "zh-cn"
    }
    results = SerpApiClient(params).get_dict()
    # Prefer the direct answer box; otherwise fall back to the top 3 organic results.
    if "answer_box" in results and "answer" in results["answer_box"]:
        return results["answer_box"]["answer"]
    if "organic_results" in results:
        snippets = [
            f"[{i+1}] {r.get('title','')}\n{r.get('snippet','')}"
            for i, r in enumerate(results["organic_results"][:3])
        ]
        return "\n\n".join(snippets)
    return f"No information found for '{query}'."

class ToolExecutor:
    def __init__(self):
        self.tools: Dict[str, Dict[str, Any]] = {}

    def register(self, name: str, description: str, func: Callable[[str], str]):
        self.tools[name] = {"description": description, "func": func}

    def get_func(self, name: str):
        return self.tools.get(name, {}).get("func")

    def list_tools(self) -> str:
        return "\n".join(
            f"- {n}: {info['description']}" for n, info in self.tools.items()
        )

# ============================================================
# Part 3: ReAct prompt template
# ============================================================
REACT_PROMPT = """You are an intelligent assistant that can call external tools.

Available tools:
{tools}

Respond strictly in the following format:
Thought: your reasoning
Action: ToolName[tool input]  or  Finish[final answer]

Once you have enough information, end with Finish[answer].

Question: {question}
History: {history}"""

# ============================================================
# Part 4: Core loop of the ReAct agent
# ============================================================
class ReActAgent:
    def __init__(self, llm: HelloAgentsLLM, tools: ToolExecutor, max_steps: int = 5):
        self.llm = llm
        self.tools = tools
        self.max_steps = max_steps

    def run(self, question: str) -> str:
        history = []
        for step in range(1, self.max_steps + 1):
            print(f"\n--- Step {step} ---")
            prompt = REACT_PROMPT.format(
                tools=self.tools.list_tools(),
                question=question,
                history="\n".join(history) or "(none)"
            )
            response = self.llm.think([{"role": "user", "content": prompt}])

            # Parse Thought and Action
            thought_m = re.search(r"Thought:\s*(.*?)(?=\nAction:|$)", response, re.DOTALL)
            action_m = re.search(r"Action:\s*(.*?)$", response, re.DOTALL)
            thought = thought_m.group(1).strip() if thought_m else ""
            action = action_m.group(1).strip() if action_m else ""

            if thought:
                print(f"Thinking: {thought}")
            if not action:
                # Without an Action there is nothing to execute; stop here
                # instead of falling through to the "max steps" message.
                print("No Action found in the response; stopping.")
                return ""

            # Check whether the agent is done
            if action.startswith("Finish"):
                match = re.match(r"Finish\[(.*)\]", action, re.DOTALL)
                final = match.group(1) if match else action
                print(f"\nFinal answer: {final}")
                return final

            # Execute the tool
            tool_match = re.match(r"(\w+)\[(.*)\]", action, re.DOTALL)
            if not tool_match:
                # Record the problem so the next prompt differs and the model can correct itself.
                history.append(f"Action: {action}")
                history.append("Observation: malformed action; use ToolName[tool input].")
                continue
            tool_name, tool_input = tool_match.group(1), tool_match.group(2)
            print(f"Acting: {tool_name}[{tool_input}]")
            func = self.tools.get_func(tool_name)
            observation = func(tool_input) if func else f"Tool '{tool_name}' not found"
            print(f"Observing: {observation}")

            history.append(f"Action: {action}")
            history.append(f"Observation: {observation}")

        print("Maximum number of steps reached; stopping.")
        return ""

# ============================================================
# Entry point
# ============================================================
if __name__ == "__main__":
    llm = HelloAgentsLLM()
    tools = ToolExecutor()
    tools.register(
        "Search",
        "Web search engine. Use this tool for current events and real-time information.",
        search
    )
    agent = ReActAgent(llm, tools, max_steps=5)
    agent.run("What is Huawei's latest phone? What are its main selling points?")

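Sample run:

--- Step 1 ---
Thought: The user is asking about Huawei's latest phone model and its main selling points. This is current-events information, so I need to use the search tool to get up-to-date data.
Action: Search[Huawei latest phone launch]
Thinking: The user is asking about Huawei's latest phone model and its main selling points. This is current-events information, so I need to use the search tool to get up-to-date data.
Acting: Search[Huawei latest phone launch]
Observing: [1] Huawei Phones - Huawei Official Site
Smartphones ; Mate series. Extraordinary flagship. HUAWEI Mate 80 Pro Max. From ¥7999 ; Pura series. Pioneering imaging. HUAWEI Pura 80 Pro+. From ¥7999 ; Pocket series. A new chapter in aesthetics. HUAWEI Pocket 2 优享 ...

[2] Huawei launch event 2025.11.25
Huawei Mate 80 series | Mate X7 and all-scenario new product launch. Huawei newly released the Mate 80 series, Mate X7, MatePad Edge, WATCH Ultimate 2, Huawei smart screen MateTV Max, Huawei Router X3 Pro ...

[3] 2026 Huawei phone series overview and buying guide (updated in March) ...
This year's new Huawei phones are available only with pure HarmonyOS; buyers who mind can consider Honor or other brands. Recent updates: added the Huawei nova15 series released on December 22. Added the Huawei Mate80 series released on November 25.

--- Step 2 ---
Thought: The earlier search results show that Huawei's latest phone is the Mate 80 series, released on November 25, 2025. But the information so far only covers the price and a short slogan ("Extraordinary flagship"), so I need to search for more details on this phone's configuration and selling points.

Action: Search[Huawei Mate 80 Pro specs selling points]
Thinking: The earlier search results show that Huawei's latest phone is the Mate 80 series, released on November 25, 2025. But the information so far only covers the price and a short slogan ("Extraordinary flagship"), so I need to search for more details on this phone's configuration and selling points.
Acting: Search[Huawei Mate 80 Pro specs selling points]
Observing: [1] Huawei Mate 80 series leak roundup: blockbuster upgrades across the lineup
On hardware specs, the Mate 80 series is almost certain to debut a new flagship processor; the battery is a 6000mAh silicon-carbon anode design, the system ships with HarmonyOS 6, and satellite communication support continues. These core selling points are basically locked in ...

[2] HUAWEI Mate 80 Pro Specifications - Huawei Official Site
HUAWEI Mate 80 Pro specifications · Display. Size. 6.75 inches · Processor. Kirin 9030 Pro (16 GB RAM version). Kirin 9030 (12 GB RAM version) · Operating system. HarmonyOS 6.0 · Storage. Memory (RAM).

[3] Why is the Huawei Mate 80 Pro Max worth buying? This one article covers it all!
The price recently dropped to around ¥7761, making it strong value. · Flagship multi-camera setup with standout telephoto and macro capability. · 6.9-inch dual-layer OLED flat display; the 8000nit peak brightness is the core highlight. · Kirin 9030 Pro chip ...

--- Step 3 ---
Thought: According to the search results, Huawei's latest phone series is the Mate 80 series, released on November 25, 2025. The top models are the Mate 80 Pro and Mate 80 Pro Max. The main selling points include the latest Kirin 9030 series chips, HarmonyOS 6.0, a high-brightness dual-layer OLED display, a large silicon-carbon anode battery, a flagship imaging system, and satellite communication.

Action: Finish[Huawei's latest phone is the **HUAWEI Mate 80 series** (including the Mate 80 Pro and Mate 80 Pro Max), released on November 25, 2025.

Its main selling points are:
1.  **Flagship performance**: debuts the new-generation Kirin 9030 and 9030 Pro processors.
2.  **Latest system**: ships with HarmonyOS 6.0 (pure HarmonyOS).
3.  **Top-tier display**: the Mate 80 Pro Max has a 6.9-inch dual-layer OLED flat display with peak brightness up to 8000nit.
4.  **Long battery life**: the whole lineup has a 6000mAh silicon-carbon anode battery.
5.  **Imaging**: flagship multi-camera setup with standout telephoto and macro capability.
6.  **Connectivity**: supports satellite communication.
7.  **Price**: starts at ¥7999.]
Thinking: According to the search results, Huawei's latest phone series is the Mate 80 series, released on November 25, 2025. The top models are the Mate 80 Pro and Mate 80 Pro Max. The main selling points include the latest Kirin 9030 series chips, HarmonyOS 6.0, a high-brightness dual-layer OLED display, a large silicon-carbon anode battery, a flagship imaging system, and satellite communication.

Final answer: Huawei's latest phone is the **HUAWEI Mate 80 series** (including the Mate 80 Pro and Mate 80 Pro Max), released on November 25, 2025.

Its main selling points are:
1.  **Flagship performance**: debuts the new-generation Kirin 9030 and 9030 Pro processors.
2.  **Latest system**: ships with HarmonyOS 6.0 (pure HarmonyOS).
3.  **Top-tier display**: the Mate 80 Pro Max has a 6.9-inch dual-layer OLED flat display with peak brightness up to 8000nit.
4.  **Long battery life**: the whole lineup has a 6000mAh silicon-carbon anode battery.
5.  **Imaging**: flagship multi-camera setup with standout telephoto and macro capability.
6.  **Connectivity**: supports satellite communication.
7.  **Price**: starts at ¥7999.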

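Three key points in this code:

① The prompt defines the whole interaction protocol. REACT_PROMPT tells the LLM what format to answer in; it is the "contract" that lets the program understand the model's instructions. The clearer the format, the more stable the parsing.

② Regular expressions parse Thought/Action. LLM output is plain text, and re.search is responsible for extracting structured information from it. This is the most fragile part of a ReAct implementation: when the model does not follow the format, parsing fails.

③ The history list is the agent's "short-term memory". Every step's Action and Observation are appended to it, so the next LLM call can "see" the entire history. This is the fundamental reason ReAct can correct itself dynamically.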
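
Because point ② is the weak spot, it can help to isolate the parsing in one small function that tolerates minor formatting drift and reports failure explicitly. The sketch below is an assumption-laden illustration, not the article's parser: `parse_action` and `ACTION_RE` are hypothetical names, and the tolerated variations (extra spaces, a full-width colon) are just examples of drift a model might produce.

```python
import re
from typing import Optional, Tuple

# Tolerates spaces around the colon and brackets, and a full-width colon.
ACTION_RE = re.compile(r"Action\s*[:：]\s*(\w+)\s*\[(.*)\]", re.DOTALL)


def parse_action(reply: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, tool_input), or None if the reply is malformed."""
    match = ACTION_RE.search(reply)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


well_formed = "Thought: search first.\nAction: Search[Huawei Mate 80]"
sloppy = "thought: hmm\nAction ： Search [ Huawei Mate 80 ]"
malformed = "I think the answer is probably the Mate 80."

print(parse_action(well_formed))  # ('Search', 'Huawei Mate 80')
print(parse_action(sloppy))       # ('Search', 'Huawei Mate 80')
print(parse_action(malformed))    # None
```

Returning `None` instead of raising lets the caller decide what to do with a malformed reply, for example appending a corrective Observation to the history and retrying.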

2. Plan-and-Solve: Plan First, Then Execute

Core Idea

ReAct "takes one step and looks around", which suits exploratory tasks. But some tasks have a clear structure and a known path, such as a multi-step math problem. For those, a better strategy is:

Plan the entire solution path first, then execute the steps strictly in order.

A two-phase workflow:

Phase 1 (Planning)                   Phase 2 (Execution)
─────────────────                    ────────────────────────────
Receive the question                 Step 1 → Result 1
    ↓                                    ↓
LLM decomposes into [Step 1, ...]    Step 2 + Result 1 → Result 2
                                         ↓
                                     Step 3 + Result 2 → Final answer

Each execution step can "see" the earlier results, which keeps information flowing smoothly along the chain.
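
The chaining of intermediate results can be shown without any LLM at all. In this minimal sketch, each plan step is a plain Python function of the results so far; these functions are hypothetical stand-ins for what would be one model call per step, and the numbers come from a small fruit-shop word problem (15 apples on Monday, twice that on Tuesday, 5 fewer on Wednesday).

```python
from typing import Callable, Dict, List, Tuple

Results = Dict[str, int]
Step = Tuple[str, Callable[[Results], int]]

# A fixed plan: every step may read the results of the steps before it.
plan: List[Step] = [
    ("monday", lambda r: 15),
    ("tuesday", lambda r: r["monday"] * 2),
    ("wednesday", lambda r: r["tuesday"] - 5),
    ("total", lambda r: r["monday"] + r["tuesday"] + r["wednesday"]),
]


def execute(steps: List[Step]) -> Results:
    """Run the steps in order, passing all earlier results to each one."""
    results: Results = {}
    for name, compute in steps:
        results[name] = compute(results)
    return results


results = execute(plan)
print(results)  # {'monday': 15, 'tuesday': 30, 'wednesday': 25, 'total': 70}
```

The point of the structure is that step 3 never has to recompute step 2; it only reads what the chain already produced.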

Core Implementation

import ast

from dotenv import load_dotenv

load_dotenv()

# ============================================================
# Part 1: Planner: breaks the question into a list of steps
# ============================================================
PLANNER_PROMPT = """You are a top-tier AI planning expert. Break the following problem down into logically clear steps.
The output must be a Python list wrapped in ```python ... ```, for example:
["Step 1", "Step 2", ...]

Question: {question}"""

class Planner:
    def __init__(self, llm):
        self.llm = llm

    def plan(self, question: str) -> list:
        prompt = PLANNER_PROMPT.format(question=question)
        response = self.llm.think([{"role": "user", "content": prompt}])
        try:
            # Take the text between the opening ```python fence and the next fence.
            plan_str = response.split("```python")[1].split("```")[0].strip()
            plan = ast.literal_eval(plan_str)
            if not isinstance(plan, list):
                raise ValueError("the plan is not a list")
            return plan
        except Exception as e:
            print(f"Failed to parse the plan: {e}")
            return []

# ============================================================
# Part 2: Executor: runs step by step, passing intermediate results along
# ============================================================
EXECUTOR_PROMPT = """You are a top-tier AI execution expert. Solve the problem strictly by following the plan step by step.

Original question: {question}
Full plan: {plan}
Previous steps and results: {history}
Current step: {current_step}

Output only the answer to the current step, with no extra explanation."""

class Executor:
    def __init__(self, llm):
        self.llm = llm

    def execute(self, question: str, plan: list) -> str:
        history = ""
        last_result = ""
        for i, step in enumerate(plan):
            print(f"\n→ Step {i+1}/{len(plan)}: {step}")
            prompt = EXECUTOR_PROMPT.format(
                question=question, plan=plan,
                history=history or "none",
                current_step=step
            )
            last_result = self.llm.think([{"role": "user", "content": prompt}])
            history += f"Step {i+1}: {step}\nResult: {last_result}\n\n"
        return last_result

# ============================================================
# Part 3: Putting it together: plan first, then execute
# ============================================================
class PlanAndSolveAgent:
    def __init__(self, llm):
        self.planner = Planner(llm)
        self.executor = Executor(llm)

    def run(self, question: str):
        print(f"\nQuestion: {question}")
        plan = self.planner.plan(question)
        if not plan:
            print("Could not generate a plan; stopping.")
            return
        print(f"\nGenerated plan: {plan}")
        final = self.executor.execute(question, plan)
        print(f"\nFinal answer: {final}")

# ============================================================
# Entry point
# ============================================================
if __name__ == "__main__":
    from chapter4_llm import HelloAgentsLLM  # reuse the LLM client from Part 1 of the ReAct code
    llm = HelloAgentsLLM()
    agent = PlanAndSolveAgent(llm)
    agent.run(
        "A fruit shop sold 15 apples on Monday, twice as many on Tuesday as on Monday, "
        "and 5 fewer on Wednesday than on Tuesday. How many did it sell over the three days?"
    )

Sample run:

Question: A fruit shop sold 15 apples on Monday...

Generated plan: [
  "Compute Monday's sales: 15",
  "Compute Tuesday's sales: 15 × 2 = 30",
  "Compute Wednesday's sales: 30 - 5 = 25",
  "Compute total sales: 15 + 30 + 25 = 70"
]

→ Step 1/4: Compute Monday's sales: 15
15
→ Step 2/4: Compute Tuesday's sales: 15 × 2 = 30
30
→ Step 3/4: Compute Wednesday's sales: 30 - 5 = 25
25
→ Step 4/4: Compute total sales: 15 + 30 + 25 = 70
70

Final answer: 70

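Three key points in this code:

① The planner is forced to output a Python list. The output is wrapped in a code block and then parsed safely with ast.literal_eval, which is far more stable than parsing natural language.

② The executor's history passes intermediate results along. After each step, the "step + result" pair is appended to history, so the next LLM call can use it directly. This is what keeps multi-step reasoning consistent from one step to the next.

③ Planner and Executor have separate responsibilities. The planner only "breaks down the problem"; the executor only "runs the steps". Separating the two makes the code clearer and makes it easier to swap out either component.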
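
Point ① deserves a closer look, since `ast.literal_eval` only accepts literals and refuses anything that would have to be executed. The sketch below is a standalone variant of the parsing step under stated assumptions: `extract_plan` is a hypothetical helper name, and the fence marker is built with `"`" * 3` purely so that this example can itself be shown inside a code fence.

```python
import ast
from typing import List

FENCE = "`" * 3  # three backticks, built programmatically


def extract_plan(response: str) -> List[str]:
    """Pull a list of step strings out of a fenced python block; [] on failure."""
    try:
        body = response.split(FENCE + "python")[1].split(FENCE)[0].strip()
        plan = ast.literal_eval(body)  # literals only; never executes code
    except (IndexError, ValueError, SyntaxError):
        return []
    if isinstance(plan, list) and all(isinstance(s, str) for s in plan):
        return plan
    return []


good = f'Here is the plan:\n{FENCE}python\n["Compute Monday", "Compute Tuesday"]\n{FENCE}'
not_a_literal = f'{FENCE}python\n__import__("os").getcwd()\n{FENCE}'
no_fence = '["Compute Monday"]'

print(extract_plan(good))           # ['Compute Monday', 'Compute Tuesday']
print(extract_plan(not_a_literal))  # []
print(extract_plan(no_fence))       # []
```

The second case is the reason to prefer `ast.literal_eval` over `eval`: a function call inside the block is rejected with a `ValueError` rather than run.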

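3. Reflection: Generate → Reflect → Refine

Core Idea

The first two paradigms stop as soon as the task is done. But a first attempt is not necessarily the best one, just as we do a code review after writing code and revise an article after drafting it.

The Reflection paradigm introduces an "internal reviewer":

Execute (generate a draft)
     ↓
Reflect (reviewer looks for problems) ──→ "No improvement needed" → Done
     ↓
Refine (revise based on the feedback)
     ↓
Reflect again … (loop until satisfied or the limit is reached)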
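
The control flow of this loop can be sketched independently of any model. In the example below, `review` and `refine` are hypothetical stand-ins for the two LLM roles (a toy reviewer that only checks for a trailing period), and `STOP_PHRASE` is an assumed English rendering of the termination signal.

```python
from typing import Callable, List

STOP_PHRASE = "no improvement needed"


def reflect_loop(draft: str,
                 review: Callable[[str], str],
                 refine: Callable[[str, str], str],
                 max_iterations: int = 3) -> List[str]:
    """Return every version produced, oldest first."""
    versions = [draft]
    for _ in range(max_iterations):
        feedback = review(versions[-1])
        if STOP_PHRASE in feedback.lower():  # reviewer is satisfied: stop early
            break
        versions.append(refine(versions[-1], feedback))
    return versions


def review(text: str) -> str:
    """Toy reviewer: only objects to drafts that lack a final period."""
    return "No improvement needed." if text.endswith(".") else "Add a final period."


def refine(text: str, feedback: str) -> str:
    """Toy refiner: applies the one fix the toy reviewer can ask for."""
    return text + "."


versions = reflect_loop("ReAct interleaves reasoning and acting", review, refine)
print(len(versions))  # 2
print(versions[-1])   # ReAct interleaves reasoning and acting.
```

Both exits from the diagram are visible here: the early `break` when the reviewer is satisfied, and the `max_iterations` cap when it never is.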

Full Implementation

Using "generate a prime-finding function and keep optimizing it" as the example:

from typing import Optional, List, Dict, Any

# ============================================================
# Part 1: Short-term memory module: stores the execution trajectory
# ============================================================
class Memory:
    def __init__(self):
        self.records: List[Dict[str, str]] = []

    def add(self, record_type: str, content: str):
        self.records.append({"type": record_type, "content": content})
        print(f"Memory updated [{record_type}]")

    def get_trajectory(self) -> str:
        parts = []
        for r in self.records:
            label = "Previous code" if r["type"] == "execution" else "Review feedback"
            parts.append(f"--- {label} ---\n{r['content']}")
        return "\n\n".join(parts)

    def last_execution(self) -> Optional[str]:
        for r in reversed(self.records):
            if r["type"] == "execution":
                return r["content"]
        return None

# ============================================================
# Part 2: Three prompts (initial execution / reflection / refinement)
# ============================================================
INITIAL_PROMPT = """You are a senior Python programmer. Write a function that meets the requirement below, with a complete signature and docstring, following PEP 8.

Requirement: {task}

Output the code directly, with no extra explanation."""

REFLECT_PROMPT = """You are an extremely strict code reviewer focused on algorithmic efficiency.

Original task: {task}
Code under review:
{code}

Analyze the time complexity, point out algorithmic bottlenecks, and give concrete suggestions for improvement. If the code is already optimal, reply "No improvement needed".

Output the feedback directly, with no extra explanation."""

REFINE_PROMPT = """You are a senior Python programmer, optimizing code based on review comments.

Original task: {task}
Previous version: {last_code}
Review comments: {feedback}

Generate the optimized code based on the review comments. Output it directly, with no extra explanation."""

# ============================================================
# Part 3: Core loop of the Reflection agent
# ============================================================
class ReflectionAgent:
    def __init__(self, llm, max_iterations: int = 3):
        self.llm = llm
        self.memory = Memory()
        self.max_iterations = max_iterations

    def _call(self, prompt: str) -> str:
        return self.llm.think([{"role": "user", "content": prompt}]) or ""

    def run(self, task: str) -> str:
        print(f"\nTask: {task}")

        # Initial execution
        print("\n--- Initial execution ---")
        initial_code = self._call(INITIAL_PROMPT.format(task=task))
        self.memory.add("execution", initial_code)

        # Iteration loop: reflect → refine
        for i in range(self.max_iterations):
            print(f"\n--- Iteration {i+1}/{self.max_iterations} ---")

            last_code = self.memory.last_execution()
            feedback = self._call(REFLECT_PROMPT.format(task=task, code=last_code))
            self.memory.add("reflection", feedback)

            # Case-insensitive match so "no improvement needed" mid-sentence also counts.
            if "no improvement needed" in feedback.lower():
                print("Reflection considers the code good enough; stopping iteration.")
                break

            refined = self._call(REFINE_PROMPT.format(
                task=task, last_code=last_code, feedback=feedback
            ))
            self.memory.add("execution", refined)

        final = self.memory.last_execution()
        print(f"\nFinal code:\n{final}")
        return final

# ============================================================
# Entry point
# ============================================================
if __name__ == "__main__":
    from chapter4_llm import HelloAgentsLLM
    llm = HelloAgentsLLM()
    agent = ReflectionAgent(llm, max_iterations=2)
    agent.run("Write a Python function that finds all primes between 1 and n.")

Sample run:

Task: Write a Python function that finds all primes between 1 and n.

--- Initial execution ---
(generated a first version based on trial division, time complexity O(n√n))
Memory updated [execution]

--- Iteration 1/2 ---
Review feedback: The current O(n√n) time complexity is too high; switch to the Sieve of Eratosthenes,
         which lowers the complexity to O(n log log n).
Memory updated [reflection]
(generated an optimized version based on the sieve)
Memory updated [execution]

--- Iteration 2/2 ---
Review feedback: The current sieve is already efficient enough; in the general case, no improvement needed.
Memory updated [reflection]
Reflection considers the code good enough; stopping iteration.

Final code:
def find_primes(n):
    """Find all primes from 1 to n using the Sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for i in range(p * p, n + 1, p):
                is_prime[i] = False
        p += 1
    return [num for num in range(2, n + 1) if is_prime[num]]
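
An optimized version is only useful if it still returns the same answers as the version it replaced. As a quick sanity check, the sketch below puts a trial-division baseline next to the sieve and compares them. The baseline is an assumption about what a typical first draft looks like, since the run above only describes that draft and does not show it.

```python
import math
from typing import List


def primes_trial_division(n: int) -> List[int]:
    """Baseline: test each candidate against divisors up to its square root."""
    primes = []
    for num in range(2, n + 1):
        if all(num % d for d in range(2, math.isqrt(num) + 1)):
            primes.append(num)
    return primes


def primes_sieve(n: int) -> List[int]:
    """Sieve of Eratosthenes: cross out multiples of each prime up to sqrt(n)."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for i in range(p * p, n + 1, p):
                is_prime[i] = False
        p += 1
    return [num for num in range(2, n + 1) if is_prime[num]]


print(primes_sieve(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(all(primes_sieve(n) == primes_trial_division(n) for n in range(200)))  # True
```

A check like this could also be wired into the reflection loop as an automated reviewer, so that a refinement which changes behavior is caught before it is accepted.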

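Three key points in this code:

① Three prompts map to three "roles". Initial execution is the "programmer", reflection is the "harsh reviewer", and refinement is the "programmer who takes feedback". The difference in role setup shapes the style and focus of the LLM's output.

② Memory stores the full trajectory. add saves every "execution" and "reflection", and get_trajectory can serialize the whole history into text. Note that the loop above passes only the latest code and the latest feedback into the prompts; including the output of get_trajectory in a prompt is what would let the model "know what it has already tried".

③ "No improvement needed" serves as the termination signal. The LLM decides for itself when to stop instead of always running to the maximum number of rounds. This is smarter than a fixed iteration count and also guards against "over-optimization".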
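
As written, the refinement prompt receives only the last version of the code and the last round of feedback. One possible extension, sketched here as an assumption rather than as part of the original implementation, is to serialize the whole trajectory and include it in the prompt. `render_trajectory` mirrors the logic of `get_trajectory`, and `REFINE_WITH_HISTORY` is a hypothetical template.

```python
from typing import Dict, List


def render_trajectory(records: List[Dict[str, str]]) -> str:
    """Serialize execution/reflection records into one text block."""
    parts = []
    for r in records:
        label = "Previous code" if r["type"] == "execution" else "Review feedback"
        parts.append(f"--- {label} ---\n{r['content']}")
    return "\n\n".join(parts)


# Hypothetical prompt template that shows the model everything tried so far.
REFINE_WITH_HISTORY = (
    "Task: {task}\n\n"
    "Everything tried so far:\n{trajectory}\n\n"
    "Write an improved version that avoids repeating earlier mistakes."
)

records = [
    {"type": "execution", "content": "def find_primes(n): ...  # trial division"},
    {"type": "reflection", "content": "O(n*sqrt(n)) is slow; use a sieve."},
]
prompt = REFINE_WITH_HISTORY.format(
    task="Find all primes up to n",
    trajectory=render_trajectory(records),
)
print(prompt)
```

The trade-off is prompt length: the full trajectory grows with every round, so a real system would likely cap or summarize it.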

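4. How Do You Choose Among the Three Paradigms?

Scenario                                                    Recommended paradigm
──────────────────────────────────────────────────────────  ─────────────────────────────────────
Needs real-time search or API calls to fetch information    ReAct (dynamic exploration)
Task can be cleanly decomposed into multiple steps          Plan-and-Solve (structured execution)
Very high quality bar, and slower is acceptable             Reflection (iterative refinement)

The three paradigms are not mutually exclusive. In a real project you might use Plan-and-Solve to split a large task into subtasks, run each subtask with ReAct, and finish with a Reflection pass over the overall result. This kind of layering is roughly what mainstream frameworks do in practice.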
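
The layering described above can be sketched as one small orchestration function. Everything passed into it here is a hypothetical stub: in a real system `plan` would be a planner LLM call, `run_subtask` a ReAct agent, and `review` / `revise` the two Reflection roles.

```python
from typing import Callable, List


def solve(task: str,
          plan: Callable[[str], List[str]],
          run_subtask: Callable[[str, List[str]], str],
          review: Callable[[str], str],
          revise: Callable[[str, str], str]) -> str:
    results: List[str] = []
    for sub in plan(task):                         # Plan-and-Solve: decompose up front
        results.append(run_subtask(sub, results))  # ReAct-style worker per subtask
    draft = "\n".join(results)
    feedback = review(draft)                       # Reflection: one review pass
    if "no improvement needed" in feedback.lower():
        return draft
    return revise(draft, feedback)


# Stubs standing in for the three kinds of LLM-backed components.
report = solve(
    "Summarize the latest phone",
    plan=lambda task: ["find the model", "list selling points"],
    run_subtask=lambda sub, earlier: f"[{len(earlier) + 1}] done: {sub}",
    review=lambda draft: "Add a summary line.",
    revise=lambda draft, fb: draft + "\nSummary: 2 subtasks completed.",
)
print(report)
```

Passing the components in as functions keeps each paradigm replaceable, which is the same separation-of-responsibilities argument made for Planner and Executor.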

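5. Next Steps

After implementing the three paradigms by hand, you will notice a shared pain point: building all this from scratch is real engineering work. Prompt templates, parsers, error handling, retry logic... all of it has to be rewritten every time.

Is there a way to put these agent flows together quickly without writing so much code?

In the next installment we step into low-code platforms: Coze, Dify, and n8n. They replace code with drag-and-drop, so validating an agent idea takes minutes rather than days. That is hugely valuable for fast iteration, for people without a technical background, and for scenarios where an agent has to be plugged into an existing business system.

This series is based on the open-source Hello-Agents course, organized together with hands-on practice. The code has been verified on Python 3.10+. This article was generated with AI assistance.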


Reposted from: Crush Sec, Claude Code, "AI Agents from Scratch, Part 4: How Does an Agent Actually 'Think' and 'Act'?"