2026-06-18 06:35:10 · Cybersecurity · Article source: ZONE.CI 全球网

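Article summary: This article dissects the complete workflow of an AI-driven penetration testing system, from user input to the generated report. It covers API intake, Orchestrator scheduling, reconnaissance, and the ReAct decision loop with its three attack paths. The LLM decides dynamically among tool calls, skill-based attacks, and RAG queries; scans run in a Kali VM through an MCP client; and vulnerability findings and progress are pushed in real time over WebSocket. The article gives concrete code anchors and describes the fault-tolerance design, so it works as a guide for reproducing the system. Overall score: 85. Categories: penetration testing, AI security, security tools, red team, security development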


Chapter 4 – Reborn as an AI: Dissecting the Main Path of One Task, from Input to Report

TtTeam

June 17, 2026, 12:26, Hainan


The following article is from 威胁情报Z分析 (Threat Intelligence Z Analysis), by Gachong


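A story

A user types this into the Web UI input box:

Run a penetration test against http://testphp.vulnweb.com

What happens inside the system over the next 6 minutes? This chapter walks through it in chronological order and gives a code anchor, down to the line number, for every step. After reading it you can reproduce the whole flow and jump to any stage to debug it.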

Step 1: API intake

File: api/main.py:87-150

A Pydantic model defines the request parameters:

    from typing import Optional
    from pydantic import BaseModel, Field

    class PentestStartRequest(BaseModel):
        target: str = Field(..., description="Target URL or IP")
        scope: list[str] = Field(default=[], description="Authorized scope")
        authorized_by: str = Field(..., description="Authorizing person")
        task_name: Optional[str] = Field(None, description="Task name")
        user_intent: Optional[str] = Field(None, description="Original user intent")
        options: Optional[dict] = Field(default={}, description="Extra options")

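When POST /api/pentest/start receives the request, it does three things:

1. Generates a UUID to use as the task_id
2. Submits the task to a ThreadPoolExecutor (Windows-friendly; see the fallback design later)
3. Immediately returns {task_id, status: "started"} to the frontend

Why not block the HTTP request? Because a penetration test runs for several minutes, and the HTTP request would time out long before it finished.

Why not block on the WebSocket? Because the task may be executing on another machine, in another process.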
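The three steps above can be sketched with the standard library alone. This is a minimal illustration rather than the project's handler: `executor`, `tasks`, `run_pentest`, and `start_pentest` are hypothetical names, and the real endpoint presumably wraps the same idea in a web framework route.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical module-level pool and task registry (names are assumptions).
executor = ThreadPoolExecutor(max_workers=4)
tasks = {}


def run_pentest(task_id, target):
    """Placeholder for the long-running job; the real one would drive the Orchestrator."""
    return {"task_id": task_id, "target": target, "status": "finished"}


def start_pentest(target):
    """Accept a request, schedule the job, and return without waiting for it."""
    task_id = str(uuid.uuid4())
    # submit() returns a Future at once; the work runs on a pool thread.
    tasks[task_id] = executor.submit(run_pentest, task_id, target)
    return {"task_id": task_id, "status": "started"}
```

The caller gets its `task_id` back right away and can later look up the stored `Future`, which is what keeps the HTTP round trip short.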

Step 2: Orchestrator startup

File: agents/orchestrator.py:45-82

The Orchestrator class is loaded and initializes most of its components lazily:

    class Orchestrator:
        def __init__(self):
            self.llm = get_llm()            # LLM client
            self.skill_loader = None        # skill loader (lazy)
            self.poc_generator = None       # PoC generator (lazy)
            self.hexstrike_client = None    # MCP client (lazy)

        def _init_components(self):
            if self.skill_loader is None:
                from agents.skill_loader import get_skill_loader
                self.skill_loader = get_skill_loader()
                self.all_skills = self.skill_loader.list_skills()

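Key points:

● The LLM is loaded in __init__ (it is needed every time)

● Skill loading, PoC generation, and the MCP client are lazy (loaded on first use, so startup is fast)

● Global singleton pattern: get_skill_loader() is called only once in the whole process

The benefit of lazy initialization: unit tests do not have to really connect to the LLM or to ChromaDB. A mock is enough.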
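A minimal sketch of the lazy-initialization and singleton points, under stated assumptions: `SkillLoader` here is a stand-in class, the module-level `_skill_loader` cache is a guess at how `get_skill_loader()` might be implemented, and the `llm` constructor argument is added only so a test can pass a fake (the original calls `get_llm()` instead).

```python
_skill_loader = None  # module-level cache; hypothetical implementation detail


class SkillLoader:
    """Stand-in loader; the real one presumably reads skill definitions from disk."""

    def list_skills(self):
        return ["sqli-sql-injection"]


def get_skill_loader():
    """Build the loader on the first call, then reuse the same instance."""
    global _skill_loader
    if _skill_loader is None:
        _skill_loader = SkillLoader()
    return _skill_loader


class Orchestrator:
    def __init__(self, llm=None):
        self.llm = llm              # injected here so tests can pass a fake
        self.skill_loader = None    # filled in lazily
        self.all_skills = []

    def _init_components(self):
        # Nothing heavy happens until a component is first needed.
        if self.skill_loader is None:
            self.skill_loader = get_skill_loader()
            self.all_skills = self.skill_loader.list_skills()
```

Two `Orchestrator` instances end up sharing one loader, and constructing an instance touches nothing external until `_init_components()` runs.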

Step 3: Recon phase

File: agents/recon_agent.py

The task enters the RECON phase and calls HexStrikeClient in agents/recon_agent.py:

    import requests

    class HexStrikeClient:
        def __init__(self, base_url=settings.hexstrike_server_url):
            self.base_url = base_url

        def execute_command(self, command, category="general"):
            # POST the command to the MCP server and return the decoded JSON reply
            return requests.post(
                f"{self.base_url}/api/command",
                json={"command": command, "category": category},
                timeout=180,
            ).json()

Hati issues 3 commands, the first two to the Kali VM:

    # Command 1: nmap port scan (nmap takes a host name, not a URL)
    nmap -sV -p- testphp.vulnweb.com
    # Returns: {"ports": [{"port": 80, "service": "http"}, {"port": 3306, "service": "mysql"}]}

    # Command 2: httpx tech-stack detection
    httpx -tech-detect -url http://testphp.vulnweb.com
    # Returns: {"tech": ["PHP", "Apache", "MySQL"]}

    # Command 3: requests fetches the home page (host side; the tools can't do this)
    GET http://testphp.vulnweb.com HTTP/1.1
    # Returns: HTML text

The results are written into state:

    state["page_info"] = {
        "url": "http://testphp.vulnweb.com",
        "is_login_page": False,
        "title": "Test Page",
        "status": 200,
        "headers": {...},
    }
    state["page_content"] = "..."  # first 5000 characters of the HTML text
    state["open_ports"] = [80, 3306]
    state["tech"] = ["php", "apache", "mysql"]

Host-side requests: note that requests is a Python library called directly in the host process (a third-party package, not part of the standard library), so it does not go through MCP. A simple GET is not worth a round trip through the VM. Only complex tool calls (sqlmap, nuclei) go through MCP.
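As an illustration of how the three recon results could be merged into that state layout, here is a small sketch. The function name `build_recon_state`, the shape of the `page` argument, and the password-field heuristic for `is_login_page` are all assumptions; the original does not show how these fields are computed.

```python
def build_recon_state(nmap_result, httpx_result, page, max_chars=5000):
    """Merge the three recon results into the state layout shown above."""
    html = page["html"]
    return {
        "page_info": {
            "url": page["url"],
            # Assumed heuristic: a password input suggests a login page.
            "is_login_page": 'type="password"' in html.lower(),
            "title": page.get("title", ""),
            "status": page["status"],
            "headers": page.get("headers", {}),
        },
        # Keep only the first max_chars characters to bound prompt size.
        "page_content": html[:max_chars],
        "open_ports": [p["port"] for p in nmap_result.get("ports", [])],
        # Lowercase the tech names, matching the state example above.
        "tech": [t.lower() for t in httpx_result.get("tech", [])],
    }
```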

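Step 4: The ReAct loop

File: agents/orchestrator.py:82-400

This is the brain of the whole project. It has two parts:

4.1 Think phase: the LLM decides

File: agents/orchestrator.py:82-200

When think() is called, it does the following:

1. Packs the target information, tech stack, current phase, and user intent into a prompt
2. Calls the LLM
3. Receives the LLM's decision as JSON

Core prompt structure:

    user_prompt = f"""## Task information
    Target: {target}
    Current phase: {current_phase}
    User intent: {user_intent}

    ## Page analysis
    - URL: {page_info['url']}
    - Tech stack: {tech}
    - Open ports: {open_ports}

    ## Decision options
    1. execute_tool:  call an MCP tool
    2. execute_skill: run an attack skill
    3. query_rag:     query the RAG knowledge base
    4. generate_poc:  generate a PoC for verification
    5. complete:      task finished

    Output JSON: {{"action": "...", "reasoning": "...", ...}}"""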

Actual LLM output:

    {
      "action": "execute_skill",
      "skill_name": "sqli-sql-injection",
      "target_info": {
        "url": "http://testphp.vulnweb.com/artists.php",
        "parameter": "artist"
      },
      "reasoning": "The target has port 3306 open and runs PHP+MySQL, so test SQL injection first"
    }

JSON snippet of real LLM output (copied from the log)
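The Think phase can be sketched as a prompt builder plus a validated model call. This is an assumption-heavy illustration: `build_prompt`, `think`, and `ALLOWED_ACTIONS` are hypothetical names, `llm` is any callable that maps a prompt string to a response string, and rejecting actions outside the five prompt options is a design choice of this sketch, not something the original states.

```python
import json

# The five actions offered in the prompt's decision options.
ALLOWED_ACTIONS = {"execute_tool", "execute_skill", "query_rag", "generate_poc", "complete"}


def build_prompt(state):
    """Pack target, phase, intent, and recon results into one prompt string."""
    return (
        "## Task information\n"
        f"Target: {state['target']}\n"
        f"Current phase: {state['current_phase']}\n"
        f"User intent: {state['user_intent']}\n\n"
        "## Page analysis\n"
        f"- Tech stack: {state['tech']}\n"
        f"- Open ports: {state['open_ports']}\n\n"
        'Output JSON: {"action": "...", "reasoning": "..."}'
    )


def think(state, llm):
    """Ask the model for a decision; `llm` is any callable str -> str."""
    decision = json.loads(llm(build_prompt(state)))
    if decision.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {decision.get('action')!r}")
    return decision
```

Passing the model in as a callable keeps the function testable with a lambda that returns canned JSON.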

4.2 Act phase: executing the decision

File: agents/orchestrator.py:333-400

act() receives the decision and dispatches on its action field:

    def act(self, state, decision):
        action = decision["action"]

        if action == "execute_tool":
            state = self._do_attack_surfaces_discovery(state, decision)
        elif action == "execute_skill":
            state = self._do_skill_based_attack(state, decision)
        elif action == "query_rag":
            state = self._do_rag_poc_attack(state)
        elif action == "generate_poc":
            state = self._do_generate_and_test_poc(state)
        elif action == "complete":
            state = advance_phase(state, PentestPhase.COMPLETE)
        return state
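The same dispatch can also be written as a lookup table, which makes the handling of an unknown action explicit. This is an alternative sketch, not the project's code: `make_dispatcher` and the `errors` list on the state are hypothetical.

```python
def make_dispatcher(handlers):
    """Return an act() that routes a decision to the handler for its action."""

    def act(state, decision):
        action = decision.get("action")
        handler = handlers.get(action)
        if handler is None:
            # Unknown action: record it and leave the rest of the state untouched.
            state.setdefault("errors", []).append(f"unhandled action: {action}")
            return state
        return handler(state, decision)

    return act
```

A table like this matters here because the keyword fallback in the parser can produce action names such as auth_bypass that the if/elif chain above does not list.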

An interesting detail: decision parsing is fault tolerant.

File: agents/orchestrator.py:312-331

    import json
    import re

    def _parse_decision(self, response):
        # Try to parse JSON first
        json_match = re.search(r'\{.*\}', response, re.DOTALL)
        if json_match:
            try:
                return json.loads(json_match.group())
            except json.JSONDecodeError:
                pass

        # JSON failed -> fall back to keywords
        response_lower = response.lower()
        if "auth_bypass" in response_lower:
            return {"action": "auth_bypass", "reasoning": "authentication testing"}
        elif "skill_based" in response_lower:
            return {"action": "skill_based_attack", "reasoning": "skill-based attack"}
        # ...

Why the fault tolerance? The LLM occasionally replies in natural language, such as "I suggest you use X", instead of strict JSON. Falling back to keyword matching at least keeps the run going, so one bad reply does not crash the whole task.
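One limitation of the greedy pattern `\{.*\}` is that it spans from the first `{` to the last `}`, so a reply containing valid JSON followed by another stray brace pair fails to parse. A possible variation, shown here as a sketch and not as the project's method, uses `json.JSONDecoder.raw_decode` to try each `{` in turn. The function name, the `FALLBACK_KEYWORDS` table, and returning `None` when nothing matches are assumptions.

```python
import json

# Keyword -> action, mirroring the two fallbacks shown in the original excerpt.
FALLBACK_KEYWORDS = {
    "auth_bypass": "auth_bypass",
    "skill_based": "skill_based_attack",
}


def parse_decision(response):
    """Return the first JSON object found in `response`, else a keyword guess."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(response):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and ignores the rest.
            obj, _ = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj

    lowered = response.lower()
    for keyword, action in FALLBACK_KEYWORDS.items():
        if keyword in lowered:
            return {"action": action, "reasoning": "keyword fallback"}
    return None  # caller decides what to do when nothing could be parsed
```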

4.3 The three action paths in detail

Path A: execute_skill

    # The Orchestrator hands the LLM decision to SkillLoader for matching
    matched = self.skill_loader.match_skills(state["page_info"])
    # Returns: [{"name": "sqli-sql-injection", "score": 5}, ...]

    # The LLM then adapts the skill template into a concrete HTTP request
    adapted = self._llm_adapt_skill_to_target(skill_content, target)
    # Template:  "append ' OR '1'='1 to the id parameter"
    # Rewritten: "GET /artists.php?artist=1' OR '1'='1"

    # Actually send the request
    response = requests.get(adapted)
    # Check the response for SQL error keywords
    if "SQL syntax" in response.text or "MySQL" in response.text:
        vuln = {"type": "SQL Injection", "severity": "high", ...}
        state["vulnerabilities"].append(vuln)
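The error-keyword check at the end of Path A can be isolated into a small helper. The sketch below is illustrative only: the marker list is a short sample of well-known database error strings, not the project's list, and `find_sql_error` and `record_if_vulnerable` are hypothetical names. It matches on full error phrases because a bare "MySQL" can appear on a harmless page.

```python
# Illustrative markers only; a real list would be longer.
SQL_ERROR_MARKERS = (
    "you have an error in your sql syntax",
    "warning: mysql",
    "unclosed quotation mark",
)


def find_sql_error(body):
    """Return the first error marker present in the response body, or None."""
    lowered = body.lower()
    for marker in SQL_ERROR_MARKERS:
        if marker in lowered:
            return marker
    return None


def record_if_vulnerable(state, url, parameter, body):
    """Append a finding to state['vulnerabilities'] when an error marker matches."""
    marker = find_sql_error(body)
    if marker is None:
        return False
    state.setdefault("vulnerabilities", []).append({
        "type": "SQL Injection",
        "severity": "high",
        "url": url,
        "parameter": parameter,
        "evidence": marker,
    })
    return True
```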

Path B: execute_tool

    # Call the MCP tool directly
    result = self.hexstrike_client.execute_command(
        "nuclei -u http://target.com -t cves/",
        category="scanner",
    )
    # MCP runs nuclei inside the VM and returns the result
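Because the target string ends up inside a command line, it may be worth quoting each argument before the command is sent. The sketch below assumes the server passes the command through a shell, which the original does not confirm; `build_command_payload` is a hypothetical helper that only builds the JSON body used by `execute_command`.

```python
import shlex


def build_command_payload(tool, args, category="general"):
    """Build the JSON body for POST /api/command with each argument shell-quoted."""
    # shlex.join quotes any argument containing shell metacharacters.
    command = shlex.join([tool, *args])
    return {"command": command, "category": category}
```

Ordinary URLs pass through unchanged, while an argument containing a character such as `;` comes out wrapped in single quotes.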

Path C: query_rag

    # Query the RAG knowledge base for related PoCs
    results = self.rag.query("Apache Struts2 RCE", n_results=3)
    # Returns: the 3 most similar historical PoCs

    # The LLM rewrites the PoC to fit the target
    poc = self._llm_adapt_and_test_poc(results, state["target"])

    # Actually send the request to verify
    response = requests.post(poc["url"], data=poc["payload"])
    if poc["success_indicator"] in response.text:
        vuln = {...}
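To make "the 3 most similar" concrete, here is a toy top-k retrieval that uses bag-of-words cosine similarity. It is a deliberately simplified stand-in: the real knowledge base (the article mentions ChromaDB) would use learned embeddings, and this `query` function is hypothetical and unrelated to the project's `self.rag.query`.

```python
import math
from collections import Counter


def _vector(text):
    """Turn text into a word-count vector."""
    return Counter(text.lower().split())


def _cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def query(documents, text, n_results=3):
    """Return the n_results documents most similar to `text`, best first."""
    q = _vector(text)
    ranked = sorted(documents, key=lambda d: _cosine(q, _vector(d)), reverse=True)
    return ranked[:n_results]
```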

All three paths converge on the same point: each appends the vulnerabilities it finds to state["vulnerabilities"].

Step 5: Recording vulnerabilities + pushing progress (runs throughout the ReAct loop)

Files: state/progress_tracker.py + api/websocket.py

Every time a vulnerability is found or the phase changes, the system:

1. Writes a progress entry to Redis (progress_tracker.py)
2. Pushes it to the frontend over WebSocket (websocket.py)

WebSocket message types:

    {type: "progress", phase: "vuln_scan", step: "Running skill sqli"}
    {type: "vuln_found", vulnerability: {...}}
    {type: "ai_token", content: "I decide"}
    {type: "complete", report_url: "/reports/xxx.md"}

Real WebSocket message stream (captured from the browser devtools network panel)
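A sketch of how such messages might be produced from one place. `ProgressTracker` here is a hypothetical class, and its `sink` callable stands in for both the Redis write and the WebSocket send, neither of which is reproduced.

```python
import json


class ProgressTracker:
    """Builds progress messages; `sink` stands in for the Redis write and WebSocket send."""

    def __init__(self, sink):
        self.sink = sink

    def _emit(self, message):
        # Serialize once so every consumer sees the same JSON text.
        self.sink(json.dumps(message, ensure_ascii=False))

    def progress(self, phase, step):
        self._emit({"type": "progress", "phase": phase, "step": step})

    def vuln_found(self, vulnerability):
        self._emit({"type": "vuln_found", "vulnerability": vulnerability})

    def complete(self, report_url):
        self._emit({"type": "complete", "report_url": report_url})
```

In a test, passing `frames.append` as the sink collects the frames in a list for inspection.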

Step 6: Report generation (after ReAct ends)

File: agents/report_agent.py (512 lines)

Once ReAct has run its full 8 rounds or the LLM outputs "complete", the task enters the REPORT phase.
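The two exit conditions can be sketched as a bounded loop. `run_react`, the `rounds` counter, and the `"report"` phase label are hypothetical; `think` and `act` are passed in as plain callables so the control flow can be read on its own.

```python
def run_react(state, think, act, max_rounds=8):
    """Alternate think/act until the model says 'complete' or the round cap is hit."""
    for round_no in range(1, max_rounds + 1):
        decision = think(state)
        state["rounds"] = round_no
        if decision.get("action") == "complete":
            break
        state = act(state, decision)
    # Either exit path leads to the report phase.
    state["phase"] = "report"
    return state
```

The round cap guards against a model that never returns "complete".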

At this point state["vulnerabilities"] looks like this:

    {
      "vulnerabilities": [
        {
          "id": "vuln_001",
          "name": "SQL Injection in artist parameter",
          "severity": "high",
          "cvss_score": 8.5,
          "target": "http://testphp.vulnweb.com",
          "url": "/artists.php?artist=1",
          "parameter": "artist",
          "evidence": {
            "request": "GET ...?artist=1' OR '1'='1",
            "response_snippet": "You have an error in your SQL syntax"
          },
          "poc_path": "reports/20260601_xxx_poc_001.txt",
          "status": "confirmed"
        }
      ]
    }

ReportAgent calls the LLM once to turn the structured data into Markdown.

Output: reports/{date}_{time}_{target}_{task_id}.{md,json}, in both formats.
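A sketch of the naming scheme and of a template-based fallback renderer. Both functions are hypothetical: how the project sanitizes the target for the filename is not shown, so the regex below is an assumption, and `render_markdown` deliberately avoids the LLM call that ReportAgent makes.

```python
import re
from datetime import datetime


def report_paths(target: str, task_id: str, now: datetime):
    """Return the Markdown and JSON report paths for one task."""
    # Assumption: the target is reduced to filename-safe characters.
    safe_target = re.sub(r"[^A-Za-z0-9.-]+", "_", target).strip("_")
    stem = f"reports/{now:%Y%m%d}_{now:%H%M%S}_{safe_target}_{task_id}"
    return f"{stem}.md", f"{stem}.json"


def render_markdown(target, vulnerabilities):
    """Render findings as a small Markdown report without calling an LLM."""
    lines = [f"# Penetration test report: {target}", ""]
    for vuln in vulnerabilities:
        lines.append(f"## {vuln['id']}: {vuln['name']}")
        lines.append(f"- Severity: {vuln['severity']}")
        lines.append(f"- URL: {vuln['url']}")
        lines.append("")
    return "\n".join(lines)
```

A deterministic renderer like this could serve as a fallback when the LLM call fails, since the JSON side of the report needs no model at all.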

Step 7: WebSocket pushes the completion message

The last WebSocket frame:

    {
      type: "complete",
      report_md_url: "/reports/20260601_143022_target_a1b2c3d4.md",
      report_json_url: "/reports/20260601_143022_target_a1b2c3d4.json",
      duration_seconds: 372,
      vulnerabilities_count: 3
    }

When the Web UI receives it, it shows the report links and the task ends.

Flow sequence diagram

The complete timeline of one task

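Disclaimer:

The programs and techniques described in this article are intended only for lawful, compliant security research and teaching. Their purpose is to improve network security defenses, and they are technical research in nature.

Any organization or individual that uses this content without authorization for attacks, sabotage, or other illegal purposes bears sole responsibility for all resulting legal liability, civil compensation, and joint liability; this site bears no joint liability.

This site publishes content for technical exchange and knowledge sharing. If there is a copyright infringement or any other objection, please contact the site by email.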

Reposted from: TtTeam, "Chapter 4 – Reborn as an AI: Dissecting the Main Path of One Task, from Input to Report"