2026-05-03 04:52:45 Cybersecurity articles. Source: ZONE.CI Global Network

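Article summary: By comparing the Mozilla/Anthropic Mythos results with Hacktron's vulnerability-hunting practice, this article analyzes the real bottlenecks of automated vulnerability discovery systems. It notes that frontier models show strong capability on complex targets such as Firefox, but commercial scanning has to care more about the cost-recall balance. Using two oauth2-proxy authentication bypass vulnerabilities as case studies, it stresses that a system must combine model capability, context construction, search coverage, and a validation loop rather than rely on model strength alone.
Overall score: 87
Categories: vulnerability analysis, AI security, security tools, solutions, security operations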


The Model Is Not Everything: What Mythos, Hacktron, and the oauth2-proxy 0-days Reveal About the Real Bottlenecks of Automated Vulnerability Discovery Systems

做安全的小明同学

大山子雪人

May 2, 2026, 08:53, Beijing


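Subtitle: In-depth comparative study notes on Hacktron's "Why Mythos Doesn't Matter (for us)"
Core topics: Mythos / Claude / Hacktron / oauth2-proxy / context engineering / cost-recall / validation loop / vulnerability-hunting agent architecture
Document type: multi-source synthesis, in-depth comparison, architecture study
Updated: 2026-05-02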

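0. Reading goals

This document is neither a repost of the Hacktron article nor an item-by-item excerpt of its references. It merges the Hacktron article and the material it cites into a security research framework that can be studied and reused.

It focuses on four questions:

1. What have frontier models such as Mythos / Claude actually demonstrated in vulnerability discovery?
2. Why does Hacktron argue that "Mythos doesn't matter" for its own use case?
3. What kind of context engineering problem do the two oauth2-proxy authentication bypasses illustrate?
4. How should a vulnerability-hunting agent such as OpenClaw turn this material into system design?

In one sentence:

Taken together, these sources show that future vulnerability-hunting capability comes not only from stronger models but from the system-level combination of model capability, context construction, search coverage, cost control, evidence validation, and constraints drawn from human experience.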

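1. Overview of sources

2. Event timeline

| Date | Event | Key implication |
| --- | --- | --- |
| 2026-03-06 | Anthropic publishes its Mozilla / Firefox collaboration post, stating that Claude Opus 4.6 found 22 Firefox vulnerabilities in two weeks, 14 of which Mozilla rated high severity | Shows that frontier models can already find high-value vulnerabilities in a complex browser codebase |
| 2026-04-14 | oauth2-proxy releases v7.15.2, fixing multiple security issues including two critical authentication bypasses | The two ground-truth vulnerabilities in Hacktron's benchmark enter public advisories |
| 2026-04-21 | Mozilla publishes The zero-days are numbered, stating that Firefox 150 includes fixes for 271 vulnerabilities found in the initial evaluation of Mythos Preview | Mythos further raises the sense of scale of AI-assisted vulnerability discovery |
| 2026-04-29 | Hacktron publishes Why Mythos doesn't matter (for us) | Reframes model selection from the perspective of productization, unattended operation, and cost-recall |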

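3. Three threads: Mozilla / Anthropic / Hacktron

3.1 The Mozilla thread: AI gives defenders a chance to systematically clear the backlog of historical vulnerabilities

The core narrative of the Mozilla post is that the Firefox team has been using frontier AI models since February 2026 to find and fix latent security vulnerabilities in the browser. Mozilla states that Firefox 150 includes fixes for 271 vulnerabilities found during the initial evaluation of Claude Mythos Preview; in the earlier collaboration, a scan of Firefox 148 with Opus 4.6 led to fixes for 22 security-sensitive bugs.

Mozilla's key points include:

• Security has traditionally been a tug-of-war between attack and defense.
• An attacker needs only one weak point, while defenders must cover a large attack surface.
• Fuzzing is effective, but its coverage is uneven.
• Top security researchers can find problems that fuzzing struggles to reach by reasoning over source code, but that skill is scarce and expensive.
• Frontier AI models are beginning to show source-code reasoning close to that of an elite human researcher.
• Mozilla has not seen a vulnerability class or level of complexity that top human researchers can find and the models cannot.

The lesson here is not that "AI can do anything", but rather:

When the model is strong enough, and the target codebase, maintainer team, fix process, and validation capability are all in place, AI can significantly compress the time it takes to discover vulnerabilities.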

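3.2 The Anthropic thread: from reproducing historical CVEs to discovery in a real, complex codebase

Anthropic's post on the Firefox security collaboration highlights several facts:

• Claude Opus 4.6 found 22 Firefox vulnerabilities in two weeks.
• Mozilla rated 14 of them high severity.
• Firefox was chosen as the target because it is complex, long-tested open source software with a very large user base.
• Anthropic first built an evaluation set from historical Firefox CVEs to test whether the model could reproduce known vulnerabilities, and only then moved on to discovering unknown ones.
• During the collaboration, Mozilla helped decide which findings were worth filing as bug reports and ultimately fixed the issues.

This thread shows that:

High-quality AI vulnerability research is not as simple as "the model outputs vulnerabilities". It is a full collaborative process that includes a historical CVE benchmark, scanning of real code, maintainer triage, vulnerability reports, and the release of fixes.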

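3.3 The Hacktron thread: for unattended commercial scanning, the core metric is cost-to-signal, not single-run model capability

The Hacktron article acknowledges that frontier models are very strong, especially when an expert human operator is involved, the target is well defined, the harness is well designed, and a researcher can correct course midway.

Hacktron, however, cares about a different scenario:

• There is no skilled operator.
• Nobody tells the agent where to look.
• Nobody corrects it dynamically.
• It has to run continuously against large numbers of ordinary codebases, PRs, or web applications.
• Cost grows quickly with scale.

Hacktron therefore argues:

For 99% of applications, running a smaller model repeatedly may be a more reasonable cost-recall trade-off than running a frontier model once.

The point is not to dismiss large models. It is to switch the evaluation metric from "strongest single-run capability" to:

probability of finding a real vulnerability per unit cost × code range covered per unit time × signal quality acceptable to users, discounted by downstream validation cost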

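4. Core comparison: three modes of vulnerability hunting

5. The key chain of argument in the Hacktron article

Hacktron's argument can be broken into 7 steps:

1. The results of Mythos / Claude on complex targets such as Firefox are real and important.
2. But those results usually involve a skilled operator or a high-quality maintainer team.
3. Hacktron's goal is unattended, continuous, productized vulnerability scanning.
4. In that setting, the cost of a frontier model grows quickly.
5. LLM vulnerability discovery is inherently non-deterministic; even a strong model does not hit 100% of the time.
6. If a small model is cheap enough, repeated runs can raise coverage and recall.
7. Therefore, for most applications, an optimized workflow plus repeated runs of a small model may be more cost-effective than relying on Mythos.

The pivot of the argument is step 5:

As long as large models are not deterministic hits either, "multiple runs + multi-strategy search" has engineering value.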
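To make steps 5 and 6 concrete, here is a minimal sketch of the arithmetic behind repeated runs. It assumes every run hits a given vulnerability independently with the same probability `p`, which repeated runs of the same prompt will only approximate. The function names and the 30% hit rate are illustrative assumptions and do not come from the Hacktron article.

```python
import math


def hit_probability(p: float, runs: int) -> float:
    """Probability that at least one of `runs` independent runs finds the bug."""
    return 1.0 - (1.0 - p) ** runs


def runs_needed(p: float, target: float) -> int:
    """Smallest number of independent runs whose combined hit rate reaches `target`."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


# Hypothetical numbers: a cheap model with a 30% per-run hit rate.
small_model_runs = runs_needed(p=0.3, target=0.9)
combined = hit_probability(0.3, small_model_runs)
```

Under these assumptions, whether the repeated small-model runs are cheaper than one frontier run depends only on the per-run price ratio, which is the cost-recall trade-off the argument turns on.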

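6. The oauth2-proxy benchmark: why this case matters

Hacktron uses oauth2-proxy v7.15.0 as the benchmark target, with two real 0-days as ground truth:

1. Finding A / CVE-2026-34457 / GHSA-5hvv-m4w4-gf6v: Health Check User-Agent Matching Bypasses Authentication in auth_request Mode
2. Finding B / CVE-2026-40575 / GHSA-7x63-xv5r-3p2x: Authentication Bypass via X-Forwarded-Uri Header Spoofing

The value of this benchmark is that:

• Neither vulnerability is a simple syntax-level bug.
• Both depend on deployment configuration.
• Both sit at the trust boundary between OAuth2 Proxy and the reverse proxy / upstream service.
• Both require understanding the role the authentication proxy plays in the overall architecture.
• Both are security semantics bugs rather than plain memory safety bugs.

That makes it a particularly good test target for a vulnerability-hunting agent:

It requires the agent not just to read code, but to understand configuration, deployment modes, HTTP header trust boundaries, auth_request semantics, and attacker-controlled input.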

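7. Finding A in depth: authentication bypass via the health check User-Agent

7.1 Vulnerability facts

The GitHub Advisory for GHSA-5hvv-m4w4-gf6v describes a configuration-dependent authentication bypass in OAuth2 Proxy. The affected conditions include:

• OAuth2 Proxy is used in an auth_request-style integration, for example nginx auth_request.
• --ping-user-agent is configured, or --gcp-healthchecks is enabled.

In an affected configuration, OAuth2 Proxy treats a request carrying the specific health check User-Agent as a successful health check regardless of the request path. An attacker can send the same User-Agent so that OAuth2 Proxy returns success, which in auth_request mode bypasses authentication and grants access to protected upstream resources.

7.2 The semantic chain that makes the vulnerability real

Seeing User-Agent == GoogleHC/1.0 in the code is not enough to reach the conclusion. The following semantic chain is required:

Attacker-controlled request
  ↓
Attacker sets User-Agent to the health check UA
  ↓
OAuth2 Proxy returns success in the health check branch
  ↓
In standalone reverse proxy mode, this may be nothing more than health check behavior
  ↓
But in nginx auth_request / middleware mode, the 2xx/200 response is interpreted by the fronting reverse proxy as "authentication passed"
  ↓
The fronting reverse proxy lets the original request through to the protected resource
  ↓
Authentication bypass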
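The chain above can be replayed in a few lines. This is a toy model, not oauth2-proxy's code: the `Request` type, the function names, the `/ping` path, and the 202 status for a valid session are assumptions made for this sketch, and the path-bound variant is one possible hardening rather than a description of the upstream fix.

```python
from dataclasses import dataclass


@dataclass
class Request:
    path: str
    user_agent: str
    has_valid_session: bool = False


HEALTH_CHECK_UA = "GoogleHC/1.0"  # health-check User-Agent mentioned in the text
PING_PATH = "/ping"               # assumed health-check path for this sketch


def auth_status_ua_only(req: Request) -> int:
    """Flawed pattern: a matching User-Agent short-circuits to 200 on any path."""
    if req.user_agent == HEALTH_CHECK_UA:
        return 200
    return 202 if req.has_valid_session else 401


def auth_status_path_bound(req: Request) -> int:
    """Hardened pattern: the shortcut only applies on the health-check path."""
    if req.user_agent == HEALTH_CHECK_UA and req.path == PING_PATH:
        return 200
    return 202 if req.has_valid_session else 401


def fronting_proxy_allows(status: int) -> bool:
    """auth_request-style semantics: any 2xx subrequest result means 'authenticated'."""
    return 200 <= status < 300


# Unauthenticated attacker requests /admin with the health-check User-Agent.
attack = Request(path="/admin", user_agent=HEALTH_CHECK_UA)
bypassed = fronting_proxy_allows(auth_status_ua_only(attack))
blocked = not fronting_proxy_allows(auth_status_path_bound(attack))
```

Neither function is wrong in isolation; the bypass only appears once `fronting_proxy_allows` is composed with the User-Agent-only check, which is the cross-component point of this section.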

7.3 Why context determines whether the bug can be found

The Hacktron article singles out Finding A: if the prompt contains no deployment-mode context, the model easily judges the health check logic to be a harmless dead end, because in standalone mode a successful health check response does not necessarily constitute a vulnerability.

The vulnerability only becomes easy for the model to understand when the context contains the following information:

component_role:
  value: OAuth2 Proxy may run as middleware in existing infrastructure
  required_for_vulnerability: true

delegated_auth_semantics:
  value: reverse proxy may treat 2xx response from auth_request as auth success
  required_for_vulnerability: true

attacker_control:
  value: client can influence User-Agent unless upstream overwrites it
  required_for_vulnerability: true

configuration_condition:
  value:
    - --ping-user-agent configured
    - or --gcp-healthchecks enabled
  required_for_vulnerability: true

impact:
  value: unauthenticated access to protected upstream resources
  required_for_vulnerability: true

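7.4 Lessons for the agent

Finding A shows that:

A vulnerability discovery agent must distinguish "local code behavior" from "the security semantics that emerge once components are deployed together".

If the agent only analyzes the function locally, what it may see is:

The health check returns 200. Normal.

If the agent can see the system-level semantics, what it should see is:

The health check's 200 response is treated by another component as an authentication success signal, and the attacker can forge the triggering condition, so this is a cross-component authentication bypass.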

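8. Finding B in depth: X-Forwarded-Uri header spoofing

8.1 Vulnerability facts

GHSA-7x63-xv5r-3p2x describes another configuration-dependent authentication bypass. The affected conditions include:

• OAuth2 Proxy is configured with --reverse-proxy.
• At least one --skip-auth-route rule is defined, or the legacy --skip-auth-regex.

In affected setups, OAuth2 Proxy may trust a client-supplied X-Forwarded-Uri header. An attacker can forge this header so that OAuth2 Proxy evaluates its authentication rules or skip-auth rules against a spoofed path, while the request actually sent upstream targets a different, protected path.

8.2 The semantic chain that makes the vulnerability real

Attacker requests the protected path /admin
  ↓
Attacker adds X-Forwarded-Uri: /public
  ↓
OAuth2 Proxy trusts that header in --reverse-proxy mode
  ↓
skip_auth_routes / skip-auth-regex matches /public and decides authentication can be skipped
  ↓
The real request still goes to /admin or another protected path
  ↓
The path used for the authentication decision differs from the actual upstream path
  ↓
Authentication bypass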
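The mismatch between the decision path and the real path can be sketched as follows. The rule pattern, function names, and the plain-dict header lookup (which ignores header case-insensitivity) are assumptions for illustration; this is not oauth2-proxy's implementation.

```python
import re
from typing import Mapping

SKIP_AUTH_ROUTES = [re.compile(r"^/public(/|$)")]  # hypothetical skip-auth rule


def is_skipped(path: str) -> bool:
    """True when the path matches a rule that skips authentication."""
    return any(rule.search(path) for rule in SKIP_AUTH_ROUTES)


def decision_path_trusting(request_path: str, headers: Mapping[str, str]) -> str:
    """Flawed pattern: prefer X-Forwarded-Uri no matter who sent it."""
    return headers.get("X-Forwarded-Uri", request_path)


def decision_path_checked(
    request_path: str, headers: Mapping[str, str], from_trusted_proxy: bool
) -> str:
    """Only honor X-Forwarded-Uri when the direct peer is a trusted proxy."""
    if from_trusted_proxy and "X-Forwarded-Uri" in headers:
        return headers["X-Forwarded-Uri"]
    return request_path


# The attacker really requests /admin but claims to be asking for /public.
spoofed = {"X-Forwarded-Uri": "/public"}
flawed_skip = is_skipped(decision_path_trusting("/admin", spoofed))
checked_skip = is_skipped(
    decision_path_checked("/admin", spoofed, from_trusted_proxy=False)
)
```

In the flawed variant the rule engine is correct and the regex is correct; the only defect is which string is handed to it, which is why a purely local review of the matching code tends to miss the bug.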

8.3 The key security issue

This vulnerability is a typical forwarded header trust boundary problem:

trusted_boundary:
  expected: X-Forwarded-* headers should come from trusted reverse proxies
  actual: client-supplied X-Forwarded-Uri may be trusted

path_semantics:
  expected: auth decision path == upstream request path
  actual: auth decision path may be attacker-controlled and differ from upstream path

configuration_dependency:
  required:
    - --reverse-proxy
    - --skip-auth-route or --skip-auth-regex

security_property_broken:
  property: authentication decision must be made on the actual requested protected resource

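8.4 The security principle behind the fixes

The fix / mitigation directions mentioned in the advisory include:

• Upgrade to v7.15.2.
• Use --trusted-proxy-ip to state explicitly which proxies may send X-Forwarded-* headers.
• Strip client-supplied X-Forwarded-Uri at the reverse proxy or load balancer layer.
• Explicitly overwrite X-Forwarded-Uri with the actual request URI.
• Restrict direct client access to OAuth2 Proxy.
• Narrow or remove --skip-auth-route / --skip-auth-regex rules.

These measures can be summed up in one sentence:

Do not let unauthenticated clients control the context variables used for authentication decisions.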
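As a sketch of that principle at the proxy layer, the function below drops every X-Forwarded-* header unless the direct peer is on an allowlist of trusted proxies. The function name, its parameters, and the CIDR allowlist are hypothetical; it illustrates the idea and is not how oauth2-proxy or any particular reverse proxy implements it.

```python
import ipaddress
from typing import Dict, Iterable

FORWARDED_PREFIX = "x-forwarded-"


def sanitize_forwarded_headers(
    headers: Dict[str, str], peer_ip: str, trusted_proxies: Iterable[str]
) -> Dict[str, str]:
    """Drop X-Forwarded-* headers unless the direct peer is a trusted proxy."""
    peer = ipaddress.ip_address(peer_ip)
    trusted = any(peer in ipaddress.ip_network(cidr) for cidr in trusted_proxies)
    if trusted:
        return dict(headers)
    # Untrusted peer: keep only headers that cannot steer the auth decision path.
    return {
        name: value
        for name, value in headers.items()
        if not name.lower().startswith(FORWARDED_PREFIX)
    }
```

For example, calling it with `peer_ip="203.0.113.7"` and `trusted_proxies=["10.0.0.0/8"]` removes a client-supplied `X-Forwarded-Uri`, while the same headers arriving from `10.1.2.3` pass through unchanged.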

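9. Comparing the two oauth2-proxy vulnerabilities

Together, the two vulnerabilities show that:

A vulnerability does not have to hide in a complex algorithm. It can hide in the semantic gap of "how different components interpret the same signal".

For a vulnerability-hunting agent, the most important thing is not just to tag sources and sinks, but to ask:

Does an input field take part in an authentication decision?
Does that field come from the attacker?
Should that field have been generated by a trusted component?
Does the meaning of the field shift between components?
Does the configuration open up a dangerous path?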

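10. Precision and recall: why Hacktron focuses on a recall benchmark

Hacktron's benchmark first runs steps 1-3 of the workflow to produce the initial state, then uses different models to perform vulnerability analysis and deduplication. The article states explicitly that this phase mainly evaluates model recall, because the validation step has not yet run.

10.1 Recall

Recall measures how many of the real vulnerabilities the system finds:

Recall = TP / (TP + FN)

In the Hacktron benchmark, the two known oauth2-proxy 0-days are the ground truth. The number of times a model hits Finding A / Finding B across repeated runs is an approximate observation of recall.

10.2 Precision

Precision measures how many of the reported findings are real:

Precision = TP / (TP + FP)

The Hacktron article defers precision to the later validation step, because in the real product what users finally see are high-signal findings that have been validated, deduplicated, and scored.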
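The two formulas can be applied to a single discovery run as follows. The finding identifiers and the noisy run output are invented for illustration; only the definitions of precision and recall come from the text.

```python
from typing import Set, Tuple


def precision_recall(reported: Set[str], ground_truth: Set[str]) -> Tuple[float, float]:
    """Compute (precision, recall) for one run against a known ground truth."""
    tp = len(reported & ground_truth)   # real vulnerabilities that were reported
    fp = len(reported - ground_truth)   # reported items that are not real
    fn = len(ground_truth - reported)   # real vulnerabilities that were missed
    precision = tp / (tp + fp) if reported else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    return precision, recall


# Hypothetical discovery-stage output: one true hit buried among three noisy candidates.
ground_truth = {"finding_a", "finding_b"}
run_output = {"finding_a", "candidate_x", "candidate_y", "candidate_z"}
precision, recall = precision_recall(run_output, ground_truth)
```

A run like this one has poor precision but has already recovered half of the ground truth, which is why a discovery stage is better judged on recall and the noise left to a later validation stage.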

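10.3 Implications for a vulnerability-hunting agent

This distinction matters a great deal:

Discovery stage: prefer to propose more candidates and optimize recall.
Validation stage: prune false positives strictly and optimize precision.
Reporting stage: output only vulnerabilities with a complete evidence chain.

If all three goals are mixed into a single agent run, the model tends to run into two problems:

1. It becomes conservative too early for the sake of precision and misses real vulnerabilities.
2. It emits a large number of suspicious points for the sake of recall and cannot prove any of them in the end.

A mature system should split them apart:

stages:
  candidate_discovery:
    objective: maximize_recall
    acceptable_noise: high

  candidate_validation:
    objective: maximize_precision
    acceptable_noise: low

  exploitability_proof:
    objective: build_complete_evidence_chain
    acceptable_noise: very_low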

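11. Hacktron workflow vs. Claude Code Review workflow

11.1 Hacktron workflow

The workflow described in the Hacktron article includes:

1. Code parsing and call graph construction.
2. Collection and indexing of documentation, configuration, and related material.
3. Code path enrichment, including data flow, authentication boundaries, and control flow features.
4. Assembly of the enriched context into targeted prompts.
5. Deduplication, validation, and scoring of findings.
6. A minimal amount of human triage, kept in order to tune the system and output high-signal results.

11.2 Claude Code Review workflow

The Anthropic Claude Code Review documentation describes a PR review system:

• Multiple specialized agents analyze the diff and surrounding code in parallel.
• Each agent looks for a different class of issue.
• A verification step checks whether each candidate matches actual code behavior.
• Results are deduplicated, ranked by severity, and posted to the PR as inline comments.
• Review focus can be adjusted through CLAUDE.md or REVIEW.md.

11.3 Comparison

11.4 Shared pattern

Both reflect the same trend:

LLM security systems are evolving from "single-agent Q&A" into engineered systems built on "context preprocessing + multi-agent division of labor + candidate validation + result ranking".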

12. Lessons from the Trail of Bits Audit context-building skill

Hacktron cites the Trail of Bits Audit context-building skill to make clear that it did not simply let Claude Code explore freely, but gave it a capability for building security audit context.

For agent architecture, the significance of this kind of skill is:

It turns the preparation a human security researcher does before an audit into a repeatable, callable context-building process.

Typical actions include:

• Understand the project structure.
• Find entry points.
• Find configuration files.
• Find security boundaries.
• Find authentication and authorization logic.
• Find external input.
• Find sensitive operations.
• Produce an audit roadmap.

This lines up closely with the direction in which OpenClaw accumulates experience:

audit_context_builder:
  inputs:
    - repository
    - target_security_theme
    - known_vulnerability_pattern
  outputs:
    - project_map
    - entrypoints
    - auth_boundaries
    - trust_boundaries
    - source_sink_pairs
    - config_dependent_paths
    - evidence_requirements

13. The role of OpenRouter AI Model Rankings: model selection is a routing problem, not a matter of faith

OpenRouter rankings offer a view of model usage and comparison. They do not directly prove that any given model is better suited to vulnerability hunting, but they give an engineering system an important hint:

The model ecosystem changes quickly, so a fixed dependency on one model is not the best long-term strategy.

The Hacktron article notes that Gemini 3.1 Flash Lite performed unusually well and offers two possible explanations:

1. Hacktron's prompts may have unintentionally become friendlier to Flash during development, because low-cost models were often used for debugging.
2. Gemini 3.1 Flash Lite may simply be a very good model.

This suggests that model routing should be dynamic:

model_routing_policy:
  discovery_stage:
    prefer:
      - low_cost
      - high_throughput
      - acceptable_reasoning
      - supports_parallel_runs

  adjudication_stage:
    prefer:
      - strong_reasoning
      - long_context
      - reliable_instruction_following

  validation_stage:
    prefer:
      - tool_use_reliability
      - low_hallucination
      - deterministic_output_format

  exploit_generation_stage:
    prefer:
      - code_generation_quality
      - environment_reasoning
      - test_feedback_iteration

Model selection should not be:

Always use whichever model is strongest.

It should be:

What is the goal of the current stage?
Is the current context complete?
Does the current task need recall, precision, reasoning depth, or low-cost exploration more?
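One way to turn such a policy into code is a small trait-matching router. Everything in the catalogue below (model names, relative costs, trait sets) is a placeholder invented for this sketch, not a benchmark result or a real model list; the stage preferences reuse the trait names from the policy above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ModelProfile:
    name: str
    cost_per_run: float   # relative cost in hypothetical units
    traits: frozenset


# Hypothetical catalogue; names and numbers are placeholders.
CATALOGUE: List[ModelProfile] = [
    ModelProfile("small-fast", 1.0,
                 frozenset({"low_cost", "high_throughput", "supports_parallel_runs"})),
    ModelProfile("large-reasoner", 20.0,
                 frozenset({"strong_reasoning", "long_context",
                            "reliable_instruction_following"})),
    ModelProfile("tool-runner", 5.0,
                 frozenset({"tool_use_reliability", "deterministic_output_format"})),
]

STAGE_PREFERENCES: Dict[str, List[str]] = {
    "discovery": ["low_cost", "high_throughput", "acceptable_reasoning",
                  "supports_parallel_runs"],
    "adjudication": ["strong_reasoning", "long_context",
                     "reliable_instruction_following"],
    "validation": ["tool_use_reliability", "low_hallucination",
                   "deterministic_output_format"],
}


def route(stage: str, catalogue: List[ModelProfile] = CATALOGUE) -> ModelProfile:
    """Pick the model matching the most preferred traits; cheaper wins ties."""
    wanted = set(STAGE_PREFERENCES[stage])
    return max(catalogue, key=lambda m: (len(wanted & m.traits), -m.cost_per_run))
```

Because the catalogue is plain data, a model can be swapped when the ecosystem shifts without touching the stage logic, which is the practical meaning of treating selection as routing.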

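14. The Mythos story and the Hacktron article do not contradict each other

On the surface, Mozilla / Anthropic show that frontier models are very strong, while Hacktron says Mythos "doesn't matter" for it. In fact the two do not conflict.

They can be reconciled as follows:

Mythos shows that the upper bound of model capability is rising quickly.
Hacktron shows that productized vulnerability hunting cannot look only at that upper bound; it also has to look at stable signal output per unit cost.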

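15. In-depth comparison: model capability vs. context engineering

15.1 What model capability solves

Strong models are good at:

• Long-chain reasoning over source code.
• Recognizing similar vulnerabilities from historical patterns.
• Explaining complex control flow.
• Generating candidate vulnerability hypotheses.
• Correlating across files.
• Giving a security interpretation of unusual code patterns.

15.2 What context engineering solves

Context engineering is responsible for:

• Telling the model what the target project is.
• Telling the model what role the component plays in the system architecture.
• Telling the model how configuration affects code paths.
• Telling the model which inputs are attacker-controlled.
• Telling the model which outputs other components interpret as security signals.
• Telling the model which class of vulnerability to look for right now.
• Telling the model which evidence fields must be complete.

15.3 Why context engineering can matter more than model size

Finding A is the typical example. Without deployment-mode context, a strong model may treat the health check as harmless. With complete context, even a small model may understand that health check success becomes an authentication success signal in auth_request mode.

This leads to an important formula:

vulnerability discovery effectiveness = model reasoning capability × context completeness × search strategy × validation capability

If context completeness is close to 0, even the strongest model may end up reasoning about the wrong question.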

16. In-depth comparison: one large-model run vs. many small-model runs

16.1 Advantages of a single large-model run

• Stronger single-pass reasoning.
• More stable on complex causal chains.
• More sensitive to long context and implicit semantics.
• More worth the investment on high-value targets.

16.2 Advantages of many small-model runs

• Low cost.
• Can run in parallel.
• Can explore with multiple strategies.
• Can cover different entry points, vulnerability types, and threat models.
• Can use repetition to offset LLM non-determinism.

16.3 What actually works is not "repeating the same prompt"

Running a low-cost model many times should not just mean asking the same question again. It should become a strategy matrix:

multi_run_strategy:
  dimensions:
    entrypoint_focus:
      - http_handlers
      - middleware
      - config_parsing
      - proxy_headers
      - health_checks

    vulnerability_focus:
      - auth_bypass
      - path_confusion
      - header_spoofing
      - trust_boundary_violation
      - configuration_dependent_behavior

    reasoning_direction:
      - source_to_sink
      - sink_to_source
      - config_to_code_path
      - deployment_mode_to_auth_semantics

    attacker_model:
      - unauthenticated_remote
      - authenticated_low_privilege
      - malicious_reverse_proxy_client
      - direct_access_to_internal_endpoint

Only then do repeated small-model runs become a genuine expansion of search coverage rather than random repetition.
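A strategy matrix like this can be expanded into concrete run configurations with a Cartesian product. The sketch below uses a reduced subset of the dimensions to keep the plan small; `build_run_plan` is a hypothetical helper name, and a real system would likely sample or prioritize combinations rather than run all of them.

```python
from itertools import product
from typing import Dict, List

# Reduced subset of the strategy matrix, chosen for illustration.
DIMENSIONS: Dict[str, List[str]] = {
    "entrypoint_focus": ["http_handlers", "middleware", "proxy_headers", "health_checks"],
    "vulnerability_focus": ["auth_bypass", "header_spoofing"],
    "reasoning_direction": ["source_to_sink", "sink_to_source"],
}


def build_run_plan(dimensions: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Expand the dimensions into one run configuration per combination."""
    keys = list(dimensions)
    return [
        dict(zip(keys, combo))
        for combo in product(*(dimensions[key] for key in keys))
    ]


plan = build_run_plan(DIMENSIONS)
```

Each entry in `plan` differs from every other in at least one dimension, so no two runs ask the model the same question.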

17. In-depth comparison: vulnerability discovery vs. vulnerability proof

Many AI security systems tend to confuse two stages:

Finding a suspicious point ≠ proving that a vulnerability exists

17.1 The discovery stage

The goal is to propose candidate hypotheses:

candidate:
  type: auth_bypass
  suspicious_code: health_check_user_agent_match
  possible_attacker_control: User-Agent
  possible_impact: auth_request success
  confidence: medium

17.2 The proof stage

The goal is to build an evidence chain:

evidence_chain:
  attacker_control:
    field: User-Agent
    proof: reverse proxy forwards client User-Agent to auth subrequest

  vulnerable_condition:
    config:
      - --ping-user-agent
      - auth_request mode
    proof: deployment config or documented integration

  security_decision:
    component: OAuth2 Proxy
    behavior: returns success for matching health-check User-Agent

  cross_component_semantics:
    component: nginx auth_request
    behavior: treats 2xx as authentication success

  impact:
    resource: protected upstream route
    result: unauthenticated access

  reproducibility:
    poc_status: required
    expected_response: upstream protected content without login

17.3 Key implication for OpenClaw

OpenClaw should not treat LLM output as the final verdict. It should treat it as a hypothesis and then enter an evidence-completion state machine:

states:
  HYPOTHESIS_CREATED:
    next: EVIDENCE_GAP_ANALYSIS

  EVIDENCE_GAP_ANALYSIS:
    actions:
      - identify_missing_required_fields
      - classify_missing_fields_as_evidence_or_context
    next: CONTEXT_RETRIEVAL

  CONTEXT_RETRIEVAL:
    actions:
      - inspect_config
      - inspect_docs
      - inspect_reverse_proxy_examples
      - inspect_call_graph
    next: RE_EVALUATE

  RE_EVALUATE:
    actions:
      - rerun_model_with_filled_context
      - challenge_hypothesis
    next: VALIDATION

  VALIDATION:
    actions:
      - static_reachability_check
      - dynamic_repro
      - harness_or_poc
    next: SECURITY_VERDICT
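The transitions of this state machine can be captured in a small table and walked mechanically. The sketch below models only the `next` edges; the per-state actions are omitted, and `walk` is a hypothetical helper name.

```python
from enum import Enum, auto
from typing import Dict, List


class State(Enum):
    HYPOTHESIS_CREATED = auto()
    EVIDENCE_GAP_ANALYSIS = auto()
    CONTEXT_RETRIEVAL = auto()
    RE_EVALUATE = auto()
    VALIDATION = auto()
    SECURITY_VERDICT = auto()


# One entry per `next:` edge; SECURITY_VERDICT is terminal and has no entry.
NEXT_STATE: Dict[State, State] = {
    State.HYPOTHESIS_CREATED: State.EVIDENCE_GAP_ANALYSIS,
    State.EVIDENCE_GAP_ANALYSIS: State.CONTEXT_RETRIEVAL,
    State.CONTEXT_RETRIEVAL: State.RE_EVALUATE,
    State.RE_EVALUATE: State.VALIDATION,
    State.VALIDATION: State.SECURITY_VERDICT,
}


def walk(start: State) -> List[State]:
    """Follow the transitions from `start` until the terminal state."""
    path = [start]
    while path[-1] in NEXT_STATE:
        path.append(NEXT_STATE[path[-1]])
    return path


trace = walk(State.HYPOTHESIS_CREATED)
```

Because a verdict is only reachable through `VALIDATION`, a hypothesis cannot be promoted directly to a finding, which is the property the state machine is meant to enforce.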

18. The vulnerability pattern abstracted from the oauth2-proxy case

The two vulnerabilities can be abstracted into a more general pattern:

pattern: cross_component_auth_semantics_confusion

summary: |
  A component makes an authentication or routing decision using metadata that is
  trusted in one deployment mode but attacker-controlled or semantically different
  in another deployment mode. The decision is then consumed by another component
  as an authorization signal.

required_conditions:
  - component_is_used_as_auth_middleware_or_reverse_proxy
  - security_decision_depends_on_request_metadata
  - metadata_can_be_controlled_or_spoofed_by_attacker
  - downstream_or_upstream_component_interprets_result_as_auth_success
  - protected_resource_is_reachable_without_normal_login

common_sources:
  - User-Agent
  - X-Forwarded-Uri
  - X-Forwarded-Host
  - X-Forwarded-Proto
  - X-Original-URI
  - X-Real-IP
  - Forwarded
  - request path
  - query string
  - fragment-like route representation

common_sinks:
  - auth_request success
  - skip-auth rule match
  - route allowlist
  - upstream header propagation
  - session validation bypass
  - health check success

agent_detection_strategy:
  - enumerate deployment modes
  - enumerate security decision points
  - identify request metadata used in decisions
  - classify metadata provenance
  - compare decision path and actual upstream path
  - simulate attacker-controlled headers
  - check configuration flags that enable trust boundary shifts

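19. Full architectural implications for OpenClaw

19.1 Overall formula

OpenClaw vulnerability output capability =
  security-experience constraints
  × context construction capability
  × multi-strategy candidate search
  × model routing capability
  × evidence completion capability
  × tool-based validation capability
  × cost control capability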

19.2 Recommended layered architecture

openclaw_architecture:
  layer_1_context_engine:
    purpose: build security-relevant context before asking LLM
    components:
      - code_parser
      - call_graph_builder
      - config_indexer
      - doc_retriever
      - deployment_mode_detector
      - auth_boundary_mapper
      - trust_boundary_mapper

  layer_2_candidate_discovery:
    purpose: maximize recall through multi-run strategies
    components:
      - small_model_parallel_runs
      - vulnerability_pattern_prompts
      - entrypoint_focused_passes
      - source_sink_exploration
      - config_to_code_path_analysis

  layer_3_candidate_adjudication:
    purpose: reduce logical jumps and identify evidence gaps
    components:
      - strong_model_review
      - hypothesis_challenge
      - missing_evidence_classifier
      - alternative_explanation_generator

  layer_4_validation_engine:
    purpose: convert hypotheses into evidence-backed findings
    components:
      - static_reachability
      - config_condition_checker
      - runtime_harness
      - poc_generation
      - differential_patch_analysis
      - log_capture

  layer_5_reporting:
    purpose: output high-signal security findings
    components:
      - deduplication
      - severity_ranking
      - exploitability_summary
      - remediation_mapping
      - reproducibility_artifacts

19.3 Stage design recommendations

stages:
  context_building:
    input:
      - repository
      - docs
      - configs
    output:
      - system_model
      - component_roles
      - deployment_modes
      - trust_boundaries

  candidate_generation:
    input:
      - system_model
      - vulnerability_patterns
    output:
      - candidate_findings
    optimization_target: recall

  candidate_filtering:
    input:
      - candidate_findings
    output:
      - plausible_findings
    optimization_target: remove_obvious_false_positives

  evidence_completion:
    input:
      - plausible_findings
    output:
      - evidence_complete_findings
    optimization_target: fill_required_security_fields

  validation:
    input:
      - evidence_complete_findings
    output:
      - verified_findings
    optimization_target: precision

  report_generation:
    input:
      - verified_findings
    output:
      - actionable_report
    optimization_target: user_signal_quality
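A stage design like this can be checked mechanically: every input of a stage should either be supplied from outside or produced by an earlier stage. The sketch below encodes the inputs and outputs listed above; `unmet_inputs` is a hypothetical helper, and treating `vulnerability_patterns` as an external input is an assumption, since no stage in the list produces it.

```python
from typing import Dict, List, Set, Tuple

Stage = Tuple[str, Set[str], Set[str]]  # (name, inputs, outputs)

STAGES: List[Stage] = [
    ("context_building", {"repository", "docs", "configs"},
     {"system_model", "component_roles", "deployment_modes", "trust_boundaries"}),
    ("candidate_generation", {"system_model", "vulnerability_patterns"},
     {"candidate_findings"}),
    ("candidate_filtering", {"candidate_findings"}, {"plausible_findings"}),
    ("evidence_completion", {"plausible_findings"}, {"evidence_complete_findings"}),
    ("validation", {"evidence_complete_findings"}, {"verified_findings"}),
    ("report_generation", {"verified_findings"}, {"actionable_report"}),
]


def unmet_inputs(stages: List[Stage], external_inputs: Set[str]) -> Dict[str, Set[str]]:
    """Return {stage name: inputs that nothing upstream provides}."""
    available = set(external_inputs)
    problems: Dict[str, Set[str]] = {}
    for name, needs, produces in stages:
        missing = needs - available
        if missing:
            problems[name] = missing
        available |= produces
    return problems


# Assumption: vulnerability patterns come from an external knowledge base.
EXTERNAL = {"repository", "docs", "configs", "vulnerability_patterns"}
problems = unmet_inputs(STAGES, EXTERNAL)
```

An empty result means the pipeline is wired end to end; dropping `vulnerability_patterns` from `EXTERNAL` would flag `candidate_generation` as unable to start.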

20. Evidence field design: engineering the "missing field" decision

You mentioned earlier: "When a field is missing, first decide whether it is an evidence field." Combined with the Hacktron / oauth2-proxy case, fields can be split into four classes.

20.1 Descriptive fields

Missing ones do not directly affect whether the vulnerability holds, but they hurt readability.

descriptive_fields:
  - title
  - affected_file
  - affected_function
  - vulnerability_category

20.2 Context fields

Missing ones affect the model's judgment, but these fields do not by themselves prove a vulnerability.

context_fields:
  - project_role
  - deployment_mode
  - relevant_config_files
  - integration_examples
  - upstream_components

20.3 Evidence fields

Missing ones mean the vulnerability cannot be judged to hold.

evidence_fields:
  - attacker_control
  - vulnerable_condition
  - reachable_path
  - security_decision_point
  - trust_boundary_violation
  - impact
  - reproducibility

20.4 Validation fields

Missing ones mean a high-confidence report cannot be delivered.

validation_fields:
  - static_reachability_result
  - dynamic_repro_result
  - poc_artifact
  - expected_vs_actual_behavior
  - patched_version_behavior

20.5 Handling logic for missing fields

missing_field_policy:
  descriptive_field_missing:
    action: allow_continue

  context_field_missing:
    action: retrieve_more_context

  evidence_field_missing:
    action: block_security_verdict

  validation_field_missing:
    action: mark_as_unverified_or_run_validation
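The policy can be expressed as a small gate that inspects a candidate and returns the actions it triggers. The field names and action names come from the lists above; representing a candidate as a plain dict, and treating an empty value as missing, are assumptions of this sketch.

```python
from typing import Dict, List, Set

CONTEXT_FIELDS: Set[str] = {
    "project_role", "deployment_mode", "relevant_config_files",
    "integration_examples", "upstream_components",
}
EVIDENCE_FIELDS: Set[str] = {
    "attacker_control", "vulnerable_condition", "reachable_path",
    "security_decision_point", "trust_boundary_violation", "impact",
    "reproducibility",
}
VALIDATION_FIELDS: Set[str] = {
    "static_reachability_result", "dynamic_repro_result", "poc_artifact",
    "expected_vs_actual_behavior", "patched_version_behavior",
}


def next_actions(candidate: Dict[str, object]) -> List[str]:
    """Map missing fields to policy actions; missing descriptive fields never block."""

    def any_missing(fields: Set[str]) -> bool:
        return any(not candidate.get(name) for name in fields)

    actions: List[str] = []
    if any_missing(CONTEXT_FIELDS):
        actions.append("retrieve_more_context")
    if any_missing(EVIDENCE_FIELDS):
        actions.append("block_security_verdict")
    if any_missing(VALIDATION_FIELDS):
        actions.append("mark_as_unverified_or_run_validation")
    return actions or ["allow_continue"]
```

A candidate with complete context and evidence but no validation results yields only `mark_as_unverified_or_run_validation`, so it can be queued for validation without being reported as confirmed.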

21. Multi-agent orchestration: from the article to an executable system

21.1 The approach not recommended

Single agent: read source code → ask whether there are vulnerabilities → output report

Problems:

• Incomplete context.
• No focus on a vulnerability type.
• Exposed to LLM non-determinism.
• Both false positives and false negatives are hard to control.
• No evidence completion mechanism.

21.2 The recommended approach

agents:
  context_builder:
    role: construct project and security context
    outputs:
      - project_map
      - deployment_modes
      - auth_boundaries
      - trust_boundaries

  candidate_hunter:
    role: generate vulnerability hypotheses
    optimization: recall
    model_preference: low_cost_parallel

  semantic_judge:
    role: assess whether candidate has plausible security semantics
    optimization: reduce_logical_jump
    model_preference: strong_reasoning

  evidence_gap_analyzer:
    role: classify missing fields
    outputs:
      - missing_context
      - missing_evidence
      - missing_validation

  validator:
    role: prove or disprove candidate
    tools:
      - static_analysis
      - dynamic_test
      - poc_runner

  reporter:
    role: generate final human-readable report
    inputs:
      - verified_findings

21.3 Orchestration principles

orchestration_principles:
  - stage_goal_must_be_explicit
  - each_stage_outputs_structured_fields
  - evidence_fields_gate_security_verdict
  - low_cost_models_for_broad_search
  - strong_models_for_semantic_adjudication
  - tools_for_final_validation
  - repeated_runs_should_cover_different_search_dimensions
  - no_final_report_without_complete_evidence_chain

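22. Conclusions from the comparison

22.1 What to learn from Mozilla

• Frontier models can already find large numbers of real vulnerabilities in highly complex codebases.
• AI vulnerability discovery will significantly change the cost structure of attack and defense.
• If defenders can fix, validate, and ship quickly, AI can become a tool for clearing the vulnerability backlog.
• Human comprehensibility remains key, because security fixes and architectural judgment still depend on human maintainers.

22.2 What to learn from Anthropic

• High-quality AI security research needs a benchmark, reproduction of historical CVEs, validation on real targets, and collaboration with maintainers.
• Vulnerability discovery and vulnerability exploitation are not the same capability.
• After the model outputs findings, someone still has to decide which are worth reporting, which need fixing, and which may be false positives.

22.3 What to learn from Hacktron

• Commercial vulnerability scanning cannot simply chase the strongest model.
• Cost, recall, validation cost, and signal quality together determine whether a system is usable.
• Repeated small-model runs can hedge against non-determinism, but only with a high-quality context engine and workflow.
• Context completeness can directly determine whether a vulnerability is found.

22.4 What to learn from oauth2-proxy

• Many real high-severity vulnerabilities come from mismatches in configuration, deployment, and component semantics.
• An agent must understand trust boundaries, auth boundaries, and deployment modes.
• Headers, paths, health checks, and reverse proxy semantics are all important input surfaces for authentication bypass.

22.5 What to learn from Claude Code Review

• Parallel multi-agent analysis, verification, deduplication, and severity ranking are becoming the standard engineering shape of AI code review.
• Tunable repository-level guidance files such as CLAUDE.md / REVIEW.md are in essence a system of project context and constraints.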

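23. Final abstraction: five core questions for a vulnerability-hunting agent

Any automated vulnerability-hunting agent has to answer these five questions:

Q1: Does the code behavior I am looking at carry security semantics in the current deployment mode?

For example, a health check returning 200 may be harmless in standalone mode and an authentication bypass in auth_request mode.

Q2: Does the data that feeds the security decision come from the attacker?

For example, User-Agent, X-Forwarded-Uri, path, host, scheme, IP, proxy headers.

Q3: Should this data have been generated by a trusted component?

For example, X-Forwarded-* headers should normally be written by a trusted reverse proxy, not supplied directly by the client.

Q4: Is the object the security decision is based on the same as the object actually accessed?

For example, OAuth2 Proxy makes its skip-auth decision on /public while the real upstream request goes to /admin.

Q5: Can a complete evidence chain be built?

This includes attacker control, configuration conditions, reachable path, security decision point, cross-component semantics, impact, and reproduction or validation.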

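24. A reusable OpenClaw prompt skeleton

You are a vulnerability-hunting agent. Do not output vulnerability conclusions directly. Analyze the target code in the following stages:

Stage 1: Context modeling
- What is the project?
- What role does it play in the system?
- Does it act as a reverse proxy / auth middleware / gateway / broker / service mesh component?
- What deployment modes does it have?
- Which configuration options change its security semantics?

Stage 2: Security boundary identification
- What are the external inputs?
- Which headers / paths / query parameters / bodies / config values take part in authentication or authorization decisions?
- Which fields should come from a trusted proxy or internal component?
- Which fields can be forged by the client?

Stage 3: Candidate vulnerability generation
- Look for auth bypass, path confusion, header spoofing, trust boundary violation, and config-dependent bypass.
- For each candidate, output structured fields only. Do not give a final verdict.

Stage 4: Evidence field check
For each candidate, check:
- attacker_control
- vulnerable_condition
- reachable_path
- security_decision_point
- trust_boundary_violation
- impact
- reproducibility

Stage 5: Evidence gap handling
- If a context field is missing, list the code / docs / configuration that need to be retrieved.
- If an evidence field is missing, do not classify the candidate as a confirmed vulnerability.
- If a validation field is missing, mark the candidate as an unverified candidate.

Stage 6: Final output
Output only confirmed findings with a complete evidence chain.
For candidates whose evidence chain is incomplete, output the missing evidence rather than a vulnerability conclusion.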

25. Reference links

1. Hacktron, Why Mythos doesn’t matter (for us) https://www.hacktron.ai/blog/why-mythos-doesnt-matter-for-us
2. Mozilla Security Blog, The zero-days are numbered https://blog.mozilla.org/en/privacy-security/ai-security-zero-day-vulnerabilities/
3. Anthropic, Partnering with Mozilla to improve Firefox’s security https://www.anthropic.com/news/mozilla-firefox-security
4. GitHub Advisory, GHSA-5hvv-m4w4-gf6v https://github.com/oauth2-proxy/oauth2-proxy/security/advisories/GHSA-5hvv-m4w4-gf6v
5. GitHub Advisory, GHSA-7x63-xv5r-3p2x https://github.com/oauth2-proxy/oauth2-proxy/security/advisories/GHSA-7x63-xv5r-3p2x
6. Anthropic Claude Code Review documentation https://code.claude.com/docs/en/code-review
7. OpenRouter AI Model Rankings https://openrouter.ai/rankings
8. oauth2-proxy repository / README https://github.com/oauth2-proxy/oauth2-proxy
9. GitHub Advisory Database, GHSA-5hvv-m4w4-gf6v https://github.com/advisories/GHSA-5hvv-m4w4-gf6v
10. GitHub Advisory Database, GHSA-7x63-xv5r-3p2x https://github.com/advisories/GHSA-7x63-xv5r-3p2x
11. Wikipedia, Precision and recall https://en.wikipedia.org/wiki/Precision_and_recall
12. LinkedIn public search result context for Hacktron / oauth2-proxy 0-days https://www.linkedin.com/company/hacktron

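26. One-sentence conclusion

Mythos represents the upper bound of model capability, Hacktron represents cost-to-signal optimization for productized security scanning, and the two oauth2-proxy 0-days show that context engineering decides whether a vulnerability is visible at all. A mature vulnerability-hunting agent has to orchestrate model reasoning, context construction, multi-strategy search, and evidence validation into one closed-loop system.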


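Reposted from: 大山子雪人, 做安全的小明同学, "The Model Is Not Everything: What Mythos, Hacktron, and the oauth2-proxy 0-days Reveal About the Real Bottlenecks of Automated Vulnerability Discovery Systems"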