2026-02-06 01:25:48 · Cybersecurity article · Source: ZONE.CI

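Summary: This article describes using direct system calls to bypass EDR inline hooks and achieve stealthy DLL injection. By analyzing the Windows API call chain, the author shows how to call undocumented NTAPI functions directly from assembly to avoid monitoring. It covers both hard-coding and dynamically resolving syscall numbers, and provides code examples that link C++ and assembly via extern "C", helping red teamers understand low-level defensive mechanisms and the techniques used against them. Overall score: 87. Categories: AV evasion, red team, binary security, internal network penetration.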


DLL Injection EDR Evasion: Bypassing Inline Hooks with Direct Syscalls

fluxsec

securitainment

February 5, 2026, 17:11, Hong Kong, China

Sometimes the best way into a process is to turn up uninvited and make a scene!

| Original link | Author |
| --- | --- |
| https://fluxsec.red/dll-injection-edr-evasion-1 | fluxsec |

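Introduction

Project code: https://github.com/0xflux/GoSneak

By reading on, you acknowledge and agree to the legal disclaimer, detailed here. In short, you must not use the information below for any criminal or unethical purpose; it is intended only for security professionals, or for people interested in cybersecurity who want to deepen their understanding.

Note: EDR evasion techniques such as ETW bypasses, APC Queue Injection, and Process Injection can still be detected by more advanced EDRs. By modern standards they are somewhat dated, but they remain worth learning because they teach tricks that may still work in some situations.

If you are interested in how a modern EDR detects this kind of behavior, I have a blog series on building an EDR from scratch; you can find the details on detecting these bypass techniques here.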

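I originally wrote this section intending to "open source" the DLL injector I wrote for my red-team framework: the core is written in C++, wrapped in Go, and it also ships with a DLL written in Go. I had planned to talk about how CGO works and how to do low-level work in C++ and return to Go functions; instead, it turned into a deep dive into Windows internals, and I upgraded my injector along the way.

The project still includes the Go wrapper for the loader, but for now I am mainly refining the C++ implementation (most of it written more or less from scratch), building on research from other blogs and security content creators. At the end of the article I will introduce the Go wrapper for the DLL injector and show how to use it as part of a Go binary.

In the basic version of this injector, we write an unencrypted payload to disk. That is obviously far from ideal for "stealthy execution", but it will be improved step by step (it really will) :)

This is not a beginner tutorial on DLL injection. If you want to learn the basics, plenty of resources are already available online.

To restate my ethics statement: this is not a tutorial on becoming a 1337 haxx0r. Its purpose is to document my learning and growth, and to present some theory to security professionals or people interested in cybersecurity. Do not use this information, code, or these techniques for unethical or illegal purposes. I do not endorse malicious or illegal use of computers in any way.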

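EDR Hooking

EDR hooking refers to the methods that Endpoint Detection and Response (EDR) systems use to monitor the behavior of software on a computer, particularly to identify and mitigate potential threats. These systems detect malicious activity by observing the interactions between software processes and the operating system.

There are many ways for an EDR to hook; here are a few of the more common ones: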

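Inline Hooking:

The EDR modifies the actual binary code of a function in memory. Typically it replaces the first few bytes of the function with a jump to the EDR's own monitoring code. When the hooked function is called, execution is first diverted to the EDR's code, which can then monitor or modify the function's behavior. For more on detecting inline hooks, see: https://www.ired.team/offensive-security/defense-evasion/detecting-hooked-syscall-functions.

Import Address Table (IAT) Hooking:

IAT hooking overwrites entries in a program's import address table, which holds the addresses of the API functions the program imports. As a result, when the program runs it no longer calls the real API function and calls the EDR's monitoring function instead.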
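To make the inline-hooking description above more concrete, here is a minimal sketch of the kind of check used to spot a patched stub: compare the first bytes against the usual x64 syscall-stub prologue. The names StubState and classifyStub are hypothetical (they do not come from the article or from GoSneak), the function works on a copy of the bytes rather than on live ntdll memory, and treating a leading E9 (jmp rel32) as the only hook signature is a simplifying assumption; real products may patch a function differently.

```cpp
#include <cstddef>
#include <cstdint>

// Result of inspecting the first bytes of an ntdll-style syscall stub.
enum class StubState { Clean, JumpPatched, Unknown };

// Hypothetical helper (not part of the article's code).
// An unmodified x64 stub begins with:
//   4C 8B D1    mov r10, rcx
//   B8 ..       mov eax, imm32
// A stub that begins with E9 (jmp rel32) has most likely been inline-hooked.
inline StubState classifyStub(const std::uint8_t* bytes, std::size_t len) {
    if (bytes == nullptr || len < 4) {
        return StubState::Unknown;      // not enough data to decide
    }
    if (bytes[0] == 0xE9) {
        return StubState::JumpPatched;  // function starts with a relative jump
    }
    if (bytes[0] == 0x4C && bytes[1] == 0x8B &&
        bytes[2] == 0xD1 && bytes[3] == 0xB8) {
        return StubState::Clean;        // expected prologue is intact
    }
    return StubState::Unknown;          // some other pattern
}
```

Passing the eight bytes 4C 8B D1 B8 18 00 00 00 yields StubState::Clean, while a buffer starting with E9 yields StubState::JumpPatched. Returning Unknown for anything else keeps the sketch honest about patterns it does not recognize.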

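This is not directly related to the loader discussed here, but the following is quoted from ired.team:

This lab shows that it is still possible to dump process memory (lsass) and achieve a bypass even when a security product (such as Cylance, or any other Antivirus/Endpoint Detection & Response solution that uses userland API hooking to decide whether a program is malicious during its execution) has such hooks in place.

I have not had a chance to try it myself yet, but it looks very cool, and I would like to integrate it into my red-team framework.

Back to the loader...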

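Down the syscall rabbit hole

Now that we know what the EDR is watching, let's come up with a plan. We do not even have to reinvent the wheel: the security community has already documented this very thoroughly. Since EDRs can hook API calls, research such as https://alice.climent-pommeret.red/posts/a-syscall-journey-in-the-windows-kernel/ (and more) has already explained the "journey" of an API call in depth: how we can invoke syscalls almost directly, passing data into the kernel and bypassing EDR hooking.

Let me illustrate with an example. Below is a set of x64dbg screenshots: I first locate the program's main function, then look for the call to the Windows API VirtualAllocEx() (which eventually resolves to a function hooked by the EDR). Following the call chain down, we go from kernel32.dll into kernelbase.dll and finally reach the call to NtAllocateVirtualMemory in ntdll.dll. Inspecting NtAllocateVirtualMemory (part of the undocumented Windows NTAPI), we can see the assembly move 18h into eax and then execute syscall. For my architecture and Windows version (described below), 18h is the syscall number the kernel uses for NtAllocateVirtualMemory, and this also shows the low-level transition from user-land to kernel-land.

Here is a visual illustration:

Syscalls in Windows

And here is the same chain observed in the debugger:

Going deep enough, we eventually arrive at the assembly that actually issues the syscall. On my Windows version (Windows 11; I know, you may not like reading that, and honestly it feels awkward for me to use too), the syscall number is 0x18. For a systematic introduction to syscalls, I recommend this excellent article: https://alice.climent-pommeret.red/posts/a-syscall-journey-in-the-windows-kernel/

We can then write it into our assembly file like this; remember to export the procedure publicly: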

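    public NtAllocateVirtualMemory
    NtAllocateVirtualMemory PROC
        mov r10, rcx        ; syscall clobbers rcx, so the kernel expects the first argument in r10
        mov eax, 18h        ; syscall number for NtAllocateVirtualMemory on the author's Windows build
        syscall             ; transition into the kernel
        ret
    NtAllocateVirtualMemory ENDP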

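Note that in assembly, PROC and ENDP are required to delimit the code of a procedure.

We can then repeat this process over and over until every hooked call has been "resolved" to its syscall number.

Tip: in x64dbg, press ctrl+g to jump straight to a symbol's location in the disassembly (instead of clicking all the way down manually as above).

But that is only half the story. The next step is to define the undocumented NTAPI functions and provide a layer of abstraction over them. One of the best resources I have found for documentation of the low-level Windows API is the winsiderss GitHub: https://github.com/winsiderss/phnt. For example, you can search it for NtAllocateVirtualMemory.

Remember, we know to look for NtAllocateVirtualMemory precisely because we followed the "proxy chain" down to it in the disassembly step.

When we combine (or link) the assembly with the C++ project, we need to create a header file containing the function prototypes, each marked with the extern "C" linkage specification.

This is crucial: it ensures the linker can find and correctly link these functions.

C++ has a detail to be aware of: function names go through a process called name mangling, which makes the symbol names the linker sees unique (mainly to support features such as function overloading). Marking the functions with extern "C" tells the C++ compiler to use C linkage, which avoids name mangling. This ensures that the function names defined in the assembly match the symbol names the linker looks for, so the integration goes smoothly. For example, in the header file we declare the prototype of NtAllocateVirtualMemory, the undocumented NTAPI function we found through the winsiderss GitHub, so that it matches the assembly implementation.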

    extern "C" {
        NTSTATUS NtAllocateVirtualMemory(
            _In_        HANDLE              ProcessHandle,
            _Inout_ _At_(*BaseAddress, _Readable_bytes_(*RegionSize) _Writable_bytes_(*RegionSize) _Post_readable_byte_size_(*RegionSize)) PVOID *BaseAddress,
            _In_        ULONG_PTR           ZeroBits,
            _Inout_     PSIZE_T             RegionSize,
            _In_        ULONG               AllocationType,
            _In_        ULONG               Protect
        );
    }

Because assembly has no concept of name mangling, do not skip the extern "C" step.
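As a small illustration of the linkage point above, the sketch below declares a function with C linkage and then defines it in C++ as a stand-in for an assembly procedure. DemoStub is a hypothetical name used only for this example; in the article's project the body would live in the .asm file and only the declaration would appear in the header.

```cpp
#include <cstdint>

// Declaration with C linkage, in the same style as the header that
// declares the assembly procedures. The symbol name is not mangled,
// so it can match a label written by hand in an assembly file.
extern "C" std::uint32_t DemoStub(std::uint32_t value);

// Stand-in definition (assumption for this sketch): a real project
// would provide this symbol from assembly instead of from C++.
extern "C" std::uint32_t DemoStub(std::uint32_t value) {
    return value + 1;
}
```

Calling DemoStub(41) returns 42. Without extern "C", the compiler would emit a decorated symbol name that encodes the parameter types, and a hand-written assembly label called DemoStub would no longer satisfy the linker.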

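Let's also quickly cover why we define these function prototypes.

In C, function prototypes are typically declared to tell the compiler a function's signature: which parameters it takes and what it returns. This matters because it tells the compiler how to issue the call, including how to arrange data in memory or registers. The same logic applies here, with a twist: our function implementations are written in assembly rather than C. The C++ side knows "what to expect" thanks to the prototypes, but the real work, the fiddly low-level operations, is done by the assembly code.

This is especially true when dealing with system calls. System calls are like special requests to the kernel, and they expect data to be supplied in a very specific way. By defining the function prototype in the C++ code, we ensure that when the function is called, the compiler lays out the necessary data (the arguments) the way our assembly code, and in turn the kernel, expects. These prototypes act as a bridge between the high-level structure of C++ and the low-level operations of assembly and system calls.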

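A quick breather

Before going deeper, let's recap and look at what is actually happening under the hood. So far we know:

We defined the structures the kernel needs to complete the syscall (and how the kernel expects us to place the data in registers and on the stack)
We replaced the Windows API calls that EDRs hook, along with their proxying chain, with assembly implementations that issue the syscalls
This bypasses the EDR hooks

Now let's verify that what we "think" is happening really is happening. As shown, on my Windows, 0x18 is the syscall for NtAllocateVirtualMemory (VirtualAllocEx in Windows API terms). Looking at the new disassembly (easy to locate because of the call to GetProcAddress), we can see a call to inj.xxxxxxxxxx; hovering over it brings up the corresponding symbol, and it is our assembly!

The linker has correctly put our files together and included the assembly instructions in the final artifact!

Just for fun, let's replace syscall with the int 2E instruction (more on 2E at this link) and recompile:

Surprisingly, this did not work the way syscall did: I observed that the DLL injection did not complete (and no obvious error was returned). Compare with codemachine's description:

"int 2e" is a legacy way of transitioning from user to kernel mode that all current x86 CPUs support. Calling "int 2e" triggers the interrupt service routine registered at vector 0x2e in the Interrupt Descriptor Table (IDT), namely nt!KiSystemService.

I had assumed it would execute as a syscall in a similar way on x64, but it did not! I would love to debug it on x86 and see how it works there... but that is not a challenge for today.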

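Automation

This works nicely, but it relies on knowing the target architecture, build number, and so on in advance, which is not ideal. So, using the material at https://redops.at/en/blog/direct-syscalls-vs-indirect-syscalls (thanks to cr0w for the base function), we can automate the lookup of SSNs (System Service Numbers, i.e. syscall numbers).

Here is the automation code I implemented: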

    // Needs <windows.h> and <cstdio>; printError is not defined in this excerpt.
    DWORD getSSN(IN HMODULE dllModule, IN LPCSTR NtFunction) {

        FARPROC NtFunctionAddress = GetProcAddress(dllModule, NtFunction);

        if (NtFunctionAddress == NULL) {
            char logBuffer[256];
            // snprintf bounds the write so a long function name cannot overflow logBuffer
            snprintf(logBuffer, sizeof(logBuffer), "Failed to get the address of %s", NtFunction);
            printError(logBuffer);
            return 0; // note: 0 is also a legal syscall number
        }

        /**
         * Layout of an unhooked x64 syscall stub, for example:
         *
         *  public NtOpenProcess
         *  NtOpenProcess PROC
         *      mov r10, rcx                ; 3 bytes
         *      mov eax, wNtOpenProcess     ; opcode (1 byte) + SSN as imm32 (4 bytes) = 5 bytes
         *      syscall
         *      ret
         *  NtOpenProcess ENDP
         *
         * With the below, take the byte pointer of the NT Function, then add 4 bytes to the memory location we are pointing to.
         * Here we will find the SSN (see above math).
         * Cast this location as a pointer to a double word (i.e. 4 bytes)
         * Dereference that pointer, to get the underlying value from where we were pointing.
         * This only holds while the stub still starts with the unmodified prologue.
         */
        DWORD NtFunctionSSN = *((PDWORD)((PBYTE)NtFunctionAddress + 4));

        return NtFunctionSSN;
    }

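Let me try to explain what this line is actually doing:

    DWORD NtFunctionSSN = *((PDWORD)((PBYTE)NtFunctionAddress + 4));

If that is hard to follow, think of it this way: the mov r10, rcx instruction is 3 bytes long, the 4th byte is the mov eax opcode, and the following 4 bytes (a DWORD) are the actual SSN.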
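The byte arithmetic above can be tried on a plain byte array, without touching ntdll at all. The following is a minimal sketch under stated assumptions: readSsn is a hypothetical helper (not the author's getSSN), it operates on a copy of the stub bytes, and it refuses to answer when the first four bytes are not the expected 4C 8B D1 B8 prologue, which the original one-liner does not check.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: extracts the SSN from a copy of a syscall stub.
// Byte layout assumed (unhooked x64 stub):
//   offset 0..2  4C 8B D1      mov r10, rcx
//   offset 3     B8            opcode of mov eax, imm32
//   offset 4..7  imm32         the SSN, stored little-endian
// Returns false if the buffer is too short or the prologue does not match.
inline bool readSsn(const std::uint8_t* stub, std::size_t len, std::uint32_t& ssn) {
    if (stub == nullptr || len < 8) {
        return false;
    }
    if (stub[0] != 0x4C || stub[1] != 0x8B || stub[2] != 0xD1 || stub[3] != 0xB8) {
        return false;  // not the expected prologue, so offset 4 is not an SSN
    }
    // Assemble the little-endian DWORD byte by byte, which avoids the
    // unaligned pointer cast used in the one-line version.
    ssn = static_cast<std::uint32_t>(stub[4])
        | (static_cast<std::uint32_t>(stub[5]) << 8)
        | (static_cast<std::uint32_t>(stub[6]) << 16)
        | (static_cast<std::uint32_t>(stub[7]) << 24);
    return true;
}
```

For the bytes 4C 8B D1 B8 18 00 00 00 the function returns true and sets ssn to 0x18, matching the value seen in the debugger earlier. Returning a success flag separately from the number avoids the ambiguity of using 0 as an error value.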

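At this point, using the dumpbin tool:

    dumpbin /imports .\inj.exe

I can see that calls such as OpenProcess no longer appear (good), but there is clearly still room for improvement (if any of these functions would raise an EDR's suspicion):

View the dumpbin output.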

Next steps

If you enjoyed this article, you may also like these:

Hells Gate Rust – EDR Evasion with syscalls
EDR Evasion ETW patching in Rust
Remote process DLL injection in Rust

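Disclaimer: This blog post is for educational and research purposes only. All techniques and code examples provided are intended to help defenders understand attack techniques and improve their security posture. Do not use this information to access or interfere with systems that you do not own or do not have explicit permission to test. Unauthorized use may violate laws and ethical guidelines. The author accepts no liability for any misuse or damage resulting from applying the concepts discussed.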


Reposted from: securitainment, fluxsec, "DLL Injection EDR Evasion: Bypassing Inline Hooks with Direct Syscalls"