2025-12-29 00:54:39 | Category: Network Security | Source: ZONE.CI 全球网

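Article summary: This article explains how IDA identifies function parameters when no PDB is available, focusing on a heuristic built on stack traversal and pattern matching. It reviews the Windows x64 calling convention and function prologues, uses stack reads and writes plus a confidence score to tell parameters apart from local variables, reproduces IDA's result, and notes that caller-side analysis is needed to improve accuracy. Overall score: 88. Categories: Reverse engineering, Binary security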

Introduction to IDA Internals (4): Function Parameter Identification

Original

huoji

冲鸭安全

May 19, 2025, 10:01, Beijing

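Introduction

If you have not read Part 3, please read it first. This is Part 4, which explains how to work out a function's parameters.

Introduction to IDA Internals (3): Control-Flow Tracing and Building CFG Blocks

Function Parameter Identification

One fact has to be accepted up front: without a PDB there is no exact way to identify function parameters. Every method in use today is heuristic, which is also why reverse-engineering tools fall over when they meet obfuscated code. We will not cover complex heuristics here, only a simple one: stack traversal plus pattern matching (x64 only).

Windows x64 Calling Convention

Before discussing parameter identification, we need to understand the Windows x64 calling convention: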

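The first four parameters are passed in registers: RCX, RDX, R8, R9
Parameters beyond the fourth are passed on the stack, starting at RSP+0x20 as seen by the caller just before the CALL (RSP+0x28 inside the callee, once the return address has been pushed)
Even when there are fewer than four parameters, stack space is still reserved for the four register parameters (the "shadow space")

The layout of the parameters on the stack, as seen on entry to the callee, is:

RSP+0x08: shadow space for parameter 1 (RCX)
RSP+0x10: shadow space for parameter 2 (RDX)
RSP+0x18: shadow space for parameter 3 (R8)
RSP+0x20: shadow space for parameter 4 (R9)
RSP+0x28: parameter 5 (first stack parameter)
RSP+0x30: parameter 6 (second stack parameter)
and so on.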
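The layout above reduces to a single rule: on entry, parameter N has its slot at RSP + 8*N, and the first four also arrive in a register. The following is a minimal sketch of that rule, assuming integer or pointer parameters only; `ParamLocation` and `paramLocation` are hypothetical names introduced for illustration and are not part of the article's code.

```cpp
#include <cstdint>
#include <string>

// Where the callee finds an integer/pointer parameter on entry, i.e. with
// RSP as it is right after CALL pushed the return address.
struct ParamLocation {
    std::string reg;     // register carrying the value; empty for stack parameters
    int64_t homeOffset;  // entry-RSP offset of the shadow slot or stack slot
};

// index is 1-based. Floating-point and by-value aggregate rules are ignored.
ParamLocation paramLocation(int index) {
    static const char* kRegs[] = {"rcx", "rdx", "r8", "r9"};
    ParamLocation loc;
    if (index >= 1 && index <= 4) {
        loc.reg = kRegs[index - 1];
    }
    // [RSP+0] holds the return address, so slot N sits at 8 * N.
    loc.homeOffset = 8 * static_cast<int64_t>(index);
    return loc;
}
```

For example, `paramLocation(5)` has an empty `reg` and a `homeOffset` of 0x28, which matches the first stack parameter in the layout.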

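Worked Example: A Function with 8 Parameters

Let's go through a concrete example to see how parameter identification works. Below are the C-style pseudocode and the disassembly of a function that takes eight integer parameters: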

int __fastcall sub_140012510(int a1, int a2, int a3, int a4, int a5, int a6, int a7, int a8)
{
  j___CheckForDebuggerJustMyCode(byte_140023029);
  return j_printf("wtf: %d \n", (unsigned int)(a8 + a7 + a6 + a5 + a4 + a3 + a2 + a1));
}

Key assembly instructions:

.text:0000000140012510 44 89 4C 24 20        mov     [rsp-8+arg_18], r9d     ; save parameter 4 (a4) to its shadow space
.text:0000000140012515 44 89 44 24 18        mov     [rsp-8+arg_10], r8d     ; save parameter 3 (a3) to its shadow space
.text:000000014001251A 89 54 24 10           mov     [rsp-8+arg_8], edx      ; save parameter 2 (a2) to its shadow space
.text:000000014001251E 89 4C 24 08           mov     [rsp-8+arg_0], ecx      ; save parameter 1 (a1) to its shadow space
.text:0000000140012522 55                    push    rbp                     ; function prologue begins
.text:0000000140012523 57                    push    rdi
.text:0000000140012524 48 81 EC E8 00 00 00  sub     rsp, 0E8h               ; allocate stack space
.text:000000014001252B 48 8D 6C 24 20        lea     rbp, [rsp+20h]          ; set up the frame pointer
...
.text:000000014001253C 8B 85 E8 00 00 00     mov     eax, [rbp+0D0h+arg_8]   ; read a2
.text:0000000140012542 8B 8D E0 00 00 00     mov     ecx, [rbp+0D0h+arg_0]   ; read a1
.text:0000000140012548 03 C8                 add     ecx, eax                ; a1 + a2
.text:000000014001254A 8B C1                 mov     eax, ecx
.text:000000014001254C 03 85 F0 00 00 00     add     eax, [rbp+0D0h+arg_10]  ; + a3
.text:0000000140012552 03 85 F8 00 00 00     add     eax, [rbp+0D0h+arg_18]  ; + a4
.text:0000000140012558 03 85 00 01 00 00     add     eax, [rbp+0D0h+arg_20]  ; + a5
.text:000000014001255E 03 85 08 01 00 00     add     eax, [rbp+0D0h+arg_28]  ; + a6
.text:0000000140012564 03 85 10 01 00 00     add     eax, [rbp+0D0h+arg_30]  ; + a7
.text:000000014001256A 03 85 18 01 00 00     add     eax, [rbp+0D0h+arg_38]  ; + a8
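The rbp-relative displacements in the listing can be tied back to the entry-time layout with a little arithmetic: two pushes (0x10 bytes), `sub rsp, 0E8h` and `lea rbp, [rsp+20h]` put RBP 0xD8 bytes below the entry RSP, so `[rbp+0E0h]` is `[entry RSP + 8]`, the slot of a1. The sketch below only illustrates that calculation; `FrameShape`, `exampleFrame`, `entryOffset` and `paramIndex` are hypothetical names, and the constants are read off this one function.

```cpp
#include <cstdint>

// Frame shape read off the listing above (specific to sub_140012510).
struct FrameShape {
    int64_t pushedBytes;  // push rbp + push rdi = 0x10
    int64_t subRsp;       // sub rsp, 0E8h
    int64_t leaDisp;      // lea rbp, [rsp+20h]
};

FrameShape exampleFrame() {
    return FrameShape{0x10, 0xE8, 0x20};
}

// rbp = entryRSP - pushedBytes - subRsp + leaDisp, therefore
// [rbp + disp] == [entryRSP + disp - (pushedBytes + subRsp - leaDisp)].
int64_t entryOffset(const FrameShape& f, int64_t rbpDisp) {
    return rbpDisp - (f.pushedBytes + f.subRsp - f.leaDisp);
}

// 1-based parameter index for an entry-RSP offset, or 0 if the offset
// cannot be a parameter slot (below 8, or not 8-byte aligned).
int paramIndex(int64_t entryOff) {
    if (entryOff < 0x8 || entryOff % 8 != 0) {
        return 0;
    }
    return static_cast<int>(entryOff / 8);
}
```

With these values, `entryOffset(exampleFrame(), 0x118)` is 0x40, which `paramIndex` maps to 8, in line with the `; + a8` annotation on the last instruction.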

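The "Heuristic" Parameter Identification Process

Principle

In short, I record the offset of every stack access made through a stack register (RSP or RBP), then compute a confidence score for each offset. Once the score exceeds a threshold X, we treat that offset as a parameter.

Analyzing the Function Prologue

The function prologue is the key to recovering the stack-frame layout. By tracking instructions such as PUSH RBP and SUB RSP, xxx, we can compute how far the stack pointer has been adjusted and determine where the prologue ends.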

// Every PUSH in the prologue moves RSP down by 8 bytes.
if (llil_ins->type == LLIL::LLILInstruction::ARITHMETIC &&
    llil_ins->op == LLIL::LLILInstruction::PUSH) {
    func->stackAdjustment += 8;
    if (!llil_ins->operands.empty() &&
        llil_ins->operands[0]->type == LLIL::LLILOperand::REGISTER &&
        llil_ins->operands[0]->value == "rbp") {
        func->foundPushRbp = true;  // saw PUSH RBP
    }
} else if (llil_ins->type == LLIL::LLILInstruction::ARITHMETIC &&
           llil_ins->op == LLIL::LLILInstruction::SUB &&
           llil_ins->operands.size() >= 2 &&
           llil_ins->operands[0]->type == LLIL::LLILOperand::REGISTER &&
           llil_ins->operands[0]->value == "rsp" &&
           llil_ins->operands[1]->type == LLIL::LLILOperand::IMMEDIATE) {
    // SUB RSP, imm allocates the locals; we treat it as the end of the prologue.
    func->stackAdjustment += llil_ins->operands[1]->immediate;
    func->inPrologue = false;
}
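To make the bookkeeping easier to follow outside the article's LLIL classes, here is a self-contained sketch of the same idea over a toy instruction record. Everything in it (`Op`, `Ins`, `PrologueState`, `scanPrologue`, `examplePrologue`) is hypothetical, and the handling of `lea rbp, [rsp+disp]` is an assumption, since the snippet above does not show how the rbp-to-rsp relation is computed.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy instruction model; not the article's LLIL types.
enum class Op { Push, SubRsp, LeaRbpRsp, Other };

struct Ins {
    Op op;
    std::string reg;  // register operand of Push
    int64_t imm;      // immediate of SubRsp, displacement of LeaRbpRsp
};

struct PrologueState {
    int64_t stackAdjustment = 0;  // bytes RSP has moved down since entry
    int64_t rbpRspOffset = 0;     // rbp - rsp after the frame pointer is set
    bool foundPushRbp = false;
    bool hasFramePointer = false;
    bool inPrologue = true;
};

PrologueState scanPrologue(const std::vector<Ins>& code) {
    PrologueState st;
    for (const Ins& ins : code) {
        if (ins.op == Op::Push && st.inPrologue) {
            st.stackAdjustment += 8;
            if (ins.reg == "rbp") st.foundPushRbp = true;
        } else if (ins.op == Op::SubRsp && st.inPrologue) {
            st.stackAdjustment += ins.imm;
            st.inPrologue = false;  // same rule as above: SUB RSP ends the prologue
        } else if (ins.op == Op::LeaRbpRsp && !st.hasFramePointer) {
            st.rbpRspOffset = ins.imm;  // assumption: first lea rbp,[rsp+d] sets the frame
            st.hasFramePointer = true;
        }
    }
    return st;
}

// The first eight instructions of the example function.
std::vector<Ins> examplePrologue() {
    return {
        {Op::Other, "", 0}, {Op::Other, "", 0},  // mov [rsp+20h], r9d / mov [rsp+18h], r8d
        {Op::Other, "", 0}, {Op::Other, "", 0},  // mov [rsp+10h], edx / mov [rsp+8], ecx
        {Op::Push, "rbp", 0}, {Op::Push, "rdi", 0},
        {Op::SubRsp, "", 0xE8}, {Op::LeaRbpRsp, "", 0x20},
    };
}
```

For the example function this yields a stack adjustment of 0xF8 and an rbp-to-rsp distance of 0x20, so RBP ends up 0xF8 - 0x20 = 0xD8 bytes below the entry RSP.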

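Matching All Stack Reads and Writes

First, we walk every instruction in the function and collect its stack reads and writes. For each memory operand we record whether it is a read or a write, along with the instruction's type and position. Most importantly, we note whether the access happens before the prologue has finished, because such an access usually touches a parameter.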

// Record one stack access
auto processMemOperand = [&](const LLIL::LLILOperand& op, bool isDest) {
    if (op.type != LLIL::LLILOperand::MEMORY) return;

    int64_t offset;
    if (op.base == "rsp") {
        offset = op.offset;
    } else if (op.base == "rbp") {
        if (func->inPrologue) {
            offset = func->rbpRspOffset + op.offset;
        } else {
            offset = op.offset;  // use the rbp-relative offset as-is
        }
    } else {
        return;  // not a stack access
    }
    StackAccess access;
    access.isRead = !isDest;
    access.isLea = llil_ins->type == LLIL::LLILInstruction::LEA;
    access.insIndex = insIndex;
    access.isBeforePrologue = func->inPrologue;
    access.ins = llil_ins.get();
    stackAccesses[offset].push_back(access);
};
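One way to see how the accesses collected from the example relate to its eight parameters is to express every access as an offset from the entry RSP. Note that this is a variation for illustration, not what the lambda above does (it keeps the raw rbp displacement after the prologue). `MemRef`, `Access`, `collectByEntryOffset` and `exampleRefs` are hypothetical names, and the 0xD8 constant is the stack adjustment minus the rbp-to-rsp distance for this particular function.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct MemRef {
    std::string base;     // "rsp" or "rbp"
    int64_t disp;         // displacement encoded in the instruction
    bool isWrite;
    bool beforePrologue;  // seen before push/sub rsp moved the stack pointer
};

struct Access {
    bool isRead;
    bool beforePrologue;
};

// rbpToEntry = stackAdjustment - rbpRspOffset (0xD8 in the example function).
std::map<int64_t, std::vector<Access>> collectByEntryOffset(
        const std::vector<MemRef>& refs, int64_t rbpToEntry) {
    std::map<int64_t, std::vector<Access>> slots;
    for (const MemRef& r : refs) {
        int64_t off;
        if (r.base == "rsp" && r.beforePrologue) {
            off = r.disp;               // RSP still equals the entry RSP
        } else if (r.base == "rbp" && !r.beforePrologue) {
            off = r.disp - rbpToEntry;  // undo the pushes, sub rsp and lea
        } else {
            continue;                   // cases this sketch does not model
        }
        slots[off].push_back(Access{!r.isWrite, r.beforePrologue});
    }
    return slots;
}

// The 12 stack accesses visible in the example listing.
std::vector<MemRef> exampleRefs() {
    std::vector<MemRef> refs;
    for (int64_t d = 0x20; d >= 0x08; d -= 8) {
        refs.push_back(MemRef{"rsp", d, true, true});    // spills of r9d, r8d, edx, ecx
    }
    for (int64_t d = 0xE0; d <= 0x118; d += 8) {
        refs.push_back(MemRef{"rbp", d, false, false});  // reads of a1..a8
    }
    return refs;
}
```

Under these assumptions the 12 accesses collapse into 8 distinct slots from 0x08 to 0x40: the four register slots each see one spill and one read, and the four stack slots see a single read.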

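Taking our assembly as an example, we collected 12 stack accesses, while IDA identifies 8 parameters. Look closely at the first four accesses we collected: they are all stack accesses caused by x64 parameter passing (the register values being saved to the shadow space), and they are quickly filtered out by the confidence calculation that follows. The remaining accesses, such as the one at 0xe0, really are "parameters".

Parameter Identification and Confidence Calculation

After collecting the stack-access information, we analyze the access pattern at each offset: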

for (const auto& [offset, accesses] : stackAccesses) {
    // Skip offsets that cannot be a parameter slot, and ones already handled.
    if (offset < 0x8 || (offset % 8) != 0) continue;
    if (handledOffsets.find(offset) != handledOffsets.end()) continue;

    bool hasRealAccess = false;
    bool onlyLea = true;
    bool usedBeforePrologue = false;
    uint64_t firstAccessAddr = 0;

    for (const auto& access : accesses) {
        if (!access.isLea) {
            hasRealAccess = true;  // an actual read or write, not just an address computation
            onlyLea = false;
        }
        if (access.isBeforePrologue) {
            usedBeforePrologue = true;
        }
        if (firstAccessAddr == 0) {
            firstAccessAddr = access.ins->address;
        }
    }
    // (the confidence calculation shown next uses these per-offset locals)

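We compute a parameter's confidence from the following points:

Whether it is accessed before the prologue (raises confidence)
Whether it is only ever accessed by LEA instructions (it may just be a local variable)
For stack parameters, whether it is ever written (parameters are usually read-only)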

auto info = std::make_shared<ParamInfo>();
info->offset = offset;
info->confidence = 0.6f;  // base confidence
info->insMemAddress = firstAccessAddr;
if (usedBeforePrologue) info->confidence += 0.1f;

// For stack parameters, check whether the slot is ever written
if (offset >= 0x28) {  // stack parameter
    bool hasWrite = false;
    for (const auto& access : accesses) {
        if (!access.isRead && !access.isLea) {
            hasWrite = true;
            break;
        }
    }
    // No writes: raise the confidence
    if (!hasWrite) info->confidence += 0.1f;
}
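Put together, the scoring rules can be sketched as one small function. The 0.6 base and the two +0.1 bonuses come from the snippets above; the LEA-only penalty and the threshold value are illustrative assumptions, because the article only calls the threshold "X" and does not show how LEA-only slots are penalized. `Access`, `scoreSlot` and `isLikelyParam` are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

struct Access {
    bool isRead;
    bool isLea;
    bool beforePrologue;
};

// Assumed values; the article does not specify either of them.
constexpr float kLeaOnlyPenalty = 0.3f;
constexpr float kThreshold = 0.65f;

float scoreSlot(int64_t offset, const std::vector<Access>& accesses) {
    bool onlyLea = true;  // stays true for an empty list, which is then penalized
    bool usedBeforePrologue = false;
    bool hasWrite = false;
    for (const Access& a : accesses) {
        if (!a.isLea) onlyLea = false;
        if (a.beforePrologue) usedBeforePrologue = true;
        if (!a.isRead && !a.isLea) hasWrite = true;
    }
    float confidence = 0.6f;                              // base confidence
    if (usedBeforePrologue) confidence += 0.1f;           // touched before the prologue
    if (offset >= 0x28 && !hasWrite) confidence += 0.1f;  // read-only stack slot
    if (onlyLea) confidence -= kLeaOnlyPenalty;           // address-of only: likely a local
    return confidence;
}

bool isLikelyParam(int64_t offset, const std::vector<Access>& accesses) {
    return scoreSlot(offset, accesses) >= kThreshold;
}
```

With these assumed constants, a slot at 0x28 that is only read scores about 0.7 and passes, while a slot at 0x30 that is written after the prologue stays at 0.6 and is rejected.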

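This ensures the parameters are arranged in the expected order: register parameters first (RCX, RDX, R8, R9), then stack parameters. That is all a simple parameter identifier needs. For something more robust, we can also look at:

(1) Pattern matching on the code just before each call to this function; combined with the matching inside the function, this is very accurate
(2) Building on (1), repeating the match at the other call sites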
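As a rough illustration of the caller-side idea, the sketch below estimates the argument count at one call site from the writes that precede the CALL, then reconciles several call sites with the callee-side estimate. It is not the article's implementation: `ArgWrite`, `estimateArgsAtCallSite` and `reconcile` are hypothetical, the rule "trust the call sites only when they all agree" is an assumption, and a real analysis would also have to rule out stale register writes that are not arguments.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// One write observed in the instructions leading up to a CALL.
struct ArgWrite {
    std::string reg;  // destination register; empty for a memory write
    int64_t rspDisp;  // for memory writes: displacement from RSP at the call site
};

// Highest argument index that the writes before one call site account for.
int estimateArgsAtCallSite(const std::vector<ArgWrite>& writes) {
    int count = 0;
    for (const ArgWrite& w : writes) {
        int index = 0;
        if (w.reg == "rcx" || w.reg == "ecx") index = 1;
        else if (w.reg == "rdx" || w.reg == "edx") index = 2;
        else if (w.reg == "r8" || w.reg == "r8d") index = 3;
        else if (w.reg == "r9" || w.reg == "r9d") index = 4;
        else if (w.reg.empty() && w.rspDisp >= 0x20 && w.rspDisp % 8 == 0) {
            // Before the CALL, the fifth argument lives at [rsp+0x20].
            index = 5 + static_cast<int>((w.rspDisp - 0x20) / 8);
        }
        count = std::max(count, index);
    }
    return count;
}

// Assumed policy: use the call sites only when they all report the same count.
int reconcile(int calleeEstimate, const std::vector<int>& callSites) {
    if (callSites.empty()) return calleeEstimate;
    const bool allAgree = std::all_of(
        callSites.begin(), callSites.end(),
        [&](int n) { return n == callSites.front(); });
    return allAgree ? callSites.front() : calleeEstimate;
}
```

A call site that sets ecx, edx, r8d and r9d and stores to `[rsp+20h]` through `[rsp+38h]` is counted as passing eight arguments, matching the example function.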

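Final Result

So, let's look at our final result. The function is reported with 8 parameters, the same as IDA's result, and the parameter list is displayed correctly as well.

Open Issues

As you can see, this is not the optimal solution: in some cases it reports one parameter too many or one too few. A better approach is to also analyze the parameters from the caller's side and then compare the two results. We will cover that in a later article.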

Reposted from: 冲鸭安全, huoji, "Introduction to IDA Internals (4): Function Parameter Identification"