2025-12-29 00:54:45 | Cybersecurity articles | Source: ZONE.CI 全球网

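Article summary: This article explains the principle behind how IDA computes function size. Looking for the first ret instruction is inaccurate, because compiler optimization can produce several ret instructions. The correct approach is control-flow analysis: follow every jump path, record every ret found within the valid range, and take the highest address as the end of the function. Comparing against IDA's result confirms that this heuristic works, and the article also explains why address validity checks matter.
Overall score: 86
Categories: reverse engineering, binary security, security tools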


An Introduction to the Principles Behind IDA (Part 2): Computing Function Size

Original

冲鸭安全

February 6, 2025, 10:02, Beijing

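Introduction

In the previous chapter we successfully obtained the list of functions:

An Introduction to the Principles Behind IDA (Part 1): Overview & Function Identification

Now we run into a new problem: computing function size.

Some reverse engineers may think that searching for ret is enough: once you know where the function starts, the first ret you find gives you its size. That is not the case. Function size computation is heuristic through and through, and it is not as easy as it looks. Let us go through the reasons step by step.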

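The problem

What happens if we simply look at the function's ret? Taking the first ret that appears makes the end trivial to determine. If you actually try it, though, you will find that the resulting size is badly off. Compilers and compiler flags differ, and a function does not necessarily end at that ret.

Take this function, for example: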

int test_function(int x) {
    volatile int a = x;
    if (a > 0) {
        for (int i = 0; i < a; i++) {
            if (i * i > a) {
                return a + i;
            }
        }
        return a + 1;
    }
    return a - 1;
}

With clang and no optimization, it compiles to this:

test_function(int):
        push    rbp
        mov     rbp, rsp
        mov     dword ptr [rbp - 8], edi
        mov     eax, dword ptr [rbp - 8]
        mov     dword ptr [rbp - 12], eax
        mov     eax, dword ptr [rbp - 12]
        cmp     eax, 0
        jle     .LBB0_8
        mov     dword ptr [rbp - 16], 0
.LBB0_2:
        mov     eax, dword ptr [rbp - 16]
        mov     ecx, dword ptr [rbp - 12]
        cmp     eax, ecx
        jge     .LBB0_7
        mov     eax, dword ptr [rbp - 16]
        imul    eax, dword ptr [rbp - 16]
        mov     ecx, dword ptr [rbp - 12]
        cmp     eax, ecx
        jle     .LBB0_5
        mov     eax, dword ptr [rbp - 12]
        add     eax, dword ptr [rbp - 16]
        mov     dword ptr [rbp - 4], eax
        jmp     .LBB0_9
.LBB0_5:
        jmp     .LBB0_6
.LBB0_6:
        mov     eax, dword ptr [rbp - 16]
        add     eax, 1
        mov     dword ptr [rbp - 16], eax
        jmp     .LBB0_2
.LBB0_7:
        mov     eax, dword ptr [rbp - 12]
        add     eax, 1
        mov     dword ptr [rbp - 4], eax
        jmp     .LBB0_9
.LBB0_8:
        mov     eax, dword ptr [rbp - 12]
        sub     eax, 1
        mov     dword ptr [rbp - 4], eax
.LBB0_9:
        mov     eax, dword ptr [rbp - 4]
        pop     rbp
        ret

With -O2 optimization enabled, it becomes this:

test_function(int):
        mov     dword ptr [rsp - 4], edi
        cmp     dword ptr [rsp - 4], 0
        mov     eax, dword ptr [rsp - 4]
        jle     .LBB0_7
        test    eax, eax
        jle     .LBB0_6
        xor     eax, eax
.LBB0_3:
        mov     ecx, eax
        imul    ecx, eax
        cmp     ecx, dword ptr [rsp - 4]
        jg      .LBB0_4
        inc     eax
        cmp     eax, dword ptr [rsp - 4]
        jl      .LBB0_3
.LBB0_6:
        mov     eax, dword ptr [rsp - 4]
        inc     eax
        ret
.LBB0_7:
        dec     eax
        ret
.LBB0_4:
        add     eax, dword ptr [rsp - 4]
        ret

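Clearly, there are now several more RETs. The reason is as follows.

Without optimization (-O0):

The compiler translates the code directly, following its literal order
All return statements usually jump to a single common return point at the end of the function
This makes debugging easier, because the execution path is more intuitive

With optimization (-O2):

The compiler tries to optimize the execution path and reduce the number of instructions
If returning directly is more efficient than jumping to the common return point, it emits multiple ret instructions
This saves the extra jump instruction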
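To make the failure concrete, here is a minimal sketch in Python. The instruction list is a hand-written toy model of the tail of the -O2 listing above: the offsets and lengths are invented for illustration and are not real encodings, and the helper names are hypothetical. Only the order of the instructions follows the listing.

```python
# Toy model of the tail of the -O2 listing: (offset, length, mnemonic).
# Offsets and lengths are invented for illustration, not real encodings.
LISTING = [
    (0x00, 4, "mov"),  # .LBB0_6
    (0x04, 2, "inc"),
    (0x06, 1, "ret"),
    (0x07, 2, "dec"),  # .LBB0_7
    (0x09, 1, "ret"),
    (0x0A, 4, "add"),  # .LBB0_4
    (0x0E, 1, "ret"),
]


def end_at_first_ret(listing):
    """Naive approach: scan linearly and stop at the first ret."""
    for offset, length, mnemonic in listing:
        if mnemonic == "ret":
            return offset + length
    return None


def end_at_last_ret(listing):
    """End just past the highest-addressed ret in the listing."""
    ends = [off + length for off, length, mnemonic in listing if mnemonic == "ret"]
    return max(ends) if ends else None


naive_end = end_at_first_ret(LISTING)
real_end = end_at_last_ret(LISTING)
print(hex(naive_end), hex(real_end))  # 0x7 0xf
```

Stopping at the first ret cuts the function off after .LBB0_6 and loses the two blocks that follow it.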

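So searching for ret directly does not work, and this leads to an awkward conclusion:

Every tool's function-size computation is heuristic and cannot identify function boundaries with full precision.

We need a smarter approach.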

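A smarter approach

To compute function size accurately, we need to analyze the function's control flow. The main idea is to follow every possible execution path until all possible end points have been found.

Specifically, we need to:

Follow every jump instruction (jmp, jz, jnz, and so on)
Analyze the multiple execution paths created by conditional branches
Find the end point of each path (a ret instruction)
Take the end point with the highest address as the end of the function

This gets us as close as possible to the function extent we are looking for.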

Implementation

The basic flow is as follows:

function FindFunctionEnd(startAddress):
    1. Disassemble the instruction at the current address
    2. If it is a return instruction, record the current address + instruction length
    3. If it is a jump instruction:
       - Validate the jump target
       - Recursively analyze the jump target
       - Continue analyzing the current path
    4. Return the farthest end address found

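To avoid repeated analysis and infinite recursion, we need to:

Use a hash table to record addresses that have already been analyzed
Check that the jump target lies within a reasonable range
Reject illegal jumps to lower addresses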
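The flow and the three safeguards above can be sketched in Python as follows. This is a sketch under stated assumptions, not the author's implementation: instructions are pre-decoded into a hypothetical `Insn` record instead of being disassembled from bytes, an explicit worklist replaces the recursion, and an unconditional jmp is assumed to end the current straight-line path. The sample listing is the small if/else snippet with its usual x86 encoding lengths.

```python
from typing import NamedTuple, Optional


class Insn(NamedTuple):
    """Hypothetical pre-decoded instruction (stands in for a real disassembler)."""
    length: int
    kind: str                     # "other", "ret", "jmp" or "jcc"
    target: Optional[int] = None  # direct jump target, if any


def find_function_end(code, start, seg_end):
    """Return the address just past the farthest reachable ret (start if none)."""
    visited = set()       # addresses already analyzed
    worklist = [start]    # paths still to follow
    end = start
    while worklist:
        addr = worklist.pop()
        # Follow one straight-line path; the loop condition is the validity check:
        # not below the function start, inside the segment, not seen before.
        while start <= addr < seg_end and addr not in visited and addr in code:
            visited.add(addr)
            insn = code[addr]
            if insn.kind == "ret":
                end = max(end, addr + insn.length)
                break
            if insn.kind in ("jmp", "jcc"):
                if insn.target is not None:
                    worklist.append(insn.target)
                if insn.kind == "jmp":
                    break  # assumption: no fall-through after an unconditional jump
            addr += insn.length
    return end


CODE = {
    0x00: Insn(3, "other"),      # cmp eax, 0
    0x03: Insn(2, "jcc", 0x0B),  # jle else_branch
    0x05: Insn(5, "other"),      # mov eax, 1
    0x0A: Insn(1, "ret"),
    0x0B: Insn(5, "other"),      # else_branch: mov eax, 2
    0x10: Insn(1, "ret"),
}

print(hex(find_function_end(CODE, 0x00, 0x11)))  # 0x11
```

The worklist is a design choice rather than a requirement: it gives the same result as the recursive form but does not depend on the interpreter's recursion limit for functions with many branches.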

Key points

Jump instruction analysis:

if (isJump(instruction)) {
    // Get the jump target
    targetAddress = getJumpTarget(instruction);

    // Validate the target address
    if (isValidTarget(targetAddress)) {
        // Recursively analyze the new path
        endAddr = max(endAddr, FindFunctionEnd(targetAddress));
    }
}

This follows every possible execution path.
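The article does not spell out getJumpTarget. For the common case of an x86 short jump (one opcode byte followed by a signed 8-bit displacement), the target is the address of the next instruction plus the displacement. A minimal sketch in Python, where `rel8_jump_target` is a hypothetical helper name and only this two-byte form is handled (near jumps with 32-bit displacements and indirect jumps are out of scope):

```python
def rel8_jump_target(addr, raw):
    """Target of a 2-byte short jump: next-instruction address + signed disp8."""
    if len(raw) != 2:
        raise ValueError("expected opcode byte + 8-bit displacement")
    disp = int.from_bytes(raw[1:2], "little", signed=True)
    return addr + len(raw) + disp


# jle +6 located at 0x03: falls through to 0x05, jumps to 0x0B
forward = rel8_jump_target(0x03, bytes([0x7E, 0x06]))

# jmp -2 located at 0x10: the displacement is negative, so it jumps back to itself
backward = rel8_jump_target(0x10, bytes([0xEB, 0xFE]))

print(hex(forward), hex(backward))  # 0xb 0x10
```

Because the displacement is signed, the same computation yields both forward targets and backward targets, which is why the target still has to be validated before it is followed.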

Example code:
if (x > 0) {
    return 1;
} else {
    return 2;
}

Assembly:
    cmp eax, 0
    jle else_branch   // conditional jump, creates two paths
    mov eax, 1
    ret               // end point of path 1
else_branch:
    mov eax, 2
    ret               // end point of path 2

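If jumps are not analyzed, the ret in the else branch is missed and the computed function size is wrong.

Note also that jumps upward past the function's start (to an address lower than the start) are not followed. Such a jump is treated as meaningless, because we assume the code is laid out from top to bottom. That is why an address validity check is also needed.

Address validity check code: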

bool isValidTarget(targetAddress) {
    // Do not allow jumps to addresses below the function start
    if (targetAddress < functionStart)
        return false;

    // Do not allow jumps out of the code segment
    if (targetAddress > codeSegmentEnd)
        return false;

    // Avoid repeated analysis
    if (alreadyAnalyzed(targetAddress))
        return false;

    return true;
}

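The validity check exists to prevent misidentification caused by jumps back to a lower address outside the function, as shown below: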

function_A:
    ...
function_B:
    jmp function_A   // if this jump were followed, function_A could be mistaken for part of function_B

It also prevents cross-segment access:

.text:
    function_start:
        jmp data_section  // jumping into the data segment is not allowed
.data:
    data_section:
        db "Hello"
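The two situations above can be run through a small Python version of the validity check. This is a sketch under assumptions: all addresses are hypothetical, the code segment end is treated as exclusive (so the comparison is `>=` rather than the `>` used in the pseudocode), and the already-analyzed addresses are passed in as a set.

```python
def is_valid_target(target, function_start, code_segment_end, analyzed):
    """Return True if a jump target should be followed."""
    if target < function_start:     # jump back past the start of the function
        return False
    if target >= code_segment_end:  # jump out of the code segment (end is exclusive)
        return False
    if target in analyzed:          # already covered by an earlier path
        return False
    return True


# Hypothetical layout, for illustration only.
FUNCTION_A = 0x1000
FUNCTION_B = 0x1040
TEXT_END = 0x2000
DATA_SECTION = 0x3000
analyzed = {0x1040}

# function_B: jmp function_A -> rejected, target lies below function_B
jump_to_a = is_valid_target(FUNCTION_A, FUNCTION_B, TEXT_END, analyzed)

# jmp data_section -> rejected, target lies outside .text
jump_to_data = is_valid_target(DATA_SECTION, FUNCTION_B, TEXT_END, analyzed)

# forward jump inside function_B -> accepted
jump_inside = is_valid_target(0x1050, FUNCTION_B, TEXT_END, analyzed)

print(jump_to_a, jump_to_data, jump_inside)  # False False True
```

Note that only targets below the function's start are rejected. A backward jump that stays inside the function, such as a loop, passes the first check and is stopped by the already-analyzed check instead.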

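With that, we can finally look for the last RET with confidence.

Testing

For the code above, IDA reports a size of 0x50, and the implementation above also yields 0x50. Looking only at a single ret, we would not have been able to match it.

To be continued

In the next chapter we will cover how to do control-flow access control for programs. It will be published as soon as this article passes 1,000 reads!

Members of the 鸭鸭 fan club can ask 鸭哥 in this official account's WeChat group for this chapter's DEMO and for guidance (if you are interested in building something like IDA). Thanks to everyone for the support!

If you have not joined yet, send a private message to join.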


This article is reposted from: 冲鸭安全, 《IDA背后的原理入门(二): 函数大小计算》