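Protobuf Protocol Analysis from an Android Reverse-Engineering Perspective (Part 3): Frida Hooks, Countermeasure Bypasses, and a Tool Quick Reference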

Posted by admin on 2026-04-16 04:03:04 · Category: network security articles · Source: ZONE.CI

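Summary: This is the final installment of the series on analyzing Protobuf protocols in Android reverse engineering. It focuses on hands-on dynamic analysis: using Frida to hook Protobuf serialization and deserialization at runtime, including writeTo, parseFrom, and field-level CodedOutputStream operations, together with strategies for common countermeasures and a tool quick-reference table. Overall score: 89. Categories: reverse engineering, mobile security, security tools.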



Original article by 泡泡以安 · April 3, 2026, 09:09 · Zhejiang


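Series note: this is the third and final article in the series "Protobuf Protocol Analysis from an Android Reverse-Engineering Perspective", and it focuses on hands-on dynamic analysis. The first two articles covered the theory and static reconstruction. This one moves to the runtime level: using Frida to capture and tamper with protobuf data live, and dealing with common countermeasures.

Part 1 [Fundamentals]: Protobuf concepts, Wire Format encoding, traffic identification

Part 2 [Decoding and Reconstruction]: how to decode protobuf data and reconstruct .proto definitions from code, descriptors, and network data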


Contents

  • 6. Dynamic Hooking with Frida in Practice
  • 7. Common Countermeasures and Bypasses
  • 8. Tool Quick Reference
  • Appendix: Practical Cheat Sheet

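Where we are: after the first two articles we can (1) tell whether an app uses protobuf, (2) decode binary data into readable fields, and (3) reconstruct .proto definitions through static analysis. Static analysis has a limit, though: once the data is encrypted, obfuscated, or run through custom processing, offline decoding alone is often not enough. The value of dynamic hooking with Frida is that, no matter how many layers wrap the payload on the wire, the message is in plaintext form just before protobuf serializes it and just after protobuf deserializes it. That is the fundamental advantage of dynamic analysis.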


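6. Dynamic Hooking with Frida in Practice

Static analysis can recover the .proto definition, while dynamic analysis with Frida hooks can capture, decode, and tamper with protobuf data at runtime. Combining the two gives the most complete approach to reversing a protobuf protocol.

6.1 Hook writeTo: intercepting serialization

Intercept the serialization exit point shared by all protobuf messages and capture the plaintext before it is sent. The script hooks toByteArray(), the byte-array entry point that internally drives writeTo():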

// hook_protobuf_writeto.js
// Intercept serialization of every protobuf message and capture the serialized bytes
Java.perform(function () {

    // Hook toByteArray() through GeneratedMessageLite (the base class of protobuf-lite messages).
    // toByteArray() is inherited from AbstractMessageLite, so the hook also fires for
    // any other message class that shares that implementation.
    var MessageLite = Java.use("com.google.protobuf.GeneratedMessageLite");

    MessageLite.toByteArray.implementation = function () {
        // Call the original method to obtain the serialized result
        var result = this.toByteArray();
        // Runtime class name of the concrete message (the subclass, not the hooked base class)
        var className = this.getClass().getName();

        console.log("\n[*] Protobuf Serialize: " + className);
        console.log("[*] Size: " + result.length + " bytes");

        // Convert byte[] to a hex string (at most 512 bytes, to avoid flooding the console)
        var hex = "";
        for (var i = 0; i < result.length && i < 512; i++) {
            hex += ("0" + (result[i] & 0xFF).toString(16)).slice(-2) + " ";
        }
        console.log("[*] Hex: " + hex.trim());

        // Try toString() (some message classes generate readable output)
        try {
            console.log("[*] Content: " + this.toString());
        } catch (e) {
            // toString may be unimplemented or may throw; ignore
        }

        // Save the raw bytes for offline analysis with protoc.
        // The app process may not be allowed to write here; a failure must not break the app,
        // so the write is best effort (the app's own data directory is a fallback location).
        try {
            var path = "/data/local/tmp/pb_out_" + Date.now() + ".bin";
            var fos = Java.use("java.io.FileOutputStream").$new(path);
            fos.write(result);
            fos.close();
            console.log("[*] Saved to: " + path);
        } catch (e) {
            console.log("[!] Could not save capture: " + e);
        }

        // Return the original result unchanged
        return result;
    };
});

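When to use it: this hook intercepts the serialization exit point (toByteArray) of every protobuf-lite message, so you do not need to know any specific message class name. The downside is volume: an app that serializes protobuf frequently produces a lot of output. Filter on className to focus on the message classes you care about.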
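The class-name filtering can also be done after the fact on the PC. The sketch below is a hypothetical helper (the name `parse_hook_log` and the sample class names are assumptions, not part of the original script). It relies only on the `[*] Protobuf Serialize:` and `[*] Hex:` line formats printed by the hook above, and rebuilds the raw bytes for messages whose class name starts with a given prefix.

```python
from typing import Iterable, List, Tuple

SERIALIZE_PREFIX = "[*] Protobuf Serialize:"
HEX_PREFIX = "[*] Hex:"


def parse_hook_log(lines: Iterable[str], class_prefix: str = "") -> List[Tuple[str, bytes]]:
    """Pair each 'Protobuf Serialize' line with the 'Hex' line that follows it."""
    captures: List[Tuple[str, bytes]] = []
    current = None
    for raw_line in lines:
        line = raw_line.strip()
        if line.startswith(SERIALIZE_PREFIX):
            current = line[len(SERIALIZE_PREFIX):].strip()
        elif line.startswith(HEX_PREFIX) and current is not None:
            if current.startswith(class_prefix):
                # bytes.fromhex ignores the spaces between byte pairs
                captures.append((current, bytes.fromhex(line[len(HEX_PREFIX):].strip())))
            current = None
    return captures


if __name__ == "__main__":
    sample_log = [
        "[*] Protobuf Serialize: com.example.app.proto.LoginRequest",
        "[*] Size: 5 bytes",
        "[*] Hex: 0a 03 62 6f 62",
    ]
    for name, payload in parse_hook_log(sample_log, "com.example."):
        print(name, payload.hex())
```

Because the hook prints at most 512 bytes of hex, anything longer is truncated in the log; for large messages the saved .bin files are the reliable source.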

6.2 Hook parseFrom: intercepting deserialization

Intercept the deserialization entry point of a specific message class and capture the data as it is received:

// hook_protobuf_parsefrom.js
// Intercept deserialization of one message class; capture the raw bytes and the parsed object
Java.perform(function () {

    // Target message class (replace with the real class name)
    var TargetMessage = Java.use("com.example.app.proto.UserResponse");

    // Hook the parseFrom(byte[]) overload, the most common deserialization entry point
    TargetMessage.parseFrom.overload('[B').implementation = function (data) {
        // parseFrom is static, so there is no instance to call getClass() on;
        // read the class name from the wrapper instead
        console.log("\n[*] parseFrom called on: " + TargetMessage.$className);

        // Print the raw bytes as hex
        var hex = "";
        for (var i = 0; i < data.length && i < 512; i++) {
            hex += ("0" + (data[i] & 0xFF).toString(16)).slice(-2) + " ";
        }
        console.log("[*] Raw data (" + data.length + " bytes): " + hex.trim());

        // Save the raw bytes to a file (best effort; the app may lack write permission here)
        try {
            var path = "/data/local/tmp/pb_in_" + Date.now() + ".bin";
            var fos = Java.use("java.io.FileOutputStream").$new(path);
            fos.write(data);
            fos.close();
            console.log("[*] Saved to: " + path);
        } catch (e) {
            console.log("[!] Could not save capture: " + e);
        }

        // Call the original method to deserialize
        var result = this.parseFrom(data);

        // Print the parsed object (if toString is useful)
        try {
            console.log("[*] Parsed: " + result.toString());
        } catch (e) {}

        return result;
    };
});

Note: parseFrom has several overloads (byte[], CodedInputStream, InputStream, and others). If the byte[] hook never fires, try the other overloads. TargetMessage.parseFrom.overloads lists every overload signature.

6.3 Hook CodedOutputStream: field-level interception

The hooks above work at the level of a whole message. To see each individual field as it is written, hook the writeXxx methods of CodedOutputStream:

// hook_coded_output.js
// Intercept every field write and log the field_number, type, and value directly
// Caveat: in recent protobuf-java releases several of these methods are abstract on
// CodedOutputStream and implemented in nested encoder classes (for example
// CodedOutputStream$ArrayEncoder). If a hook never fires, target the concrete class instead.
Java.perform(function () {

    var CodedOutputStream = Java.use("com.google.protobuf.CodedOutputStream");

    // writeString: all string fields
    CodedOutputStream.writeString.implementation = function (fieldNumber, value) {
        console.log("[PB] writeString field=" + fieldNumber + " value=\"" + value + "\"");
        return this.writeString(fieldNumber, value);
    };

    // writeInt32: all int32 fields
    CodedOutputStream.writeInt32.implementation = function (fieldNumber, value) {
        console.log("[PB] writeInt32 field=" + fieldNumber + " value=" + value);
        return this.writeInt32(fieldNumber, value);
    };

    // writeInt64: all int64 fields
    CodedOutputStream.writeInt64.implementation = function (fieldNumber, value) {
        console.log("[PB] writeInt64 field=" + fieldNumber + " value=" + value);
        return this.writeInt64(fieldNumber, value);
    };

    // writeBool: all bool fields
    CodedOutputStream.writeBool.implementation = function (fieldNumber, value) {
        console.log("[PB] writeBool field=" + fieldNumber + " value=" + value);
        return this.writeBool(fieldNumber, value);
    };

    // writeEnum: all enum fields
    CodedOutputStream.writeEnum.implementation = function (fieldNumber, value) {
        console.log("[PB] writeEnum field=" + fieldNumber + " value=" + value);
        return this.writeEnum(fieldNumber, value);
    };

    // writeBytes: all bytes fields (explicit overload, in case the runtime defines several)
    CodedOutputStream.writeBytes.overload('int', 'com.google.protobuf.ByteString').implementation = function (fieldNumber, value) {
        console.log("[PB] writeBytes field=" + fieldNumber + " len=" + value.size());
        return this.writeBytes(fieldNumber, value);
    };

    // writeMessage: all nested message fields (explicit overload, same reason)
    CodedOutputStream.writeMessage.overload('int', 'com.google.protobuf.MessageLite').implementation = function (fieldNumber, value) {
        console.log("[PB] writeMessage field=" + fieldNumber +
                    " class=" + value.getClass().getName());
        return this.writeMessage(fieldNumber, value);
    };
});

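Best use: field-level hooks are particularly handy when the message class name is unknown. You do not need to know which message is being written, only that every protobuf field is ultimately written through CodedOutputStream. The output feeds directly into reconstructing the .proto definition, because each writeXxx method name maps to a proto type.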
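As a rough illustration of how the writeXxx names map onto proto types, the following sketch turns `[PB] ...` log lines into a draft message definition. Everything here is an assumption made for illustration: the helper name `draft_proto`, the placeholder field names `field_N`, and above all the premise that the lines passed in belong to a single message (the hook is global, so real logs interleave several messages and need to be split first). Enum and nested-message fields are emitted as wire-compatible placeholders because the log does not reveal their definitions.

```python
import re
from typing import Dict, Iterable, Tuple

# writeXxx method name -> (wire-compatible proto type, optional note)
WRITE_TO_PROTO: Dict[str, Tuple[str, str]] = {
    "writeString": ("string", ""),
    "writeInt32": ("int32", ""),
    "writeInt64": ("int64", ""),
    "writeBool": ("bool", ""),
    "writeEnum": ("int32", "enum on the wire, value names unknown"),
    "writeBytes": ("bytes", ""),
    "writeMessage": ("bytes", "nested message, decode separately"),
}

LINE_RE = re.compile(r"\[PB\] (write\w+) field=(\d+)")


def draft_proto(lines: Iterable[str], message_name: str = "Draft") -> str:
    """Build a draft message definition from field-level hook output."""
    fields: Dict[int, Tuple[str, str]] = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(1) in WRITE_TO_PROTO:
            # Keep the first type seen for each field number
            fields.setdefault(int(match.group(2)), WRITE_TO_PROTO[match.group(1)])
    out = ["message %s {" % message_name]
    for number in sorted(fields):
        proto_type, note = fields[number]
        suffix = "  // " + note if note else ""
        out.append("  %s field_%d = %d;%s" % (proto_type, number, number, suffix))
    out.append("}")
    return "\n".join(out)


if __name__ == "__main__":
    print(draft_proto([
        '[PB] writeString field=1 value="alice"',
        "[PB] writeInt32 field=2 value=30",
    ]))
```

The draft only fixes field numbers and wire-compatible types; meaningful names still have to come from the FIELD_NUMBER constants or from business context.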

6.4 Enumerating all protobuf message classes

When the target message class name is unknown, enumerate every loaded protobuf message class in the app together with its field information:

// enum_protobuf_classes.js
// Enumerate all loaded protobuf message classes and their FIELD_NUMBER constants
Java.perform(function () {

    Java.enumerateLoadedClasses({
        onMatch: function (className) {
            try {
                // Skip Builder classes and anonymous inner classes to reduce noise.
                // Nested message classes such as Outer$Message are kept on purpose.
                if (/\$Builder$/.test(className) || /\$\d+$/.test(className)) return;

                var clz = Java.use(className);
                var superClass = clz.class.getSuperclass();

                if (superClass != null) {
                    var superName = superClass.getName();
                    // Is the direct superclass a protobuf message base class?
                    // "GeneratedMessage" also matches GeneratedMessageLite and GeneratedMessageV3.
                    if (superName.indexOf("GeneratedMessage") !== -1) {

                        console.log("[PROTO] " + className);

                        // Use reflection to read every FIELD_NUMBER constant
                        var fields = clz.class.getDeclaredFields();
                        for (var i = 0; i < fields.length; i++) {
                            var name = fields[i].getName();
                            // Only static constants ending in _FIELD_NUMBER are of interest
                            if (name.endsWith("_FIELD_NUMBER")) {
                                fields[i].setAccessible(true);
                                var val = fields[i].getInt(null);
                                console.log("  " + name + " = " + val);
                            }
                        }
                    }
                }
            } catch (e) {
                // Some classes cannot be loaded; ignore them
            }
        },
        onComplete: function () {
            console.log("[*] Enumeration complete");
        }
    });
});

Sample output

[PROTO] com.example.app.proto.UserInfo
  ID_FIELD_NUMBER = 1
  NAME_FIELD_NUMBER = 2
  EMAIL_FIELD_NUMBER = 3
[PROTO] com.example.app.proto.LoginRequest
  TOKEN_FIELD_NUMBER = 1
  DEVICE_ID_FIELD_NUMBER = 2
[*] Enumeration complete

Combined with the output of the writeTo hook, this information is usually enough to reconstruct a working .proto definition.
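To carry the enumeration result into later tooling, the console output can be parsed into a lookup table. The sketch below is a hypothetical helper (`parse_enumeration` is not part of the original scripts); it assumes the exact `[PROTO] <class>` and `<NAME>_FIELD_NUMBER = <n>` line shapes shown in the sample output and converts each constant name to a lower-case field name.

```python
from typing import Dict, Iterable

SUFFIX = "_FIELD_NUMBER"


def parse_enumeration(lines: Iterable[str]) -> Dict[str, Dict[int, str]]:
    """Map each message class to {field_number: field_name}."""
    messages: Dict[str, Dict[int, str]] = {}
    current = None
    for raw_line in lines:
        line = raw_line.strip()
        if line.startswith("[PROTO] "):
            current = line[len("[PROTO] "):]
            messages[current] = {}
        elif current is not None and SUFFIX + " = " in line:
            const, _, value = line.partition(" = ")
            # ID_FIELD_NUMBER -> id, DEVICE_ID_FIELD_NUMBER -> device_id
            messages[current][int(value)] = const[: -len(SUFFIX)].lower()
    return messages


if __name__ == "__main__":
    table = parse_enumeration([
        "[PROTO] com.example.app.proto.LoginRequest",
        "  TOKEN_FIELD_NUMBER = 1",
        "  DEVICE_ID_FIELD_NUMBER = 2",
        "[*] Enumeration complete",
    ])
    print(table)
```

A table like this can later label the numeric fields produced by a raw decoder, provided the app's class and constant names have not been stripped by an obfuscator.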

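6.5 Modifying protobuf data (tampering with requests)

Hooking the protobuf Builder makes it possible to change field values before a request is sent, which is very useful when testing payment logic, permission checks, and similar scenarios: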

// tamper_protobuf.js
// Tamper with a field in a protobuf request (example: changing the purchase price)
Java.perform(function () {

    // Hook the Builder's build() method, the last step of message construction
    var BUILDER_CLASS = "com.example.app.proto.PurchaseRequest$Builder";
    var Builder = Java.use(BUILDER_CLASS);

    Builder.build.implementation = function () {
        // With protobuf-lite, build() may be inherited from a shared base Builder, so the
        // hook can fire for other message types too; pass those through untouched.
        if (this.getClass().getName() !== BUILDER_CLASS) {
            return this.build();
        }
        var builder = Java.cast(this, Builder);

        // Print the original values
        console.log("[*] Original price: " + builder.getPrice());
        console.log("[*] Original item_id: " + builder.getItemId());

        // Set the price to 0
        builder.setPrice(0);
        console.log("[*] Tampered price: " + builder.getPrice());

        // Call the original build() to construct the tampered message
        return this.build();
    };
});

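Defensive view: this is also why the server must not trust a price field submitted by the client. Protobuf's binary encoding offers no protection here, because an attacker can change any field with Frida. Sensitive fields such as price and quantity should be recomputed and validated on the server.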
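A minimal sketch of that server-side rule, under stated assumptions: the catalog, the SKU identifiers, and the `PurchaseRequest` shape below are all hypothetical stand-ins for whatever the real backend uses. The point is only that the client-supplied price never enters the calculation.

```python
from dataclasses import dataclass

# Hypothetical authoritative price list, in cents, owned by the server
CATALOG = {"sku-1001": 1999, "sku-1002": 499}


@dataclass
class PurchaseRequest:
    item_id: str
    quantity: int
    price: int  # client-supplied and therefore untrusted


def authoritative_total(request: PurchaseRequest) -> int:
    """Compute the amount to charge from server-side data only."""
    if request.item_id not in CATALOG:
        raise ValueError("unknown item: %r" % request.item_id)
    if request.quantity <= 0:
        raise ValueError("quantity must be positive")
    # request.price is deliberately ignored
    return CATALOG[request.item_id] * request.quantity


if __name__ == "__main__":
    tampered = PurchaseRequest(item_id="sku-1001", quantity=2, price=0)
    print(authoritative_total(tampered))
```

With this design the tampering shown above changes nothing about what is charged; at most the server can log the mismatch between the submitted and the computed price as a signal.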

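6.6 Frida + protoc: live decoding

Send the protobuf bytes captured by Frida to the PC with send(), and let a Python script receive them and call protoc --decode_raw. The result is a decode-as-you-tap workflow for live analysis.

Device-side Frida script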

// realtime_decode.js
// Capture serialized protobuf data and forward it to the PC with send()
Java.perform(function () {

    var MessageLite = Java.use("com.google.protobuf.GeneratedMessageLite");

    MessageLite.toByteArray.implementation = function () {
        var result = this.toByteArray();
        var className = this.getClass().getName();

        // Encode byte[] as Base64 so it can travel inside the JSON payload of send()
        var Base64 = Java.use("android.util.Base64");
        var b64 = Base64.encodeToString(result, 2);  // 2 = Base64.NO_WRAP (0 would be DEFAULT, which inserts line breaks)

        // send() delivers the payload to the on_message callback on the PC
        send({
            type: "protobuf",
            class: className,
            data: b64
        });

        return result;
    };
});

PC-side Python receiver

import base64
import subprocess
import sys

import frida


def on_message(message: dict, data: bytes) -> None:
    """
    Frida message callback: receive protobuf data from the device and decode it live.
    """
    if message['type'] == 'send' and message['payload'].get('type') == 'protobuf':
        cls = message['payload']['class']
        raw = base64.b64decode(message['payload']['data'])

        # Raw-decode with protoc --decode_raw. The input is binary, so the pipe runs in
        # bytes mode (no text=True) and the output is decoded explicitly.
        result = subprocess.run(
            ['protoc', '--decode_raw'],
            input=raw,
            capture_output=True,
            timeout=5
        )
        decoded = result.stdout.decode('utf-8', errors='replace')

        # Formatted output
        print(f"\n{'=' * 60}")
        print(f"Class: {cls}")
        print(f"Size: {len(raw)} bytes")
        print(f"Decoded:\n{decoded}")

    elif message['type'] == 'error':
        # Runtime error inside the Frida script
        print(f"[ERROR] {message['stack']}")


def main() -> None:
    # Connect to the USB device and attach to the target process
    device = frida.get_usb_device()
    # Replace with the target. Depending on the Frida version, attach() expects a
    # process name or a PID rather than a package name.
    session = device.attach("com.example.app")

    # Load the Frida script
    with open("realtime_decode.js") as f:
        script = session.create_script(f.read())

    script.on('message', on_message)
    script.load()

    print("[*] Listening for protobuf messages... Press Ctrl+C to quit.")
    try:
        sys.stdin.read()  # Keep running and wait for messages
    except KeyboardInterrupt:
        session.detach()
        print("\n[*] Detached.")


if __name__ == '__main__':
    main()

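Going further: on the PC side, blackboxprotobuf can replace protoc --decode_raw, and a maintained typedef mapping then gives live decoding with field names. The decoded results can also be written to SQLite or a JSON file for later batch analysis.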
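The persistence idea can be sketched with the standard-library `sqlite3` module. The table layout and the helper names (`open_store`, `store_capture`) are assumptions chosen for illustration; the payload dictionary has the same `class` and `data` keys that the device-side script passes to send().

```python
import base64
import sqlite3
import time


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the capture database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS captures ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " ts REAL NOT NULL,"
        " class_name TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " raw BLOB NOT NULL,"
        " decoded TEXT)"
    )
    return conn


def store_capture(conn: sqlite3.Connection, payload: dict, decoded: str = "") -> int:
    """Store one captured message; returns the number of raw bytes stored."""
    raw = base64.b64decode(payload["data"])
    conn.execute(
        "INSERT INTO captures (ts, class_name, size, raw, decoded) VALUES (?, ?, ?, ?, ?)",
        (time.time(), payload["class"], len(raw), raw, decoded),
    )
    conn.commit()
    return len(raw)


if __name__ == "__main__":
    db = open_store()
    sample = {
        "type": "protobuf",
        "class": "com.example.app.proto.LoginRequest",
        "data": base64.b64encode(b"\x08\x96\x01").decode("ascii"),
    }
    store_capture(db, sample, decoded="1: 150")
    print(db.execute("SELECT class_name, size FROM captures").fetchall())
```

Calling `store_capture` from inside `on_message` keeps both the raw bytes and the decoded text, so messages can be re-decoded later once better type information is available.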


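7. Common Countermeasures and Bypasses

In real-world reversing, apps may take various measures to make protobuf analysis harder. The common ones, and how to get around them, are listed below.

7.1 Custom serialization

Some apps do not use the standard protobuf library (com.google.protobuf.*) and implement the protobuf encoding and decoding logic themselves: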

// A hand-written lightweight protobuf encoder (no dependency on Google's protobuf library)
public byte[] encode() {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    // Build each tag by hand: (field_number << 3) | wire_type
    writeVarint(bos, (1 << 3) | 0);  // field 1, wire_type=0 (Varint)
    writeVarint(bos, this.userId);
    writeVarint(bos, (2 << 3) | 2);  // field 2, wire_type=2 (Length-delimited)
    writeBytes(bos, this.name.getBytes("UTF-8"));  // must emit the length varint, then the bytes
    return bos.toByteArray();
}

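Bypass strategy

The Wire Format is a public specification, and a custom implementation has to follow the same encoding rules (otherwise the server could not decode the data). Therefore:

  • Search the code for telltale operations such as writeVarint, << 3, and & 0x07 to locate the custom encoding logic
  • Nothing changes at the data level, so protoc --decode_raw still decodes it normally
  • Hooking the custom encode() / decode() methods is enough to capture the data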
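To make the "same rules, same bytes" point concrete, here is a minimal pure-Python sketch of the tag and varint arithmetic. It rebuilds the bytes that the hand-written encode() above would produce for userId = 150 and name = "bob" (both values are illustrative, and the sketch assumes that the app's writeBytes helper emits a length varint before the payload), then walks them back with a small raw decoder. The function names are assumptions; the decoder handles only wire types 0, 1, 2, and 5 and non-negative varints.

```python
from typing import List, Tuple, Union

Value = Union[int, bytes]


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_raw(buf: bytes) -> List[Tuple[int, int, Value]]:
    """Walk top-level fields and return (field_number, wire_type, value) tuples."""
    fields: List[Tuple[int, int, Value]] = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        value: Value
        if wire_type == 0:      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:    # 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:    # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:    # 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
        fields.append((number, wire_type, value))
    return fields


name = "bob".encode("utf-8")
payload = (
    encode_varint((1 << 3) | 0) + encode_varint(150)
    + encode_varint((2 << 3) | 2) + encode_varint(len(name)) + name
)

if __name__ == "__main__":
    print(payload.hex())
    print(decode_raw(payload))
```

The same `<< 3` and `& 0x07` arithmetic that identifies a custom encoder in decompiled code is what the decoder uses to split each tag back into field number and wire type.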

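7.2 Outer encryption and compression

Many apps compress and/or encrypt the data after protobuf serialization and before sending:

Send path:    [raw protobuf] → [gzip/zstd compression] → [AES/ChaCha20 encryption] → [network send]
Receive path: [network receive] → [decrypt] → [decompress] → [protobuf deserialization]

Bypass strategy

The key idea is to hook a point before encryption or after decryption, so that what you capture is plaintext protobuf:

  1. Hook the protobuf layer (most reliable): hook at writeTo / toByteArray / parseFrom. At this level the data is plaintext protobuf, no matter how many layers of encryption or compression wrap it
  2. Hook the compression layer: hook GZIPOutputStream.write() / GZIPInputStream.read() to capture the data before compression or after decompression
  3. Hook the encryption layer: hook Cipher.doFinal() to capture the data before encryption or after decryption
  4. Peel layer by layer: if the exact encryption or compression implementation is unknown, start at the network layer (an OkHttp Interceptor) and hook inward step by step until you reach data that protoc --decode_raw decodes successfully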
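For the layer-peeling step, a simple offline check helps decide whether a captured buffer is already plaintext protobuf or still wrapped in compression. The sketch below is a heuristic under stated assumptions: `looks_like_protobuf` and `peel_layers` are hypothetical helper names, only gzip and zlib from the standard library are tried (zstd and any encryption layer are out of scope), and a clean structural parse does not prove the data is protobuf, so false positives are possible on short inputs.

```python
import gzip
import zlib
from typing import Optional, Tuple


def _read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    result = 0
    shift = 0
    while True:
        byte = buf[pos]  # IndexError when the buffer ends mid-varint
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def looks_like_protobuf(buf: bytes) -> bool:
    """True if the whole buffer parses as a sequence of well-formed fields."""
    if not buf:
        return False
    pos = 0
    try:
        while pos < len(buf):
            tag, pos = _read_varint(buf, pos)
            number, wire_type = tag >> 3, tag & 0x07
            if number == 0:
                return False
            if wire_type == 0:
                _, pos = _read_varint(buf, pos)
            elif wire_type == 1:
                pos += 8
            elif wire_type == 2:
                length, pos = _read_varint(buf, pos)
                pos += length
            elif wire_type == 5:
                pos += 4
            else:
                return False
    except IndexError:
        return False
    return pos == len(buf)


def peel_layers(blob: bytes) -> Optional[Tuple[str, bytes]]:
    """Return (layer_name, plaintext) for the first candidate that parses, else None."""
    attempts = [("raw", lambda b: b), ("gzip", gzip.decompress), ("zlib", zlib.decompress)]
    for name, transform in attempts:
        try:
            candidate = transform(blob)
        except (OSError, EOFError, zlib.error):
            continue
        if looks_like_protobuf(candidate):
            return name, candidate
    return None


if __name__ == "__main__":
    message = bytes.fromhex("0896011203626f62")
    print(peel_layers(gzip.compress(message)))
```

If `peel_layers` returns None for everything captured at a given hook point, the data is probably still encrypted there, which is the cue to move the hook one layer further in.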

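7.3 Field-name obfuscation

Some code obfuscators replace meaningful field names with a, b, c. (Strictly speaking they rename the generated Java classes and members; the effect is the same as if the names in the .proto had been replaced.) The field number and wire type cannot be obfuscated, because they are encoded in the binary data and changing them would break decoding on the server:

// Before obfuscation (original .proto)
message UserInfo {
  string username = 1;
  int32 age = 2;
}

// After obfuscation (names replaced, field numbers unchanged)
message a {
  string a = 1;  // field number is still 1
  int32 b = 2;   // field number is still 2
}

Impact on reversing

  • The Wire Format encoding is identical, so decoding is unaffected
  • The meaningful field names are lost and have to be inferred from business semantics
  • Multi-sample comparison (Part 2, Section 5.5) and correlation with UI actions help restore the field names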
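The multi-sample comparison can be partly automated once each capture has been decoded into a field-number-to-value mapping. The sketch below is a hypothetical helper (`varying_fields` and the sample values are assumptions): it reports which field numbers change between captures, which is often the first hint about what an obfuscated field means when each capture corresponds to a known UI action.

```python
from typing import Any, Dict, List


def varying_fields(samples: List[Dict[int, Any]]) -> Dict[int, List[Any]]:
    """Return {field_number: [value per sample]} for fields whose value differs."""
    numbers = sorted(set().union(*[sample.keys() for sample in samples]))
    result: Dict[int, List[Any]] = {}
    for number in numbers:
        values = [sample.get(number) for sample in samples]
        # Compare by repr so unhashable values (lists, dicts) are handled too
        if len({repr(value) for value in values}) > 1:
            result[number] = values
    return result


if __name__ == "__main__":
    login_as_alice = {1: "alice", 2: 30, 3: b"\x01"}
    login_as_bob = {1: "bob", 2: 30, 3: b"\x01"}
    print(varying_fields([login_as_alice, login_as_bob]))
```

In this illustration only field 1 differs between the two logins, which suggests it carries the account name; fields that never vary across many samples are more likely constants such as a client version.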

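7.4 Protobuf Nano / Lite without descriptors

Android apps most commonly use protobuf-lite or the deprecated protobuf-nano. To keep the APK small, neither includes descriptor information. This means:

  • The .proto cannot be reconstructed automatically from descriptors (the method in Part 2, Section 5.3 does not apply)
  • Automated tools such as PBTK may fail

Bypass strategy: the only option is manual reconstruction by analyzing the writeTo / mergeFrom methods (the method in Part 2, Section 5.1). It is more work, but the result is the most precise.

How to tell lite from the full runtime

  • Full: contains classes and methods such as com.google.protobuf.Descriptors, FileDescriptor, and getDescriptor()
  • Lite: only com.google.protobuf.GeneratedMessageLite, with no Descriptor-related classes
  • Nano: uses the com.google.protobuf.nano.MessageNano base class (deprecated, but still present in older apps)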
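These checks are easy to script against a list of class names, for example one exported from a decompiler. The sketch below is a hypothetical helper (`classify_protobuf_runtime` is an assumed name). It treats the presence of Descriptors classes as the sign of the full runtime, since the full runtime also ships GeneratedMessageLite, and it assumes the protobuf runtime classes have not been renamed by an obfuscator.

```python
from typing import Iterable, List


def classify_protobuf_runtime(class_names: Iterable[str]) -> List[str]:
    """Return the protobuf runtime flavours suggested by the class names."""
    names = set(class_names)
    flavours: List[str] = []
    has_descriptors = any(n.startswith("com.google.protobuf.Descriptors") for n in names)
    if has_descriptors:
        flavours.append("full")
    elif "com.google.protobuf.GeneratedMessageLite" in names:
        flavours.append("lite")
    if "com.google.protobuf.nano.MessageNano" in names:
        flavours.append("nano")
    return flavours


if __name__ == "__main__":
    print(classify_protobuf_runtime([
        "com.google.protobuf.GeneratedMessageLite",
        "com.example.app.proto.UserInfo",
    ]))
```

An empty result does not prove the app is protobuf-free: the runtime may have been renamed, or the encoding may be hand-written as in Section 7.1.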

7.5 Protobuf in the native layer

When the protobuf logic lives in a .so file (the C++ protobuf library), Java-level hooks no longer apply and the analysis has to move to the native layer:

Static analysis

# Search the dynamic symbol table for protobuf-related symbols
nm -D libnative.so | grep -i protobuf
# Search the ELF symbol tables
readelf -s libnative.so | grep -i protobuf

# Key symbols to search for in IDA Pro / Ghidra:
# google::protobuf::MessageLite::SerializeToString
# google::protobuf::MessageLite::ParseFromString
# google::protobuf::io::CodedOutputStream::WriteTag
# google::protobuf::io::CodedInputStream::ReadTag

Hooking native protobuf with Frida

// hook_native_protobuf.js
// Hook the C++ protobuf SerializeToString method
// Mangled name of google::protobuf::MessageLite::SerializeToString(std::string*) const,
// assuming the NDK's libc++ (inline namespace std::__ndk1). Confirm the exact symbol
// with `nm -D libnative.so | grep SerializeToString` before relying on it.
var SYMBOL = "_ZNK6google8protobuf11MessageLite17SerializeToStringEPNSt6__ndk112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEE";
var target = Module.findExportByName("libnative.so", SYMBOL);

// Read a libc++ std::string (default ABI layout, little-endian):
//   short form: byte 0 = size << 1 (low bit clear), characters start at byte 1
//   long form:  [capacity with low bit set, size, data pointer]
function readStdString(p) {
    var first = p.readU8();
    if ((first & 1) === 0) {
        return { data: p.add(1), size: first >> 1 };
    }
    return {
        data: p.add(2 * Process.pointerSize).readPointer(),
        size: Number(p.add(Process.pointerSize).readULong())
    };
}

if (target === null) {
    console.log("[!] Symbol not found in libnative.so (not exported, or library not loaded yet)");
} else {
    Interceptor.attach(target, {
        onEnter: function (args) {
            // args[0] = this (MessageLite* pointer)
            // args[1] = output (std::string* pointer)
            this.msg = args[0];
            this.str = args[1];
        },

        onLeave: function (retval) {
            // Read the serialized bytes out of the std::string
            var s = readStdString(this.str);

            if (s.size > 0 && s.size < 10240) {
                var buf = s.data.readByteArray(s.size);
                console.log("[Native PB] size=" + s.size);
                // hexdump output, at most 256 bytes
                console.log(hexdump(buf, {length: Math.min(s.size, 256)}));

                // Optionally save to a file for analysis on the PC
                var path = "/data/local/tmp/native_pb_" + Date.now() + ".bin";
                var file = new File(path, "wb");
                file.write(buf);
                file.flush();
                file.close();
                console.log("[Native PB] Saved to: " + path);
            }
        }
    });
}

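Tips for finding C++ symbol names: if the function is exported, nm -D finds it directly in the dynamic symbol table. If the symbols have been stripped or the function is not exported, locate it in IDA or Ghidra through string cross-references (for example error messages that mention "SerializeToString"). The c++filt tool converts a mangled name such as _ZNK6google8protobuf... back into a readable C++ signature.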
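Going the other way, from a qualified C++ name to the text to grep for, is mechanical for the name portion of an Itanium-ABI mangled symbol: each component is written as its length followed by the identifier. The sketch below is a hypothetical helper (`mangle_nested_name` is an assumed name) that builds only this prefix and deliberately leaves out the parameter encoding, which depends on the STL in use. It assumes plain identifiers with no templates or operators.

```python
from typing import Sequence


def mangle_nested_name(parts: Sequence[str], const: bool = False) -> str:
    """Build the nested-name prefix of an Itanium C++ ABI mangled symbol.

    parts: namespace, class, and function identifiers, outermost first.
    const: True for const member functions (adds the K qualifier).
    """
    body = "".join("%d%s" % (len(part), part) for part in parts)
    return "_ZN" + ("K" if const else "") + body + "E"


if __name__ == "__main__":
    prefix = mangle_nested_name(
        ["google", "protobuf", "MessageLite", "SerializeToString"], const=True
    )
    print(prefix)
```

The printed prefix can be passed to grep on the output of `nm -D` to find the full exported symbol, whatever parameter encoding follows it.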


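8. Tool Quick Reference

8.1 Decoding tools

| Tool | Purpose | Installation | Best suited for |
| --- | --- | --- | --- |
| protoc --decode_raw | Command-line raw decoding | brew install protobuf | Quickly checking whether data is protobuf |
| blackboxprotobuf | Interactive decoding in Python, with type correction and re-encoding | pip install blackboxprotobuf | Refining field types step by step, building tampered requests |
| protobuf-inspector | Colored, hierarchical terminal output | pip install protobuf-inspector | Skimming complex nested structures |
| pbtk | All-in-one toolkit: extract, decode, edit | git clone from GitHub | Automatic extraction and decoding from an APK |

8.2 .proto reconstruction tools

| Tool | Purpose | Applicable when |
| --- | --- | --- |
| PBTK | Automatically extracts .proto from an APK/JAR | The app uses the full protobuf-java runtime |
| protodec | Decompiles descriptors into .proto | A .desc descriptor file is available |
| jadx | Java decompilation, analysis of writeTo methods | Always, but the analysis is manual |
| Ghidra / IDA Pro | Analysis of native-layer protobuf | The protobuf logic is implemented in a .so |
| grpcurl | Retrieves service definitions through gRPC Reflection | The server has Reflection enabled |

8.3 Dynamic analysis tools

| Tool | Purpose | Notes |
| --- | --- | --- |
| Frida | Hooks Java/native protobuf calls | Most flexible; supports live capture and tampering |
| mitmproxy | Traffic capture plus custom protobuf decoding scripts | Extensible with Python scripts; can decode automatically |
| Charles / Burp Suite | Decodes protobuf traffic with plugins | GUI-friendly, suited to manual analysis |
| Wireshark | Network-level analysis of gRPC/protobuf | Supports a protobuf dissector plugin |

8.4 Complete reverse-engineering workflow

[Figure: complete Protobuf reverse-engineering workflow]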


Appendix: Practical Cheat Sheet

The commands and snippets used most often in day-to-day reversing, worth keeping at hand:

# ============================================================
# 1. Quickly check whether captured data is protobuf
# ============================================================
echo -n "YOUR_HEX_DATA" | xxd -r -p | protoc --decode_raw

# ============================================================
# 2. Decode Base64-encoded protobuf
# ============================================================
echo "BASE64_DATA" | base64 -d | protoc --decode_raw

# ============================================================
# 3. Decode gRPC data (skip the 5-byte gRPC frame header)
# ============================================================
dd if=grpc_body.bin bs=1 skip=5 | protoc --decode_raw

# ============================================================
# 4. Search an APK for protobuf classes (after decompiling with jadx)
# ============================================================
jadx -d output/ target.apk
grep -r "GeneratedMessageLite\|GeneratedMessageV3\|FIELD_NUMBER" output/

# ============================================================
# 5. Search an APK for leftover .proto / descriptor files
# ============================================================
unzip -l target.apk | grep -iE "\.proto$|\.desc$|\.pb$"

# ============================================================
# 6. Quickly decode a binary file with blackboxprotobuf
# ============================================================
python3 -c "
import blackboxprotobuf, sys
data = open(sys.argv[1], 'rb').read()
msg, td = blackboxprotobuf.decode_message(data)
print(msg)
" captured.bin

# ============================================================
# 7. One-line Frida hook for protobuf (attach to the target process)
# ============================================================
frida -U -l hook_protobuf_writeto.js com.target.app
# Newer Frida releases treat the positional argument as a process name; if attaching
# by package name fails, try: frida -U -N com.target.app -l hook_protobuf_writeto.js

# ============================================================
# 8. Batch-decode the .bin files saved by Frida
# ============================================================
# Assumes the files were first copied from /data/local/tmp on the device
# into ./captures on the PC (for example with adb pull)
for f in captures/pb_*.bin; do
    echo "=== $f ==="
    protoc --decode_raw < "$f"
    echo ""
done

# ============================================================
# 9. Probe a gRPC service with grpcurl (requires Reflection on the server)
# ============================================================
# List all services
grpcurl -plaintext localhost:50051 list
# List all methods of a service
grpcurl -plaintext localhost:50051 list com.example.UserService
# Describe a message structure
grpcurl -plaintext localhost:50051 describe com.example.UserRequest

# ============================================================
# 10. Dump a .desc descriptor file as readable text (the basis for rebuilding the .proto)
# ============================================================
# --decode reads the binary message from stdin; add -I <protobuf include dir>
# if protoc cannot find google/protobuf/descriptor.proto
protoc --decode=google.protobuf.FileDescriptorSet \
  google/protobuf/descriptor.proto < descriptors.desc
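Item 3 skips a fixed 5 bytes, which works for a body that holds a single uncompressed message. A gRPC body can carry several length-prefixed messages, each with a 1-byte compression flag followed by a 4-byte big-endian length. The sketch below (`split_grpc_frames` is a hypothetical helper name) splits a body into its frames so each payload can be decoded separately; frames flagged as compressed still have to be decompressed according to the call's grpc-encoding before decoding.

```python
import struct
from typing import List, Tuple


def split_grpc_frames(body: bytes) -> List[Tuple[bool, bytes]]:
    """Split a gRPC body into (compressed_flag, message_bytes) frames."""
    frames: List[Tuple[bool, bytes]] = []
    pos = 0
    while pos < len(body):
        if len(body) - pos < 5:
            raise ValueError("truncated gRPC frame header at offset %d" % pos)
        # 1-byte compressed flag + 4-byte big-endian message length
        compressed, length = struct.unpack_from(">BI", body, pos)
        pos += 5
        if len(body) - pos < length:
            raise ValueError("truncated gRPC message at offset %d" % pos)
        frames.append((bool(compressed), body[pos:pos + length]))
        pos += length
    return frames


if __name__ == "__main__":
    message = bytes.fromhex("0896011203626f62")
    body = b"\x00" + len(message).to_bytes(4, "big") + message
    for is_compressed, payload in split_grpc_frames(body):
        print(is_compressed, payload.hex())
```

Each uncompressed payload returned here is what `protoc --decode_raw` expects on stdin.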

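Series summary

Across three articles we have built up a complete body of knowledge for analyzing Protobuf protocols from an Android reverse-engineering perspective:

Part 1 (fundamentals) laid the groundwork: why protobuf is efficient, the Tag+Value structure of the Wire Format, Varint and ZigZag variable-length integer encoding, and how to identify protobuf from several angles, from APK class names to HTTP traffic. Everything that follows rests on this theory.

Part 2 (decoding and reconstruction) solved the problem of being able to read the data: from raw decoding with protoc --decode_raw to interactive type correction with blackboxprotobuf, from precise .proto reconstruction through writeTo analysis to handling obfuscated code, extracting descriptors, and guessing field semantics blind. However little information is available, there is usually some method that moves the analysis forward.

Part 3 (practice) solved the problem of being able to act: Frida hooks that intercept at every level from whole messages down to single fields, with live capture, modification, and replay of protobuf data, plus bypass approaches for custom serialization, outer encryption, field-name obfuscation, missing descriptors, and native-layer implementations.

Key takeaway: reversing protobuf comes down to understanding the Wire Format. However an app obfuscates or encrypts, the protobuf data must ultimately follow the Tag (field_number + wire_type) + Value encoding. Static analysis of writeTo methods recovers the .proto definition precisely, and dynamic hooking with Frida captures and tampers with the data live. Together they turn a protobuf protocol from a black box into something readable. With these methods and tools, protocol analysis of an Android app that uses protobuf becomes a systematic task.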



Reposted from: 泡泡以安, "Protobuf Protocol Analysis from an Android Reverse-Engineering Perspective (Part 3): Frida Hooks, Countermeasure Bypasses, and a Tool Quick Reference"
