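I originally planned to title this post "IDA sp-analysis failed: F5 fails", because many online discussions carry titles of that kind. After trying it out myself, though, I found that at least two different causes produce this symptom, which is how this post got its title.
First, let's look at what produces the "sp-analysis failed" message. The IDA help describes it as follows: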
Problem: Failed to trace the value of the stack pointer
Description:
The value of the stack pointer at the end of the function is different
from its value at the start of the function. IDA checks for the
difference only if the function is ended by a "return" instruction.
The most probable cause is that stack tracing has failed.
This problem is displayed in the disassembly listing with
the "sp-analysis failed" comment.
What to do:
1. Examine the value of stack pointer at various
locations of the function and try to find out why the stack tracing
has failed. Usually, it fails because some called function changed the
stack pointer (by purging the input parameters, for example)
2. If you have found the offending function, change its
attributes (namely, number of bytes purged upon return).
3. Another way is to specify manually how the stack pointer is
modified. See Change stack pointer command
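The "number of bytes purged upon return" in the help text above can be illustrated with a toy model. This is only a sketch of the bookkeeping, not IDA's actual algorithm: the names `Step`, `stack_depth_at_ret` and `trace_caller` are made up for illustration, and the depth is counted as bytes pushed since function entry.

```c
#include <stddef.h>

/* One step of a toy stack trace. `bytes` is how many bytes the instruction
   adds to the stack (positive for a push, negative for a pop or a purge). */
typedef struct {
    const char *text;   /* disassembly text, for readability only */
    int bytes;          /* change in stack depth, in bytes */
} Step;

/* Sum the changes. A balanced function is back at depth 0 just before ret. */
int stack_depth_at_ret(const Step *steps, size_t count)
{
    int depth = 0;
    size_t i;
    for (i = 0; i < count; ++i)
        depth += steps[i].bytes;
    return depth;
}

/* Trace a hypothetical caller that pushes two 4-byte arguments, calls a
   callee, and has no "add esp, 8" afterwards. `assumed_purge` is how many
   bytes the analyzer believes the callee removes on return: 8 for a
   __stdcall callee with two arguments, 0 for a __cdecl callee. */
int trace_caller(int assumed_purge)
{
    Step steps[3] = {
        { "push arg2",   4 },
        { "push arg1",   4 },
        { "call callee", 0 },
    };
    steps[2].bytes = -assumed_purge;   /* the callee's "ret N" pops N bytes */
    return stack_depth_at_ret(steps, 3);
}
```

With the correct assumption, `trace_caller(8)` ends at depth 0. If the callee is wrongly treated as purging nothing, `trace_caller(0)` ends at depth 8, which is the kind of mismatch the help text tells you to fix by editing the callee's attributes.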
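The gist: IDA traces the stack pointer through each function. When it reaches a ret (retn) instruction inside the function, it checks whether the stack pointer has the same value as at function entry; if not, it marks the end of the function with "sp-analysis failed". In ordinary code this can happen when calling conventions differ (for example __stdcall versus __cdecl). It also happens when code is obfuscated for protection, specifically when function calls are implemented with push/push+ret. The sample code below uses that kind of obfuscation.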
#include <stdio.h>
// Triggers "sp-analysis failed", yet F5 can still decompile this code
void func1()
{
printf("func1\n");
}
void func2()
{
printf("func2\n");
}
void func3()
{
printf("func3\n");
}
void func4()
{
printf("func4\n");
}
int main()
{
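__asm
{
mov eax,[ebp+4] ; original return address of main
push eax ; reached last, after func4 returns
lea eax,func4
push eax
lea eax,func3
push eax
lea eax,func2
push eax
lea eax,func1
push eax ; top of stack: first target
ret ; pops the address of func1 and jumps to it
}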
}
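The control flow of the push/ret chain above can be modeled in portable C. This is a simplified sketch, not the real machine behavior: the array `stack` stands in for the hardware stack, `original_return` is a hypothetical stand-in for the return address read from [ebp+4], and the frame that main's prologue leaves on the real stack is ignored.

```c
#include <stddef.h>

#define CHAIN_MAX 8

typedef void (*Target)(void);

static int order[CHAIN_MAX];   /* ids of the targets, in the order they ran */
static size_t order_len;

static void record(int id)
{
    if (order_len < CHAIN_MAX)
        order[order_len++] = id;
}

static void func1(void) { record(1); }
static void func2(void) { record(2); }
static void func3(void) { record(3); }
static void func4(void) { record(4); }
static void original_return(void) { record(0); }  /* stands in for [ebp+4] */

/* Push the targets in the same order as the inline assembly, then let each
   "ret" pop the top entry and transfer control to it. Copies the execution
   order into `out` and returns the number of targets that ran. */
size_t run_chain(int *out, size_t cap)
{
    Target stack[CHAIN_MAX];
    size_t top = 0;
    size_t i;

    order_len = 0;
    stack[top++] = original_return;
    stack[top++] = func4;
    stack[top++] = func3;
    stack[top++] = func2;
    stack[top++] = func1;

    /* The first ret is the one in main; every later ret is the one that
       ends the function that just ran. */
    while (top > 0) {
        Target next = stack[--top];
        next();
    }

    for (i = 0; i < order_len && i < cap; ++i)
        out[i] = order[i];
    return order_len;
}
```

Calling `run_chain` with a buffer of eight ints yields the order 1, 2, 3, 4, 0: func1 through func4 run in sequence without a single call instruction, and control finally reaches the original return address. Because main ends in a ret while its pushes are still unbalanced, a stack tracer has good reason to complain.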
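After IDA loads the exe, the "sp-analysis failed" comment appears, yet F5 still recovers the source, as shown in Figure 1:
Figure 1
In Figure 1, the comment highlighted in red is "sp-analysis failed". Despite that comment, IDA still reliably recovers the prototype of main, as shown in Figure 2:
Figure 2
So "sp-analysis failed" by itself is not what prevents F5 from decompiling. What else could make decompilation fail?
I consulted several articles, such as "让IDA的F5 插件失效" ("Making IDA's F5 plugin fail") and "IDA sp-analysis failed不能F5的解决方案" ("A fix for IDA sp-analysis failed when F5 does not work"). In summary, if junk instructions are mixed into the code, they can disrupt IDA's decompiler output, as the following code shows: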
#include <stdio.h>
void func1()
{
__asm
{
lea eax,lab1;
jmp eax
_emit 0x90;
lab1:
}
printf("func1\n");
}
void func2()
{
__asm
{
cmp eax,ecx;
jnz lab1;
jz lab1;
_emit 0xB8;
lab1:
}
printf("func2\n");
}
int main()
{
func1();
func2();
return 0;
}
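Start with func2. The __asm{} block in func2 clearly contains a fairly simple junk instruction, and its presence interferes with IDA's pseudocode generation, as shown in Figure 3:
Figure 3
Figure 3 carries a lot of information: the "Line prefixes" of func2 are shown in red, and in this state F5 fails. Pressing F5 inside the func2 code region produces the warning "please position the cursor within a function". Although the junk instruction does stop IDA from decompiling, its effect can be removed with IDA's Edit-Undefine and Edit-Code commands:
1). First, select the code from 0x040109E to 0x04010D0 and click Edit-Undefine to discard the existing, incorrect function definition;
2). Next, select the data from 0x040109F to 0x04010D0 and click Edit-Code to regenerate the code. The byte at 0x040109E is the junk byte, so it must be skipped when regenerating the code.
3). Select func2 together with the code generated in the previous step and click Edit-Functions-"Create Function" to produce the complete func2 function, as follows: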
.text:00401080 ; =============== S U B R O U T I N E =======================================
.text:00401080
.text:00401080 ; Attributes: bp-based frame
.text:00401080
.text:00401080 ; void __cdecl func2(void)
.text:00401080 ?func2@@YAXXZ proc near ; CODE XREF: func2(void)j
.text:00401080
.text:00401080 var_40 = byte ptr -40h
.text:00401080
.text:00401080 push ebp
.text:00401081 mov ebp, esp
.text:00401083 sub esp, 40h
.text:00401086 push ebx
.text:00401087 push esi
.text:00401088 push edi
.text:00401089 lea edi, [ebp+var_40]
.text:0040108C mov ecx, 10h
.text:00401091 mov eax, 0CCCCCCCCh
.text:00401096 rep stosd
.text:00401098 cmp eax, ecx
.text:0040109A jnz short loc_40109F
.text:0040109C jz short loc_40109F
.text:0040109C ; ---------------------------------------------------------------------------
.text:0040109E db 0B8h ;
.text:0040109F ; ---------------------------------------------------------------------------
.text:0040109F
.text:0040109F loc_40109F: ; CODE XREF: func2(void)+1Aj
.text:0040109F ; func2(void)+1Cj
.text:0040109F push offset aFunc2 ; "func2\n"
.text:004010A4 call _printf
.text:004010A9 add esp, 4
.text:004010AC pop edi
.text:004010AD pop esi
.text:004010AE pop ebx
.text:004010AF add esp, 40h
.text:004010B2 cmp ebp, esp
.text:004010B4 call __chkesp
.text:004010B9 mov esp, ebp
.text:004010BB pop ebp
.text:004010BC retn
.text:004010BC ?func2@@YAXXZ endp
4). Press F5 to decompile; the result is shown in Figure 4.
Figure 4
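Why a single 0xB8 byte is enough to derail the listing can be shown with a toy decoder. 0xB8 is the opcode of mov eax, imm32, so a disassembler that decodes the fall-through path after the jz swallows the next four bytes as an operand. The sketch below models only four opcodes, the operand bytes in `sample` are placeholders rather than the real bytes of the binary, and `insn_length` and `linear_sweep` are hypothetical helper names.

```c
#include <stddef.h>

/* Length of the few x86 opcodes modeled here; 0 means "not modeled". */
static size_t insn_length(unsigned char opcode)
{
    switch (opcode) {
    case 0x90: return 1;   /* nop */
    case 0x68: return 5;   /* push imm32 */
    case 0xB8: return 5;   /* mov eax, imm32 */
    case 0xE8: return 5;   /* call rel32 */
    default:   return 0;
    }
}

/* Decode linearly from `start`, storing the offset of each instruction in
   `starts`. Stops at the end of the buffer, at an opcode that is not
   modeled, or when `starts` is full. Returns the number of instructions. */
size_t linear_sweep(const unsigned char *code, size_t size, size_t start,
                    size_t *starts, size_t cap)
{
    size_t count = 0;
    size_t pos = start;
    while (pos < size && count < cap) {
        size_t len = insn_length(code[pos]);
        if (len == 0 || pos + len > size)
            break;
        starts[count++] = pos;
        pos += len;
    }
    return count;
}

/* Same shape as the bytes at 0x40109E: junk byte, push imm32, call rel32. */
const unsigned char sample[] = {
    0xB8,                           /* junk byte from _emit 0xB8 */
    0x68, 0x00, 0x00, 0x00, 0x00,   /* push imm32 (placeholder operand) */
    0xE8, 0x00, 0x00, 0x00, 0x00,   /* call rel32 (placeholder operand) */
};
```

Sweeping from offset 0 finds one bogus 5-byte instruction that hides the push and then loses sync. Sweeping from offset 1, which is what skipping the junk byte in step 2) amounts to, finds the push at offset 1 and the call at offset 6.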
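Finally, let's look at func1. After loading, because of the unconditional jmp, IDA splits func1 into two functions: an incomplete func1 and sub_401051 (it feels a bit like forcing a couple apart).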
.text:00401030 ; =============== S U B R O U T I N E =======================================
.text:00401030
.text:00401030 ; Attributes: bp-based frame
.text:00401030
.text:00401030 ; void __cdecl func1()
.text:00401030 ?func1@@YAXXZ proc near ; CODE XREF: func1(void)j
.text:00401030
.text:00401030 var_40 = byte ptr -40h
.text:00401030
.text:00401030 push ebp
.text:00401031 mov ebp, esp
.text:00401033 sub esp, 40h
.text:00401036 push ebx
.text:00401037 push esi
.text:00401038 push edi
.text:00401039 lea edi, [ebp+var_40]
.text:0040103C mov ecx, 10h
.text:00401041 mov eax, 0CCCCCCCCh
.text:00401046 rep stosd
.text:00401048 lea eax, sub_401051
.text:0040104E jmp eax
.text:0040104E ?func1@@YAXXZ endp
.text:0040104E
.text:0040104E ; ---------------------------------------------------------------------------
.text:00401050 db 90h
.text:00401051
.text:00401051 ; =============== S U B R O U T I N E =======================================
.text:00401051
.text:00401051
.text:00401051 sub_401051 proc near ; DATA XREF: func1(void)+18o
.text:00401051 push offset aFunc1 ; "func1\n"
.text:00401056 call _printf
.text:0040105B add esp, 4
.text:0040105E pop edi
.text:0040105F pop esi
.text:00401060 pop ebx
.text:00401061 add esp, 40h
.text:00401064 cmp ebp, esp
.text:00401066 call __chkesp
.text:0040106B mov esp, ebp
.text:0040106D pop ebp
.text:0040106E retn
.text:0040106E sub_401051 endp ; sp-analysis failed
The incomplete func1 can be decompiled, producing the following code:
void __cdecl func1()
{
char v0; // [sp+Ch] [bp-40h]@1
memset(&v0, 0xCCu, 0x40u);
JUMPOUT(sub_401051); // <---- the jump target is the other half of func1: sub_401051
}
The code above contains an odd macro, JUMPOUT(). The IDA manual explains it: when a function contains a jump that leaves the function body, the decompiler generates JUMPOUT:
The decompiler has a configuration file. It is installed into the 'cfg' subdirectory of the IDA installation.
The configuration file is named 'hexrays.cfg'.
...
HO_JUMPOUT_HELPERS
If enabled, the decompiler will handle out-of-function jumps by generating a call to the JUMPOUT() function.
If disabled, such functions will not be decompiled.
Default: enabled
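Putting the pseudocode and the manual together: although func1 was split into two functions, the two are not independent; the jmp ties them closely together. Following that lead takes us to sub_401051, which is a problem function in two ways: 1). it carries the "sp-analysis failed" comment at its end; 2). it cannot be decompiled with F5.
The second half of a function that IDA has split in two often has this problem; see, for example, this help request: "Hex-Rays: JUMPOUT statements inserted due to incorrect autodetected function boundaries".
After enabling Options-General-Disassembly-"Stack pointer", you can see that the stack pointer in sub_401051 does go below 0. The cause is that the instructions that restore the stack, which belong to func1, all ended up in sub_401051.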
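The arithmetic behind that negative value can be reproduced with a small model. This is a sketch under stated assumptions, not IDA's algorithm: depth is counted as bytes pushed since function entry, instructions that do not move the stack pointer are left out, mov esp, ebp is also left out because in the intact function esp already equals ebp at that point, and the names `Insn`, `depth_after`, `depth_of_tail_alone` and `depth_of_whole_func1` are illustrative.

```c
#include <stddef.h>

typedef struct {
    const char *text;   /* disassembly text, for readability only */
    int bytes;          /* bytes added to the stack (negative when removed) */
} Insn;

/* Stack-affecting instructions that stayed in the truncated func1. */
static const Insn prologue[] = {
    { "push ebp",      4    },
    { "sub esp, 40h",  0x40 },
    { "push ebx",      4    },
    { "push esi",      4    },
    { "push edi",      4    },
};

/* Stack-affecting instructions that ended up in sub_401051. */
static const Insn tail[] = {
    { "push offset aFunc1", 4     },
    { "call _printf",       0     },   /* __cdecl: the callee purges nothing */
    { "add esp, 4",         -4    },
    { "pop edi",            -4    },
    { "pop esi",            -4    },
    { "pop ebx",            -4    },
    { "add esp, 40h",       -0x40 },
    { "pop ebp",            -4    },
};

/* Apply `count` instructions to a starting depth and return the result. */
int depth_after(int start, const Insn *insns, size_t count)
{
    int depth = start;
    size_t i;
    for (i = 0; i < count; ++i)
        depth += insns[i].bytes;
    return depth;
}

/* sub_401051 analyzed as a function of its own: depth starts at 0. */
int depth_of_tail_alone(void)
{
    return depth_after(0, tail, sizeof tail / sizeof tail[0]);
}

/* func1 analyzed as one function: the tail continues from the prologue. */
int depth_of_whole_func1(void)
{
    int mid = depth_after(0, prologue, sizeof prologue / sizeof prologue[0]);
    return depth_after(mid, tail, sizeof tail / sizeof tail[0]);
}
```

In this model `depth_of_tail_alone()` is -0x50 at the retn, while `depth_of_whole_func1()` is 0. The pops are only balanced once both halves belong to the same function, which is why merging them removes the mismatch.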
The fix is as follows:
Select address .text:0040104E and click Edit-Functions-"Delete function"; then select address .text:0040106E and click Edit-Functions-"Set function end". The resulting code is:
.text:00401030 ; void __cdecl func1()
.text:00401030 ?func1@@YAXXZ: ; CODE XREF: func1(void)j
.text:00401030 push ebp
.text:00401031 mov ebp, esp
.text:00401033 sub esp, 40h
.text:00401036 push ebx
.text:00401037 push esi
.text:00401038 push edi
.text:00401039 lea edi, [ebp-40h]
.text:0040103C mov ecx, 10h
.text:00401041 mov eax, 0CCCCCCCCh
.text:00401046 rep stosd
.text:00401048 lea eax, loc_401051
.text:0040104E jmp eax
.text:0040104E ; ---------------------------------------------------------------------------
.text:00401050 db 90h
.text:00401051 ; ---------------------------------------------------------------------------
.text:00401051
.text:00401051 loc_401051: ; DATA XREF: func1(void)+39o
.text:00401051 push offset aFunc1 ; "func1\n"
.text:00401056 call _printf
.text:0040105B add esp, 4
.text:0040105E pop edi
.text:0040105F pop esi
.text:00401060 pop ebx
.text:00401061 add esp, 40h
.text:00401064 cmp ebp, esp
.text:00401066 call __chkesp
.text:0040106B mov esp, ebp
.text:0040106D pop ebp
.text:0040106E retn
.text:0040106E j_?func1@@YAXXZ endp
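This not only removes the "sp-analysis failed" comment but also makes F5 decompilation work.
That's the end of this post; it turned out to be a long one.
References:
Hex-Rays: JUMPOUT statements inserted due to incorrect autodetected function boundaries
IDA sp-analysis failed 不能F5的 解决方案之(一) (A fix for IDA sp-analysis failed when F5 does not work, part 1)
让IDA的F5插件失效 (Making IDA's F5 plugin fail)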
Fixing the stackpointer in IDA when exception handlers are used