The NT/amd64 ABI speaks of function prologues and epilogues, and the rest of the function.
Epilogues might not be what you think and prologues almost definitely are not what you think.
First, the easier clarification: a function can have any number of epilogues.
It can have zero epilogues, it can have one epilogue at the end, it can have
one epilogue not at the end, and it can have any number of epilogues.
An epilogue is not the code located at the end of the function,
it is the last thing a function runs -- it is about dynamic execution,
not static location.
A function will have zero epilogues if it never returns:
C:\> type no_epilogue.c
cl /LD /O2 /GL /GS- no_epilogue.c /link /incremental:no /export:no_epilogue /nod /noentry
link /dump /disasm no_epilogue.dll
void no_epilogue(void (*f)()) { while (1) f(); }
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24215.1 for x64
0000000180001000: 40 53 push rbx
0000000180001002: 48 83 EC 20 sub rsp,20h
0000000180001006: 48 8B D9 mov rbx,rcx
0000000180001009: 0F 1F 80 00 00 00 nop dword ptr [rax+0000000000000000h]
00
0000000180001010: FF D3 call rbx
0000000180001012: EB FC jmp 0000000180001010
A function can have multiple epilogues if it has an "early return":
C:\> type multiple_epilogues.c
cl /LD /O2 /GL /GS- multiple_epilogues.c /link /incremental:no /export:multiple_epilogues /nod /noentry
link /dump /disasm multiple_epilogues.dll
int multiple_epilogues(int i, int (*f)(void), int (*g)(void))
{
if (i)
return f();
return g() + g() + g();
}
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24215.1 for x64
0000000180001000: push rdi
0000000180001002: sub rsp,20h
0000000180001006: mov rdi,r8
0000000180001009: test ecx,ecx
000000018000100B: je 0000000180001015
000000018000100D: add rsp,20h <== possibly epilog
0000000180001011: pop rdi <== epilog
0000000180001012: jmp rdx <== epilog
0000000180001015: mov qword ptr [rsp+30h],rbx
000000018000101A: call rdi
000000018000101C: mov ebx,eax
000000018000101E: call rdi
0000000180001020: add ebx,eax
0000000180001022: call rdi
0000000180001024: add eax,ebx
0000000180001026: mov rbx,qword ptr [rsp+30h]
000000018000102B: add rsp,20h <== possibly epilog
000000018000102F: pop rdi <== epilog
0000000180001030: ret <== epilog
And this multiple-epilogue case accidentally demonstrates the next point.
Just as epilogue is not instructions located at the end of a function,
prologue is not instructions located at the start of a function.
The prologue *instructions* are the instructions that save nonvolatile
registers, or adjust rsp (prior to frame pointer establishment -- not alloca),
or establish the frame pointer (mov x, rsp).
The prologue instructions can be, and are, interleaved with somewhat arbitrary
other instructions. The critical requirement is that nonvolatiles be saved
before nonvolatiles are changed, and that the saves, the rsp adjustment,
and the frame pointer establishment be recorded. Each is recorded, as far as
the unwinder is concerned, once execution passes the offset marked for it
in the "xdata".
The multiple-epilogue example above has such a "dispersed" prologue.
Let's look at it again in more detail:
C:\> link /dump /unwindinfo /disasm multiple_epilogues.dll
Microsoft (R) COFF/PE Dumper Version 14.00.24215.1
0180001000: push rdi <=== prologue instruction, unsurprising
0180001002: sub rsp,20h <=== prologue instruction, unsurprising
0180001006: mov rdi,r8
0180001009: test ecx,ecx
018000100B: je 0000000180001015
018000100D: add rsp,20h
0180001011: pop rdi
0180001012: jmp rdx
0180001015: mov qword ptr [rsp+30h],rbx <=== also a prologue instruction
018000101A: call rdi <=== offset 1A in the unwind info below
018000101C: mov ebx,eax
018000101E: call rdi
0180001020: add ebx,eax
0180001022: call rdi
0180001024: add eax,ebx
0180001026: mov rbx,qword ptr [rsp+30h]
018000102B: add rsp,20h
018000102F: pop rdi
0180001030: ret
Function Table (1)
Begin End Info Function Name
00000000 00001000 00001031 0000208C
Unwind version: 1
Unwind flags: None
Size of prologue: 0x1A <== This is also telling.
Count of codes: 4
Unwind codes:
1A: SAVE_NONVOL, register=rbx offset=0x30
06: ALLOC_SMALL, size=0x20
02: PUSH_NONVOL, register=rdi
/*\
*
*
* look here
The critical information we want to look at is the leftmost column
of the unwind codes. These are the offsets just after prologue instructions.
They are reverse sorted by offset, and the underlying data is not a fixed
size per line shown: SAVE_NONVOL takes two 16-bit slots here, which is why
the count of codes is 4 for three lines. You must always walk them linearly
from the start.
Offsets 2 and 6 are what you expect -- the first two instructions.
But offset 1A is quite a bit into the function -- that is a bit surprising when you first see it.
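To make the "walk them linearly" point concrete, here is a minimal C sketch of that walk. The operation numbers and slot counts follow the documented version 1 unwind data format. The names unwind_slot, slots_for, walk_unwind_codes and example_o2 are mine, and the example bytes are reconstructed by hand from the dump above rather than read out of the DLL, so treat them as an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Unwind operation codes from the x64 unwind data format (version 1). */
enum {
    UWOP_PUSH_NONVOL     = 0,
    UWOP_ALLOC_LARGE     = 1,
    UWOP_ALLOC_SMALL     = 2,
    UWOP_SET_FPREG       = 3,
    UWOP_SAVE_NONVOL     = 4,
    UWOP_SAVE_NONVOL_FAR = 5,
    UWOP_SAVE_XMM128     = 8,
    UWOP_SAVE_XMM128_FAR = 9,
    UWOP_PUSH_MACHFRAME  = 10
};

/* One 16-bit slot. Operand slots reuse the same two bytes as a raw value. */
typedef struct {
    uint8_t code_offset;   /* offset just past the recorded instruction */
    uint8_t op_and_info;   /* low 4 bits: operation, high 4 bits: op info */
} unwind_slot;

/* Number of 16-bit slots one code occupies; 0 for an unknown operation. */
static size_t slots_for(uint8_t op, uint8_t info)
{
    switch (op) {
    case UWOP_PUSH_NONVOL:
    case UWOP_ALLOC_SMALL:
    case UWOP_SET_FPREG:
    case UWOP_PUSH_MACHFRAME:
        return 1;
    case UWOP_SAVE_NONVOL:
    case UWOP_SAVE_XMM128:
        return 2;
    case UWOP_SAVE_NONVOL_FAR:
    case UWOP_SAVE_XMM128_FAR:
        return 3;
    case UWOP_ALLOC_LARGE:
        return info == 0 ? 2 : 3;
    default:
        return 0;
    }
}

/* Walks the slots from the start, prints each logical code, returns how many
   logical codes were seen. Stops at an unknown op or a truncated array. */
static size_t walk_unwind_codes(const unwind_slot *slots, size_t count)
{
    size_t i = 0, codes = 0;
    assert(slots != NULL || count == 0);
    while (i < count) {
        uint8_t op   = slots[i].op_and_info & 0x0F;
        uint8_t info = slots[i].op_and_info >> 4;
        size_t  n    = slots_for(op, info);
        if (n == 0 || n > count - i)
            break;
        printf("%02X: op=%u info=%u slots=%zu\n",
               (unsigned)slots[i].code_offset, (unsigned)op, (unsigned)info, n);
        if (op == UWOP_SAVE_NONVOL) {
            /* The next slot holds the save offset divided by 8. */
            unsigned scaled = slots[i + 1].code_offset |
                              (slots[i + 1].op_and_info << 8);
            printf("    saved at rsp+0x%X\n", scaled * 8);
        }
        i += n;
        codes++;
    }
    return codes;
}

/* Hand-reconstructed slots for the /O2 dump above (assumption, see text). */
static const unwind_slot example_o2[4] = {
    { 0x1A, 0x34 },  /* SAVE_NONVOL, register 3 (rbx) */
    { 0x06, 0x00 },  /* operand slot: 6 * 8 = 0x30 */
    { 0x06, 0x32 },  /* ALLOC_SMALL, 3 * 8 + 8 = 0x20 */
    { 0x02, 0x70 },  /* PUSH_NONVOL, register 7 (rdi) */
};
```

Calling walk_unwind_codes(example_o2, 4) visits three logical codes in four slots, which matches the three lines and "Count of codes: 4" in the dump. Indexing straight into the array by line number would land on the operand slot and misread it as a code.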
And another thing. While the specification is that the offsets are just after
the instruction that does the nonvolatile save, etc., the requirement and
reality are looser. The offset can be later than the save, as long as it is
before the register is changed. As well, the rsp-relative location of a save
might change between the save and the recorded offset. For example, the
compiler will move nonvolatiles into home space, then adjust rsp, and then,
or at the same place, record that the nonvolatile was saved.
This can be seen with the multiple_epilogues.c example just by compiling with /O1 instead of /O2. Let's see:
cl /LD /O1 /GL /GS- multiple_epilogues.c /link /incremental:no /export:multiple_epilogues /nod /noentry
link /dump /disasm /unwindinfo multiple_epilogues.dll
Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24215.1 for x64
Microsoft (R) COFF/PE Dumper Version 14.00.24215.1
0180001000: mov qword ptr [rsp+8],rbx <==== rbx saved here
0180001005: push rdi
0180001006: sub rsp,20h <==== but recorded here
018000100A: mov rdi,r8
018000100D: test ecx,ecx
018000100F: je 0000000180001015
0180001011: call rdx
0180001013: jmp 0000000180001021
0180001015: call rdi
0180001017: mov ebx,eax
0180001019: call rdi
018000101B: add ebx,eax
018000101D: call rdi
018000101F: add eax,ebx
0180001021: mov rbx,qword ptr [rsp+30h]
0180001026: add rsp,20h
018000102A: pop rdi
018000102B: ret
Function Table (1)
Begin End Info Function Name
00000000 00001000 0000102C 0000208C
Unwind version: 1
Unwind flags: None
Size of prologue: 0x0A
Count of codes: 4
Unwind codes:
0A: SAVE_NONVOL, register=rbx offset=0x30 <=== rbx save
0A: ALLOC_SMALL, size=0x20 <=== two unwind codes with same offset
06: PUSH_NONVOL, register=rdi
And see how rbx is saved at rsp+8 but recorded as rsp+30h, because
rsp drops by 28h (8 for the push, 20h for the sub) between the save and the
recorded position.
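That arithmetic is small enough to do by hand, but here it is as a C sketch for checking against the /O1 listing. The helper names are hypothetical. The only inputs are the numbers visible in the disassembly: the save at rsp+8, the 8 bytes of push rdi, and the 20h of sub rsp.

```c
#include <assert.h>
#include <stdint.h>

/* rsp-relative offset of a saved register once rsp has been lowered by
   rsp_decrease bytes after the save was made. */
static uint32_t recorded_save_offset(uint32_t offset_at_save,
                                     uint32_t rsp_decrease)
{
    assert(offset_at_save <= UINT32_MAX - rsp_decrease);
    return offset_at_save + rsp_decrease;
}

/* The /O1 case: mov [rsp+8],rbx ; push rdi ; sub rsp,20h */
static uint32_t example_recorded_offset(void)
{
    const uint32_t push_rdi = 8;     /* push lowers rsp by 8 */
    const uint32_t sub_rsp  = 0x20;  /* sub rsp,20h */
    return recorded_save_offset(0x8, push_rdi + sub_rsp);
}
```

example_recorded_offset() gives 0x30, the offset shown on the SAVE_NONVOL line.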
And this is all ok. If you take an exception between the save and recorded
position of the save, rbx has not been changed, and need not be restored.
Such an exception is rare -- maybe stack overflow -- but the ABI accounts
for exceptions and stack walks from arbitrary instructions.
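Here is a simplified C sketch of why that works, under my reading of the documented rule: when the instruction pointer is inside the prologue, only the codes whose offset is less than or equal to the current offset into the function get undone. The table is the /O1 unwind codes from above, with one entry per logical code instead of the raw slot format. The names logical_code, o1_codes and codes_to_undo are hypothetical, and epilogue handling is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t     code_offset;  /* offset just past the recorded point */
    const char *what;         /* description, for the reader only */
} logical_code;

/* The /O1 codes in stored order (descending offset). */
static const logical_code o1_codes[3] = {
    { 0x0A, "SAVE_NONVOL rbx at rsp+0x30" },
    { 0x0A, "ALLOC_SMALL 0x20" },
    { 0x06, "PUSH_NONVOL rdi" },
};

/* Counts the codes whose recorded point has already been passed when
   execution is stopped at ip_offset bytes into the function. */
static size_t codes_to_undo(const logical_code *codes, size_t n,
                            uint32_t ip_offset)
{
    size_t applied = 0;
    assert(codes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ip_offset >= codes[i].code_offset)
            applied++;
    }
    return applied;
}
```

At offset 5, after the mov to [rsp+8] but before the push, codes_to_undo returns 0: nothing is undone, rbx still holds the caller's value, and the return address is still at [rsp]. At offset 6 only the push of rdi is undone. From offset 0Ah onward all three apply, and by then the slot at rsp+30h really does hold rbx.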