x86 instruction encoding is well documented in the Intel and AMD manuals.
x86 instructions look like this:
optional prefix bytes
opcode bytes
modrm/sib
displacement
immediate
The maximum size is 15 bytes.
Prefix bytes include segment overrides, size overrides, lock prefix, repe/repne.
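As a rough sketch of what "optional prefix bytes" means to a decoder, the C below recognizes the classic prefix byte values and counts how many of them lead an instruction. The function names is_legacy_prefix and count_prefixes are made up for this example, and it assumes 32 bit code (no REX or VEX prefixes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns true if `byte` is one of the classic x86 prefix bytes. */
bool is_legacy_prefix(uint8_t byte)
{
    switch (byte) {
    case 0xF0:                          /* lock */
    case 0xF2:                          /* repne */
    case 0xF3:                          /* rep/repe */
    case 0x26: case 0x2E: case 0x36:    /* es, cs, ss segment overrides */
    case 0x3E: case 0x64: case 0x65:    /* ds, fs, gs segment overrides */
    case 0x66:                          /* operand size override */
    case 0x67:                          /* address size override */
        return true;
    default:
        return false;
    }
}

/* Counts the prefix bytes at the start of `code`, looking at no more
   than `len` bytes. The opcode starts at code[result]. */
size_t count_prefixes(const uint8_t *code, size_t len)
{
    size_t n = 0;
    assert(code != NULL);
    while (n < len && is_legacy_prefix(code[n]))
        n++;
    return n;
}
```

For the bytes 66 F3 03 this reports two prefix bytes, with the opcode at offset 2.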
The presence of modrm and immediate is dependent on the opcode.
The presence of displacement depends on opcode and/or modrm.
Some opcodes are one byte, like push/pop.
Some opcodes have implied register use, like push/pop.
Some opcodes have no modrm/sib but do have displacement, like the relative forms of jmp/call.
Many opcodes have no immediate. An example that does have an immediate is add with a constant operand, as in add eax, 1.
Let's dig into modrm/sib.
modrm is a byte with three fields.
two bit mode, let's call it mod
three bit reg, let's call it r
three bit reg or memory, let's call it r/m
The layout is left to right from the high bits down, so mod sits in the top two bits and its values 0-3 show up in the byte as 0, 0x40, 0x80, 0xC0.
0 is register indirect with no displacement
0x40 is register indirect with an 8 bit displacement
0x80 is register indirect with a 32 bit displacement
0xC0 is register direct.
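The field layout above can be sketched in C as a few shifts and masks. The struct and function names here are invented for illustration, and disp_size_from_mod only covers the plain cases listed above, not any special r/m values.

```c
#include <assert.h>
#include <stdint.h>

/* The three fields of a modrm byte, from the high bits down. */
struct modrm_fields {
    uint8_t mod; /* 2 bits: addressing mode */
    uint8_t reg; /* 3 bits: register operand */
    uint8_t rm;  /* 3 bits: register or memory operand */
};

/* Splits a modrm byte into its mod, reg and r/m fields. */
struct modrm_fields split_modrm(uint8_t byte)
{
    struct modrm_fields f;
    f.mod = (byte >> 6) & 0x3;
    f.reg = (byte >> 3) & 0x7;
    f.rm  = byte & 0x7;
    return f;
}

/* Displacement size in bytes implied by mod alone. */
int disp_size_from_mod(uint8_t mod)
{
    assert(mod <= 3);
    switch (mod) {
    case 1:  return 1; /* 0x40: 8 bit displacement */
    case 2:  return 4; /* 0x80: 32 bit displacement */
    default: return 0; /* 0: no displacement; 0xC0: register direct */
    }
}
```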
The three bit fields of course take on 8 values 0-7.
The registers are numbered eax=0, ecx=1, edx=2, ebx=3, esp=4, ebp=5, esi=6, edi=7.
For example, let's suppose "add" is 0. (It sort of is: the basic forms of add are opcodes 0x00 through 0x05.)
Let's use "0b" for binary.
add edx, [ecx]
would be 0b00 010 001
add ecx, [edx+4]
would be 0b01 001 010 04
add ecx, [edx+0x12345678]
would be 0b10 001 010 78 56 34 12
add ecx, edx
would be 0b11 001 010
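To make the bit packing concrete, here is a small C sketch that builds a modrm byte from its fields and encodes the memory forms of add, choosing mod from the size of the displacement. It uses the real opcode 0x03 (add r32, r/m32) instead of the pretend 0. The names make_modrm and encode_add_reg_mem are hypothetical, and the sketch assumes the base register is not esp or ebp, since those r/m values have special meanings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register numbers as used in the reg and r/m fields. */
enum reg32 { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

/* Packs mod (2 bits), reg (3 bits) and r/m (3 bits) into one byte. */
uint8_t make_modrm(unsigned mod, unsigned reg, unsigned rm)
{
    assert(mod <= 3 && reg <= 7 && rm <= 7);
    return (uint8_t)((mod << 6) | (reg << 3) | rm);
}

/* Encodes "add dst, [base + disp]" into out and returns the byte count
   (at most 6). Assumes base is neither ESP nor EBP. */
size_t encode_add_reg_mem(uint8_t *out, enum reg32 dst, enum reg32 base,
                          int32_t disp)
{
    size_t n = 0;
    assert(out != NULL);
    assert(base != ESP && base != EBP);
    out[n++] = 0x03;                          /* add r32, r/m32 */
    if (disp == 0) {
        out[n++] = make_modrm(0, dst, base);  /* no displacement */
    } else if (disp >= -128 && disp <= 127) {
        out[n++] = make_modrm(1, dst, base);  /* 8 bit displacement */
        out[n++] = (uint8_t)disp;
    } else {
        uint32_t u = (uint32_t)disp;
        out[n++] = make_modrm(2, dst, base);  /* 32 bit displacement */
        for (int i = 0; i < 4; i++)           /* little endian */
            out[n++] = (uint8_t)(u >> (8 * i));
    }
    return n;
}
```

With these definitions, add ecx, [edx+4] comes out as the three bytes 03 4A 04, and make_modrm(3, ECX, EDX) gives 0xCA for the register direct form.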
If r/m is 4 and mod is not 0xC0, the rules change slightly.
Instead of that being a register (esp) in the normal scheme, it means there is a "SIB" byte.
(r/m of 5 with mod 0 is a different special case: a bare 32 bit displacement with no register.)
"SIB" is scale-index-base.
You can say things like:
add eax, [4*edx+ecx]
where 4 is scale
edx is index
ecx is base
Imagine a function like:
int get_array_element(int * array, int index)
{
    return array[index];
}
Let's pretend array is in ebx, index is in ecx.
This would look like
mov eax, [4 * ecx + ebx]
The SIB byte, similar to the modrm byte, has three fields:
2 bit scale
3 bit index
3 bit base
scale 0: 1
scale 1: 2
scale 2: 4
scale 3: 8
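The SIB byte packs the same way as modrm. Below is a minimal C sketch, with invented names scale_field and make_sib, that maps a scale factor to its 2 bit field and assembles the byte. It refuses index 4 and base 5, on the assumption that those values carry special meanings and need separate handling.

```c
#include <assert.h>
#include <stdint.h>

/* Maps a scale factor of 1, 2, 4 or 8 to the 2 bit SIB scale field. */
unsigned scale_field(unsigned factor)
{
    switch (factor) {
    case 1: return 0;
    case 2: return 1;
    case 4: return 2;
    case 8: return 3;
    default:
        assert(!"scale must be 1, 2, 4 or 8");
        return 0;
    }
}

/* Packs scale, index register and base register into a SIB byte. */
uint8_t make_sib(unsigned factor, unsigned index, unsigned base)
{
    assert(index <= 7 && index != 4); /* index 4 is not a plain register */
    assert(base <= 7 && base != 5);   /* base 5 is special when mod is 0 */
    return (uint8_t)((scale_field(factor) << 6) | (index << 3) | base);
}
```

For mov eax, [4 * ecx + ebx], make_sib(4, 1, 3) gives 0x8B. With the mov r32, r/m32 opcode 0x8B and a modrm of 0x04 (mod 0, reg eax, r/m 4), the whole instruction is 8B 04 8B.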
There is a little more to this but I have to run for now.
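There are values for the SIB fields that mean no register: index 4 means no index, and base 5 with mod 0 means no base, just a 32 bit displacement.
There is also the extension to 64 bits, where REX prefixes widen the register fields and mod 0 with r/m 5 becomes RIP-relative addressing.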
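Putting the pieces together, a decoder has to work out how many bytes the modrm, optional SIB and optional displacement occupy. The C below is a sketch of that under 32 bit addressing only; it does not cover 16 bit or 64 bit modes. The function name operand_bytes is made up. It relies on two special cases: with mod 0, an r/m of 5 means a bare 32 bit displacement, and with mod 0, a SIB base of 5 means no base register plus a 32 bit displacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Returns how many bytes the modrm byte at p[0], its SIB byte (if any)
   and its displacement (if any) take up, assuming 32 bit addressing.
   The caller must make sure p[1] is readable when a SIB byte is present. */
size_t operand_bytes(const uint8_t *p)
{
    assert(p != NULL);
    unsigned mod = p[0] >> 6;
    unsigned rm = p[0] & 7;
    size_t n = 1;            /* the modrm byte itself */
    bool no_base = false;

    if (mod == 3)
        return n;            /* register direct: nothing follows */
    if (rm == 4) {           /* a SIB byte follows */
        no_base = (p[1] & 7) == 5;
        n++;
    }
    if (mod == 1)
        n += 1;              /* 8 bit displacement */
    else if (mod == 2)
        n += 4;              /* 32 bit displacement */
    else if (rm == 5 || no_base)
        n += 4;              /* mod 0 special cases: 32 bit displacement */
    return n;
}
```

For example, the modrm 0x4A (mod 1) is followed by one displacement byte, so operand_bytes reports 2, while 0x04 followed by the SIB byte 0x8B also reports 2 with no displacement at all.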