Wednesday, January 29, 2014

x86 instruction encoding examples

/* To follow up from earlier, let's go through some examples of x86 instruction encoding, focusing on the "modrm" and "SIB" bytes.

  The calling convention (Windows x64) passes the first four integer/pointer arguments in rcx, rdx, r8, r9.
  The examples were built and disassembled with:
  cl -Zi -GS- -GL -O1t  2.c -FAsc -LD -link -nod -noentry && link /dump /symbols /disasm 2.dll | more

  #include <stddef.h>
  typedef unsigned UINT; 
  #define EXPORT __declspec(dllexport)  /* to reduce line length */

  UINT b; 
  EXPORT UINT register_direct(UINT a) { return a+b; } 
    3 C1 add eax, ecx
    3 is add (there are other add opcodes, keep reading)   
    c1 is 11000001 
    11 is register direct 
    000 is eax   
    001 is ecx   
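
The field split used above can be sketched in C. This is a minimal illustration, not something from the compiler output; the `ModRM` type and `split_modrm` name are made up for this example.

```c
#include <stdint.h>

/* Hypothetical helper type (not from the original post): the three
   fields of a modrm byte. */
typedef struct {
    unsigned mod; /* bits 7..6: addressing mode */
    unsigned reg; /* bits 5..3: register number */
    unsigned rm;  /* bits 2..0: register number or memory form */
} ModRM;

/* Split a modrm byte into its 2+3+3 bit fields. */
ModRM split_modrm(uint8_t byte)
{
    ModRM m;
    m.mod = (byte >> 6) & 3u;
    m.reg = (byte >> 3) & 7u;
    m.rm  = byte & 7u;
    return m;
}
```

For the byte C1 this gives mod 3 (register direct), reg 0 (eax) and rm 1 (ecx), matching the breakdown above.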

  EXPORT void register_indirect(UINT a, UINT * b) { *b += a; } 
   1 A add dword ptr[rdx], ecx  
   1 is add (there are other add opcodes; in this case, direction is reversed)  
   A is 00001010  
   00 is register indirect  
   001 is ecx/rcx  
   010 is edx/rdx  

  EXPORT void register_indirect_displacement8(UINT a, UINT * b) { b[0x78/4] += a; } 
  1 4A 78 add dword ptr[rdx+78],ecx 
  1 is add again 
  4A is 01001010 
  01 is register indirect with 8 bit displacement 
  001 is ecx/rcx 
  010 is edx/rdx 
  78 is the displacement 

  EXPORT void register_indirect_displacement32(UINT a, UINT * b) { b[0x1234/4] += a; }   
  1 8A 34 12 00 00 add dword ptr[rdx+1234], ecx 
  1 is add 
  8A is 10001010 
  10 is register indirect with 32 bit displacement 
  001 is ecx/rcx 
  010 is edx/rdx   
  34 12 00 00 is the displacement 0x1234, stored low byte first 
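
As a rough sketch of how the displacement bytes in the last two examples turn back into a number: they are stored low byte first, and the 8 bit form is sign extended. The function name `read_displacement` is hypothetical, and the sketch deliberately ignores the mode 00 special cases covered later.

```c
#include <stdint.h>

/* Read the displacement that follows the modrm (and SIB) bytes.
   p points at the first displacement byte; mod is the 2 bit mode.
   Simplification: mode 00 is always treated as "no displacement". */
int32_t read_displacement(const uint8_t *p, unsigned mod)
{
    if (mod == 1) {
        /* 8 bit displacement, sign extended to 32 bits */
        int32_t v = p[0];
        if (v >= 128)
            v -= 256;
        return v;
    }
    if (mod == 2) {
        /* 32 bit displacement, little endian */
        uint32_t v = (uint32_t)p[0]
                   | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16)
                   | ((uint32_t)p[3] << 24);
        return (int32_t)v;
    }
    return 0;
}
```

With the bytes 34 12 00 00 and mode 2 this returns 0x1234; with the byte 78 and mode 1 it returns 0x78.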

  EXPORT UINT sib_without_displacement(UINT a, UINT * b) { return b[a]; } 
   mov eax, ecx
 8B 04 82  mov eax, dword ptr[rdx+rax*4]
 8B is mov
 04 is the modrm; modrm & 7 == 4 means there is a SIB byte
  82 is the SIB byte
  82 is 10000010
  10 is scale = 1 << 2 == 4 (binary 10 is 2)
  000 is index = rax, index is the one multiplied by scale
  010 is base = rdx
 It seems to me the compiler should have generated just one instruction:
  mov eax, dword ptr[rdx + rcx*4]
  However, the extra mov could be the compiler zero extending the lower 32 bits of rcx:
  a is only 32 bits wide, and writing eax clears the upper 32 bits of rax.
  We'll see in the next example. 

  EXPORT UINT sib_without_displacement_size(size_t a, UINT * b) { return b[a]; }   
 /* Yes. Here we get: 
  8B 04 8A mov eax, dword ptr[rdx+rcx*4]
  8B is mov 
  modrm 04 = 00 000 100
  00 means register indirect with no displacement
  000 is the destination register eax
  100 means there is a SIB byte
  8A is the SIB byte, 10001010: 10 is scale = 1<<2 = 4, 001 is index rcx, 010 is base rdx
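
The SIB breakdown can be sketched the same way as modrm. The `Sib` type and `split_sib` name are invented for this illustration; the scale field is returned as the actual multiplier rather than the raw two bits.

```c
#include <stdint.h>

/* Hypothetical helper type: the decoded fields of a SIB byte. */
typedef struct {
    unsigned scale; /* multiplier: 1, 2, 4 or 8 */
    unsigned index; /* bits 5..3: register multiplied by scale */
    unsigned base;  /* bits 2..0: register added as-is */
} Sib;

/* Split a SIB byte into scale, index and base. */
Sib split_sib(uint8_t byte)
{
    Sib s;
    s.scale = 1u << ((byte >> 6) & 3u); /* 2 bit field is a shift count */
    s.index = (byte >> 3) & 7u;
    s.base  = byte & 7u;
    return s;
}
```

For 82 this gives scale 4, index 0 (rax), base 2 (rdx); for 8A it gives scale 4, index 1 (rcx), base 2 (rdx).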

  EXPORT UINT sib_displacement8(UINT a, UINT * b) { return (b+0x78/4)[a]; }
  Again we have the mov eax, ecx, ok.
  8B 44 82  78 mov eax, dword ptr[rdx+rax*4+78]
 8B is mov
 modrm 44 = 01000100 = 01 000 100
  01 is register indirect with 8 bit displacement
  000 is the destination register eax
  100 for r/m means there is a SIB byte
 the SIB byte is 82 = 10000010
  10 is again scale = 4
  000 is the index = rax
  010 is the base = rdx
  78 is the displacement (or offset)

  EXPORT UINT sib_displacement32(UINT a, UINT * b) { return (b+0x1234/4)[a]; }
 again the mov eax, ecx
 8B 84 82 34 12 00 00 mov eax, dword ptr[rdx+rax*4+1234]
 8B is mov
 modrm 84 = 10000100 = 10 000 100
 10 is register indirect with 32 bit displacement (or offset)
 000 is destination register eax
 100 means there is a SIB byte
 SIB = 82 = 10000010 = 10 000 010
  10 is scale = 4
  000 is index rax
  010 is base rdx
  34 12 00 00 are the displacement bytes
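
Putting the examples so far together, the modrm byte alone is almost enough to tell how many more addressing bytes follow. Below is a minimal sketch under that reading; `modrm_tail_length` is a hypothetical name. The last branch (SIB base 101 with mode 00 means a 32 bit displacement and no base register) does not appear in the examples above and is included from my understanding of the manuals.

```c
#include <stdint.h>

/* Count the SIB and displacement bytes that follow a modrm byte,
   for 32/64 bit addressing. p[0] is the modrm byte; p[1] is read
   only when the modrm byte says a SIB byte is present. */
unsigned modrm_tail_length(const uint8_t *p)
{
    unsigned mod = (p[0] >> 6) & 3u;
    unsigned rm  = p[0] & 7u;
    unsigned has_sib, length;

    if (mod == 3)
        return 0;                 /* register direct: nothing follows */

    has_sib = (rm == 4);          /* r/m 100 means a SIB byte */
    length = has_sib;
    if (mod == 1)
        length += 1;              /* 8 bit displacement */
    else if (mod == 2)
        length += 4;              /* 32 bit displacement */
    else if (rm == 5)
        length += 4;              /* mode 00, r/m 101: offset only */
    else if (has_sib && (p[1] & 7u) == 5)
        length += 4;              /* mode 00, SIB base 101: no base */
    return length;
}
```

For the byte sequences 04 82, 44 82 78 and 84 82 34 12 00 00 this returns 1, 2 and 5 respectively.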

  #if defined(_AMD64_) || defined(_M_AMD64)   
  UINT a[100];   
  // rip relative is very limited -- no scale/index/base/displacement   
  // just rip + offset   
  EXPORT UINT rip_relative() { return a[0]; }   
    8B 5 .. .. .. ..  mov eax, dword ptr[a]  
    8B is mov   
    modrm 5 = 00 000 101  
    00 is register indirect with no displacement 
    000 is the destination register eax  
    101 means RIP relative, and is only allowed with mode == 00  
     A separate constant 8 or 32 bit displacement would add little here: it could simply be folded into the RIP-relative offset. 
     All it would buy is a little more reach (8 bit + 32 bit) or double the reach (32 bit + 32 bit).  
    Then there are 4 bytes for the offset.  
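
To make the 4 byte offset concrete: it is relative to the address of the next instruction, not the current one. A small sketch of that arithmetic follows; the function name and the addresses used in the note below are made up for illustration.

```c
#include <stdint.h>

/* Compute the address a RIP-relative operand refers to.
   instruction_address: where the instruction starts
   instruction_length:  total bytes in the instruction (6 for 8B 05 + offset)
   disp32:              the signed 4 byte offset from the encoding */
uint64_t rip_relative_target(uint64_t instruction_address,
                             unsigned instruction_length,
                             int32_t disp32)
{
    /* RIP already points past this instruction when the offset is applied. */
    return instruction_address + instruction_length + (uint64_t)(int64_t)disp32;
}
```

For a 6 byte instruction at the hypothetical address 0x1000 with offset 0x20, the target is 0x1026.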

   EXPORT void rip_relative2(UINT b) { a[0] += b; }    
   /* Almost the same, but I wanted to avoid a field of zeros for rax.  
   1 D .. .. .. .. add dword ptr[a], ecx  
   modrm = D = 00001101 = 00 001 101  
    00 mode register indirect  
    001 ecx  
    101 RIP relative  
   Notice that sometimes in these examples add is 1 and sometimes it is 3.  
   There are even more options. 
   Some opcodes have a "direction" in them. From these examples, we can see that it is the second-lowest bit, the value 2: 
   with 1 the r/m operand is the destination, with 3 the reg operand is the destination.   
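
A tiny sketch of testing that bit, assuming the opcode is one that follows this pattern (add 1/3 and mov 89/8B do; many opcodes have no such bit). The function name is hypothetical.

```c
#include <stdint.h>

/* Returns 1 when the reg field is the destination (as in opcode 03),
   0 when the r/m field is the destination (as in opcode 01).
   Only meaningful for opcodes that actually have a direction bit. */
int reg_is_destination(uint8_t opcode)
{
    return (opcode >> 1) & 1;
}
```

This returns 0 for 01 (add dword ptr[rdx], ecx) and 1 for 03 (add eax, ecx) and for 8B (mov eax, dword ptr[...]).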

   /* Now let's demonstrate register numbering. Here I am limited to a 32 bit system. 
   A good way to see how some bytes decode is to write them into arbitrary memory in a debugger 
    that you started just for this purpose. 
   I do this: 
     \bin\x86\cdb cmd  
     to start up the Windows console debugger on a new dummy command line process.  
      Then I use "eb" for edit bytes, "." for the current instruction pointer (EIP or RIP), "u" for unassemble (disassemble), and "l1" for length 1.   
      I suppose this is what people used to use MS-DOS "debug.exe" for. 
      Like this:   
    \bin\x86\cdb cmd   
   0:000> eb . 1 2 3 4 ; u . l1 
   0102            add     dword ptr [edx],eax
   1 is add 
   modrm 2 = 00000010 = 00 000 010  
   mode 00 register indirect with no displacement
   000 = eax
   010 is edx
   So let's see exactly how all the registers are numbered.  
    0:000> eb . 1 0<<3  ; u . l1 
     0100            add     dword ptr [eax],eax   
    0:000> eb . 1 1<<3  ; u . l1   
     0108            add     dword ptr [eax],ecx   
    0:000> eb . 1 2<<3  ; u . l1   
     0110            add     dword ptr [eax],edx   
    0:000> eb . 1 3<<3  ; u . l1 
     0118            add     dword ptr [eax],ebx   
    0:000> eb . 1 4<<3  ; u . l1   
     0120            add     dword ptr [eax],esp   
    0:000> eb . 1 5<<3  ; u . l1   
     0128            add     dword ptr [eax],ebp   
    0:000> eb . 1 6<<3  ; u . l1   
     0130            add     dword ptr [eax],esi   
    0:000> eb . 1 7<<3  ; u . l1   
     0138            add     dword ptr [eax],edi   
    Remember not to take this as the entire truth, because there are special cases in the r/m field.  
    The special cases involve esp/rsp/4 and ebp/rbp/5: r/m 4 indicates SIB byte presence, and r/m 5 with mode 00 indicates RIP relative (a bare 32 bit offset on a 32 bit system).  
    Those registers are not quite as general as the others.  
    And I still haven't covered the 64 bit changes.  
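
The numbering seen in the debugger session can be written down as a lookup table. This is only a sketch for the 32 bit register names in the reg field; the names `REG32_NAMES` and `reg_field_name` are made up here, and the r/m special cases are not handled.

```c
#include <stdint.h>

/* 32 bit register names in encoding order, as shown by the debugger. */
static const char *const REG32_NAMES[8] = {
    "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Name of the register selected by the reg field (bits 5..3) of a modrm byte. */
const char *reg_field_name(uint8_t modrm)
{
    return REG32_NAMES[(modrm >> 3) & 7u];
}
```

For example, the modrm byte 3<<3 (hex 18) selects ebx, matching the 0118 line above.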

 If you are really interested in this stuff, I encourage you to go through it all in complete detail and probably change the samples or write your own. A nice change is to reorder the parameters or add extra "dummy" parameters to push the values into other registers. I suggest no more than 4 parameters per function for learning purposes, otherwise you'll get extra instructions reading the values off of the stack and lose predictability as to which register is used.


Sunday, January 12, 2014

x86 instruction encoding

This is well documented in the manuals.
x86 instructions look like this:

 optional prefix bytes
 opcode bytes
 optional modrm byte
 optional SIB byte
 optional displacement bytes
 optional immediate bytes

The maximum size is 15 bytes.
Prefix bytes include segment overrides, size overrides, lock prefix, repe/repne.
The presence of modrm and immediate is dependent on the opcode.
The presence of displacement depends on opcode and/or modrm.

Some opcodes are one byte, like push/pop.
Some opcodes have implied register use, like push/pop.
Some opcodes have no modrm/sib but do have displacement, like jmp/call.
Many opcodes have no immediate. An example that does have immediate is add.

Let's dig into modrm/sib.
modrm is a byte with three fields.
 two bit mode, let's call it mod.
 three bit reg, let's call it r
 three bit reg or memory, let's call it r/m

The fields are laid out left to right starting at the most significant bit, so mod occupies the top two bits and its values 0-3 show up in the byte as 0, 0x40, 0x80, 0xC0.

0 is register indirect with no displacement
0x40 is register indirect with an 8 bit displacement
0x80 is register indirect with a 32bit displacement
0xC0 is register direct.
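
As a quick sketch of how the three fields pack into one byte (the function name `make_modrm` is made up for this illustration):

```c
#include <stdint.h>

/* Pack mod (0-3), reg (0-7) and r/m (0-7) into a modrm byte.
   Out-of-range inputs are masked rather than rejected. */
uint8_t make_modrm(unsigned mod, unsigned reg, unsigned rm)
{
    return (uint8_t)(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}
```

With reg and r/m both 0, the four modes come out as 0, 0x40, 0x80 and 0xC0, as listed above.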

The three bit fields of course take on 8 values 0-7.
The registers are numbered eax=0, ecx=1, edx=2, ebx=3, esp=4, ebp=5, esi=6, edi=7.

For example, let's suppose "add" is 0. (It sort of is.)

Let's use "0b" for binary.

add edx, [ecx]
would be 0 0b00 010 001

add ecx, [edx+4]
would be 0 0b01 001 010 04

add ecx, [edx+0x12345678]
would be 0 0b10 001 010 78 56 34 12

add ecx, edx
would be 0 0b11 001 010
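
Here is a hedged sketch of an encoder for the memory forms of this instruction. It uses the real opcode 3 (add with the register as destination) instead of the pretend 0, and picks the mode from the size of the displacement. The function name is hypothetical, and base registers 4 (esp) and 5 (ebp) are rejected because they need special handling.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode "add dst, [base + disp]" using opcode 03 (add r32, r/m32).
   Returns the number of bytes written to out (at most 6), or 0 for
   base registers 4 (esp) and 5 (ebp), which this sketch leaves out. */
size_t encode_add_reg_mem(uint8_t *out, unsigned dst, unsigned base, int32_t disp)
{
    unsigned mod;
    size_t n = 0, i, disp_bytes;

    if (dst > 7 || base > 7 || base == 4 || base == 5)
        return 0;

    if (disp == 0) {
        mod = 0; disp_bytes = 0;      /* no displacement */
    } else if (disp >= -128 && disp <= 127) {
        mod = 1; disp_bytes = 1;      /* 8 bit displacement */
    } else {
        mod = 2; disp_bytes = 4;      /* 32 bit displacement */
    }

    out[n++] = 0x03;
    out[n++] = (uint8_t)((mod << 6) | (dst << 3) | base);
    for (i = 0; i < disp_bytes; i++)  /* little endian */
        out[n++] = (uint8_t)((uint32_t)disp >> (8 * i));
    return n;
}
```

With dst = 1 (ecx), base = 2 (edx) and disp = 4, this writes the three bytes 03 4A 04.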

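If r/m is 4 (and mod is not 0b11), the rules change slightly. (r/m 5 with mod 0b00 is a separate special case: just a 32 bit offset with no base register, which becomes RIP-relative in 64 bit mode.)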
Instead of that being a register in the normal scheme, it means there is a "SIB" byte.
"SIB" is scale-index-base.

You can say things like:
 add eax, [4*edx+ecx] 
 where 4 is scale
 edx is index
 ecx is base

Imagine a function like:

int get_array_element(int * array, int index)
{
  return array[index];
}

Let's pretend array is in ebx, index is in ecx.

This would look like
 mov eax, [4 * ecx + ebx]

The SIB byte, similar to the modrm byte, has three fields:
  2 bit scale
  3 bit index
  3 bit base

scale 0: 1
scale 1: 2
scale 2: 4
scale 3: 8
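
The scale table and the three fields can be combined into a small sketch that builds a SIB byte from a human-readable multiplier. The name `make_sib` is made up here, and the special "no register" field values are not handled.

```c
#include <stdint.h>

/* Build a SIB byte from a scale factor (1, 2, 4 or 8) and two register
   numbers. Returns -1 if the scale factor is not encodable. */
int make_sib(unsigned scale_factor, unsigned index, unsigned base)
{
    unsigned scale_bits;

    switch (scale_factor) {
    case 1: scale_bits = 0; break;
    case 2: scale_bits = 1; break;
    case 4: scale_bits = 2; break;
    case 8: scale_bits = 3; break;
    default: return -1;
    }
    return (int)((scale_bits << 6) | ((index & 7u) << 3) | (base & 7u));
}
```

For mov eax, [4 * ecx + ebx], with ecx = 1 as index and ebx = 3 as base, this gives 0b10 001 011, which is 0x8B.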

There is a little more to this but I have to run for now.
There are values for the SIB fields that mean no register.

There is also the extension of all this to 64 bits, which brings RIP-relative addressing with it.