Efficiency: Reverse Engineering with ghidra

It’s been a while since I haven’t written anything on here, but I thought I’d do a quick write-up for one of the challenges from RTFM.

We’re given a single binary and told to find a flag. The executable

~/Projects/ctf/rtfm$ file efficiency_fixed
efficiency_fixed: ELF 64-bit LSB shared object, x86-64, ...
~/Projects/ctf/rtfm$ ./efficiency_fixed
Please enter the password:
foobar

We don’t get any more output after we entered the “password” (foobar in the example above.)

Enter Ghidra

I’m going to be using Ghidra for the reverse engineering but other tools work in similar ways and this write-up could probably be followed with any of them.

After creating a project in ghidra for the CTF (or just using your everything-goes-here project) and after using File > Import File to add our binary to the project we can open it tell Ghidra that Yes we would like to analyze the file right now when prompted.

In the Symbol Tree we notice there are only a dozen functions we don’t recognize so let’s start looking at what they do.

void FUN_00101020(void) {
  // WARNING: Treating indirect jump as call
  (*(code *)(undefined *)0x0)();
  return;
}

Okay. Looking at the listing (assembly) for this we see it’s an indirect jump to something taken at a global address. This function also doesn’t seem to be referenced by any of the others. Let’s skip this for now, probably not important.

Lots of small functions

FUN_001010a0 and FUN_001010d0 are no more interesting than the previous one, but then we have a whole lot of very small functions that seem to actually do stuff.

Most of them take two arguments and do a very simple operations. This looks a lot like the naive implementation of byte-code interpreters we are used to seeing in CTFs. Let’s rename all those functions to what we assume they’ll be doing:

void FUN_00101155(undefined4 *puParm1, undefined4 *puParm2) {
  *puParm1 = *puParm2;
  return;
}

FUN_00101155 seems to be some sort of mov instruction. Let’s rename it to do_mov_aX_bX. I like being explicit about the number of arguments they take (following “destination before source” convention) and about whether they deference them or not. The X indicates those arguments are dereferenced which would be obvious for the destination (a) but not so much for the source (b). At this point you could also take the time to update the function prototypes in Ghidra, letting it know that they take (int *, int *) (for most of them.)

If we do the same for all the little functions we end up with:

void do_mov_aX_bX(undefined4 *puParm1, undefined4 *puParm2) {
  *puParm1 = *puParm2;
}

void do_mov_aX_b(undefined4 *puParm1, undefined4 uParm2) {
  *puParm1 = uParm2;
}

void do_xor_aX_bX(uint *puParm1, uint *puParm2) {
  *puParm1 = *puParm1 ^ *puParm2;
}

void do_add_aX_bX(int *piParm1, int *piParm2) {
  *piParm1 = *piParm1 + *piParm2;
}

void do_sub_aX_bX(int *piParm1, int *piParm2) {
  *piParm1 = *piParm1 - *piParm2;
}

void do_rol_aX_bX(int *piParm1, undefined8 uParm2) {
  *piParm1 = *piParm1 << ((byte)uParm2 & 0x1f);
}

void do_ror_aX_bX(int *piParm1, undefined8 uParm2) {
  *piParm1 = *piParm1 >> ((byte)uParm2 & 0x1f);
}

Dealing with global variables

The functions above all follow the exact same pattern so it’s easy to name them. We also encounter some of those small functions that read and write to some global DAT_something variables. Let’s just continue with our naming scheme until we find anything better to do:

void do_mov_DAT_0010506c_aX(undefined4 *puParm1) {
  DAT_0010506c = *puParm1;
}

void do_mov_DAT_00105068_cmp_aX_bX(int *piParm1, int *piParm2) {
  DAT_00105068 = *piParm1 != *piParm2;
}

With that last one we’re starting to see that those global variables might be used to store things like flags and important information about the state of the virtual machine that might (we’re guessing!) be running inside this binary.

Our suspicions are confirmed by the following three functions which look an awful lot like (unconditional) jmp, jnz (jmp if non zero) and jz (jmp if zero).


void do_jmp_a(undefined4 uParm1) {
  DAT_00105064 = uParm1;
}

void do_jnz_a(undefined4 uParm1) {
  if (DAT_00105068 != 0) {
    DAT_00105064 = uParm1;
  }
}

void do_jz_a(undefined4 uParm1) {
  if (DAT_00105068 == 0) {
    DAT_00105064 = uParm1;
  }
}

We are now presuming the following:

  • DAT_00105068 is actually a global SHOULDJMP flag.
  • DAT_00105064 is what is modified by the jump, it’s reasonable to assume it is the current instruction pointer DATA_IP.

We still haven’t seen anything resembling the main function, or whatever it is that is prompting us for a password when running the binary. We could go searching for that but let’s just finish all functions in order, we’re almost done anyway.

FUN_001012c6 - Re-discovering mathematics.

The next function is somewhat more complicated than the small operations we’ve been seeing so far.

void FUN_001012c6(int *piParm1,int *piParm2) {
  long local_20;
  ulong local_18;
  long local_10;

  local_10 = 1;
  local_18 = SEXT48(DAT_00104074);
  local_20 = (long)*piParm1;
  while (local_18 != 0) {
    if ((local_18 & 1) != 0) {
      local_10 = SUB168((ZEXT816(0) << 0x40 | ZEXT816((ulong)(local_10 * local_20))) %
                        ZEXT816((ulong)(long)*piParm2),0);
    }
    local_18 >>= 1;
    local_20 = SUB168((ZEXT816(0) << 0x40 | ZEXT816((ulong)(local_20 * local_20))) %
                      ZEXT816((ulong)(long)*piParm2),0);
  }
  *piParm1 = (int)local_10;
}

SEXT48, SUB168 and ZEXT816 are not standard C and might look weird, ghidra’s built-in help is really good and deserves to be used a lot.

In the Decompiler section we find that:

SUB41(x,c) - truncation operation The 4 is the size of the input operand (x) in bytes. The 1 is the size of the output value in bytes. The x is the thing being truncated The c is the number of least significant bytes being truncated

EXT14(x) - zero extension The 1 is the size of the operand x The 4 is the size of the output in bytes This is almost always a cast from small integer types to big unsigned types.

SEXT14(x) - signed extension The 1 is the size of the operand x The 4 is the size of the output in bytes This is probably a cast from a small signed integer into a big signed integer.

All this is simply an indication that the compiler had to juggle between 4 byte, 8 byte and 16 byte operands for the code we’re looking at. In most cases we probably don’t care about that for understanding the code so let’s simplify for readability:

void FUN_001012c6(int *piParm1,int *piParm2) {
  long local_20;
  ulong bit_vector;
  long result;

  result = 1;
  bit_vector = DAT_00104074;
  local_20 = *piParm1;
  while (bit_vector != 0) {
    if ((bit_vector & 1) != 0) {
      result = (result * local_20) % *piParm2;
    }
    bit_vector >>= 1;
    local_20 = (local_20 * local_20) % *piParm2;
  }
  *piParm1 = result;
}

Notice that we’re looping over all the bits inbit_vector (originally local_18) and computing local_10 (renamed to result) which gets assigned to *piParm1 as usual with all the operations we’ve seen so far. *piParm2 seems to be some sort of modulus as it’s only used for that.

According to ghidra DAT_00104074 is 0x10001 and there are no other direct references to it (apart from the function we’re looking at) that could modify it so for now let’s assume that’s a constant.

Rewriting this function in python and unrolling the loop:

def do_something(a, b):
    result = 1

    # first bit is always 1
    result = (result * a) % b
    a = (a * a) % b

    # then we always have 15 bits that are 0
    # a = (a * a) % b  # 15 times, which is a ** 2 ** 15
    a = (a ** 2 ** 15) % b

    # last bit is always a 1 again
    result = (result * a) % b
    a = (a * a) % b  # Notice this line does not affect the result.

    return result

Simplifying the above again we get:

def do_something(a, b):
    result = a % b
    # Now a changes:
    a = a ** 2 ** 16  # because: ((a ** 2) ** 2 ** 15) % b
    # And is used again in the result:
    return (result * a) % b

Finally we end up with return (a * a ** 2 ** 16) % b

  • which is return (a ** (2 ** 16 + 1)) % b
  • which is return (a ** 0x10001) % b

So this function is basically just a do_aX_pow_0x10001_mod_bX operation, If you couldn’t tell because of my way of solving maths above: I’m not really good at maths.

And with that it looks like we’re done with all the small-ish functions that take one or two parameters and store their result in the first one.

On to main.


undefined8 FUN_001017e4(void) {
  // ...
  puts("Please enter the password: ");
  read(0,&local_28,0x14);
  FUN_0010138b(&local_28);
}

Seems like we need to look at FUN_0010138b next: it takes our password as input.

It’s a big function! It has two while loops followed by a do { } while with lots of if/else inside. Looks like we found whatever it is that’s going to call all of the functions read before.

Let’s focus on the big mess first:

 do {
  a = local_data[(long)(DATA_IP + 1)];
  b = local_data[(long)(DATA_IP + 2)];
  op = local_data[(long)DATA_IP];
  if (op == 0x789abcde) {
   do_cmp_ax_bx(&DAT_00104060 + (long)a,&DAT_00104060 + (long)(int)b,
          &DAT_00104060 + (long)(int)b);
  }
  else {
   if (op < 0x789abcdf) {
    if (op == 0x6789abcd) {
     do_jmp_a();
    }
    else {
     if (op < 0x6789abce) {
      if (op == 0x56789abc) {
       do_jz_a();
      }
      else {
       if (op < 0x56789abd) {
        if (op == 0x456789ab) {
         do_add_aX_bX(&DAT_00104060 + (long)a,&DAT_00104060 + (long)(int)b,
                &DAT_00104060 + (long)(int)b);
        }
        else {
         if (op < 0x456789ac) {
          if (op == 0x3456789a) {
           do_xor_aX_bX(&DAT_00104060 + (long)a,&DAT_00104060 + (long)(int)b,
                  &DAT_00104060 + (long)(int)b);
          }
          else {
           if (op < 0x3456789b) {
            if (op == 0x23456789) {
             do_mov_aX_b(&DAT_00104060 + (long)a,(ulong)b);
            }
            else {
             if (op < 0x2345678a) {
              if (op == 0x12345678) {
               do_mov_aX_bX();
              }
              else {
               if (op < 0x12345679) {
                if (op == -0x10fedcbb) {
                 do_sub_aX_bX();
                }
                else {
                 if (op < -0x10fedcba) {
                  if (op == -0x210fedcc) {
                   do_jnz_a();
                  }
                  else {
                   if (op < -0x210fedcb) {
                    if (op == -0x3210fedd) {
                     do_aX_pow_0x10001_mod_bX();
                    }
                    else {
                     if (op < -0x3210fedc) {
                      if (op == -0x43210fee) {
                       do_mov_DAT_0010506c_aX();
                      }
                      else {
                       if (op < -0x43210fed) {
                        if (op == -0x543210ff) {
          // WARNING: Subroutine does not return
                         exit(DAT_0010506c);
                        }
                        if (op < -0x543210fe) {
                         if (op == -0x76543211) {
                          do_rol_aX_b();
                         }
                         else {
                          if (op == -0x65432110) {
                           do_ror_aX_b();
                          }
                         }
                        }
                       }
                      }
                     }
                    }
                   }
                  }
                 }
                }
               }
              }
             }
            }
           }
          }
         }
        }
       }
      }
     }
    }
   }
  }
  DATA_IP += 3;
 } while( true );

All the < comparisons are compilation artifacts, the original code probably had a switch/case and the compiler decided to implement some sort of binary balanced tree to reduce the number of comparisons needed for each case. We can safely remove them since the == comparisons are sufficient.

We also notice that some of the functions we know about (eg: do_mov_aX_bX) don’t seem to take the right numbers of arguments. It’s hard for the decompiler to guess those in all circumstances but we can help! right click > Edit Function Signature and we can set the prototype we wish for all those functions.

With those two things everything looks much cleaner:

do {
    param_a = local_data[(DATA_IP + 1)];
    param_b = local_data[(DATA_IP + 2)];
    op = local_data[DATA_IP];
    if (op == 0x789abcde) {
        do_cmp_ax_bx();
    }
    if (op == 0x6789abcd) {
        do_jmp_a((param_a + -1) * 3);
    }
    if (op == 0x56789abc) {
        do_jz_a((param_a + -1) * 3);
    }
    if (op == 0x456789ab) {
        do_add_ax_bx
            (&DAT_00104060 + param_a,
             &DAT_00104060 + param_b);
    }
    if (op == 0x3456789a) {
        do_xor_ax_bx
            (&DAT_00104060 + param_a,
             &DAT_00104060 + param_b);
    }
    if (op == 0x23456789) {
        do_mov_ax_b
            (&DAT_00104060 + param_a,
             param_b);
    }

    if (op == 0x12345678) {
        do_mov_ax_bx
            (&DAT_00104060 + param_a,
             &DAT_00104060 + param_b);
    }
    if (op == -0x10fedcbb) {
        do_sub_ax_bx
            (&DAT_00104060 + param_a,
             &DAT_00104060 + param_b);
    }
    if (op == -0x210fedcc) {
        do_jnz_a((param_a + -1) * 3);
    }
    if (op == -0x3210fedd) {
        do_aX_pow_0x10001_mod_bX
            (&DAT_00104060 + param_a,
             &DAT_00104060 + param_b);
    }
    if (op == -0x43210fee) {
        do_mov_DATA_ax(&DAT_00104060 + param_a);
    }
    if (op == -0x543210ff) {
        // WARNING: Subroutine does not return
        exit(DAT_0010506c);
    }
    if (op == -0x76543211) {
        do_rol_ax_b(&DAT_00104060 + param_a,
                    param_b);
    }
    if (op == -0x65432110) {
        do_ror_ax_b(&DAT_00104060 + param_a,
                    param_b);
    }
    DATA_IP += 3;
 }  while (true);

Much more readable already !

A couple things we learned from this:

  • DAT_0010506c is the value with which the program will exit, which means that do_mov_DAT_0010506c_aX can be renamed to do_mov_DAT_EXIT_aX or do_ldexit_aX.
  • We now have a mapping between “instruction numbers” and functionality.
  • It looks like instructions are encoded on 3 * 4bytes in local_data.

Getting the code

At this point we could either attach gdb and set a breakpoint when we know local_data has been initialized and dump the instructions like that.

Or read the RTFM (rest of fucking main) and learn that it’s being loaded from global memory, relevant code;

  puVar4 = &DAT_00102020;
  puVar5 = local_data;
  while (lVar3 != 0) {
    lVar3 += -1;
    *puVar5 = *puVar4;
    puVar4 = puVar4 + 1;
    puVar5 = puVar5 + 1;
  }

Either way we find the code and we can start writing a quick dis-assembler for the operations we now about:

#!/usr/bin/env python3

import struct


DATA_CODE = bytes.fromhex(
    "896745230400000003000000cdab8967040000000000000012f0debcff030000"
    "0000000001efcdab000000000000000089674523050000000100010089674523"
    "00010000fa20140389674523000200002d74c77789674523010100006bda742b"
    "89674523010200002de3617d8967452302010000bf8286638967452302020000"
    "19bc4d7b896745230301000021d7415989674523030200005f6ec26289674523"
    "04010000bb41ed5c8967452304020000f79364682301efcd1000000000020000"
    "debc9a7810000000000100003412f0de02000000000000002301efcd11000000"
    "01020000debc9a7811000000010100003412f0de02000000000000002301efcd"
    "1200000002020000debc9a7812000000020100003412f0de0200000000000000"
    "2301efcd1300000003020000debc9a7813000000030100003412f0de02000000"
    "000000002301efcd1400000004020000debc9a7814000000040100003412f0de"
    "020000000000000089674523ff03000001000000cdab89670200000000000000"
)


def arg(x):
    """Represent a direct argument."""
    return "%#x" % x


def argX(x):
    """Represent a de-referenced argument."""
    return "[%#x]" % x


def noarg(x):
    """Represent the absence of argument."""
    return ""


OP_MAP = {
    0x789abcde:  ("cmp",    argX,  argX),
    0x6789abcd:  ("jmp",    arg,   noarg),
    0x56789abc:  ("jz",     arg,   noarg),
    0x456789ab:  ("add",    argX,  argX),
    0x3456789a:  ("xor",    argX,  argX),
    0x23456789:  ("mov",    argX,  arg),
    0x12345678:  ("mov",    argX,  argX),
    -0x10fedcbb: ("sub",    argX,  argX),
    -0x210fedcc: ("jnz",    arg,   noarg),
    -0x3210fedd: ("powmod", argX,  argX),
    -0x43210fee: ("ldexit", argX,  noarg),
    -0x543210ff: ("exit",   noarg, noarg),
    -0x76543211: ("rol",    argX,  arg),
    -0x65432110: ("ror",    argX,  arg),
}


if __name__ == "__main__":

    # Loop over DATA_CODE in chunks of 3*4 bytes because we will load
    # the op and the two args which are four bytes each.
    for i, ci in enumerate(range(0, len(DATA_CODE), 3*4)):
        op, a, b = struct.unpack("iii", DATA_CODE[ci:ci+3*4])
        op, rep_a, rep_b = OP_MAP[op]
        print("%2d:  %8s  %8s %s" %  (i, op, rep_a(a), rep_b(b)))

This program when run gives us a nice idea of what we are looking at.

Understanding the code

~/Projects/ctf/rtfm$ ./writeup.py
 0:       mov     [0x4] 0x3
 1:       jmp       0x4
 2:    ldexit   [0x3ff]
 3:      exit
 4:       mov     [0x5] 0x10001
 5:       mov   [0x100] 0x31420fa
 6:       mov   [0x200] 0x77c7742d
 7:       mov   [0x101] 0x2b74da6b
 8:       mov   [0x201] 0x7d61e32d
 9:       mov   [0x102] 0x638682bf
10:       mov   [0x202] 0x7b4dbc19
11:       mov   [0x103] 0x5941d721
12:       mov   [0x203] 0x62c26e5f
13:       mov   [0x104] 0x5ced41bb
14:       mov   [0x204] 0x686493f7
15:    powmod    [0x10] [0x200]
16:       cmp    [0x10] [0x100]
17:       jnz       0x2
18:    powmod    [0x11] [0x201]
19:       cmp    [0x11] [0x101]
20:       jnz       0x2
21:    powmod    [0x12] [0x202]
22:       cmp    [0x12] [0x102]
23:       jnz       0x2
24:    powmod    [0x13] [0x203]
25:       cmp    [0x13] [0x103]
26:       jnz       0x2
27:    powmod    [0x14] [0x204]
28:       cmp    [0x14] [0x104]
29:       jnz       0x2
30:       mov   [0x3ff] 0x1
31:       jmp       0x2

Reading this it looks like we always end up jumping to 0x2 which ldexit [0x3ff] and then exit with that value.

Since we still don’t know where we’re going to get the flag the best we can do is try to exit with a non-zero value. (Since when we input garbage as a password the program exits with status code 0.)

The only way to do that is on line 30: when 0x1 is loaded into [0x3ff] so the goal becomes to get there.

To do so we need to successfully avoid all the jnz before that.

Re-ordering the code for clarity we get:

 5:       mov   [0x100] 0x31420fa
 6:       mov   [0x200] 0x77c7742d
15:    powmod    [0x10] [0x200]
16:       cmp    [0x10] [0x100]
17:       jnz       0x2

 7:       mov   [0x101] 0x2b74da6b
 8:       mov   [0x201] 0x7d61e32d
18:    powmod    [0x11] [0x201]
19:       cmp    [0x11] [0x101]
20:       jnz       0x2

 9:       mov   [0x102] 0x638682bf
10:       mov   [0x202] 0x7b4dbc19
21:    powmod    [0x12] [0x202]
22:       cmp    [0x12] [0x102]
23:       jnz       0x2

11:       mov   [0x103] 0x5941d721
12:       mov   [0x203] 0x62c26e5f
24:    powmod    [0x13] [0x203]
25:       cmp    [0x13] [0x103]
26:       jnz       0x2

13:       mov   [0x104] 0x5ced41bb
14:       mov   [0x204] 0x686493f7
27:    powmod    [0x14] [0x204]
28:       cmp    [0x14] [0x104]
29:       jnz       0x2

We know that powmod is a ** 0x10001 % b so this gives us:

0x031420fa == [0x10] ** 0x10001 % 0x77c7742d
0x2b74da6b == [0x11] ** 0x10001 % 0x7d61e32d
0x638682bf == [0x12] ** 0x10001 % 0x7b4dbc19
0x5941d721 == [0x13] ** 0x10001 % 0x62c26e5f
0x5ced41bb == [0x14] ** 0x10001 % 0x686493f7

At this point I’m starting to assume that 0x10-0x14 will contain our input in some form. So let’s solve these equations and see what gives.

Wait. I’m bad at math.

https://www.wolframalpha.com/

But it doesn’t understand python. Rewrite everything to human-ish-speak.

x ^ 65537 = 51650810 mod 2009560109
x ^ 65537 = 729078379 mod 2103567149
x ^ 65537 = 1669759679 mod 2068691993
x ^ 65537 = 1497487137 mod 1656909407
x ^ 65537 = 1559052731 mod 1751421943

Copy pasting the above (line by line) in wolfram we get back:

x congruent 1936287603 (mod 2009560109)
x congruent 1701279355 (mod 2103567149)
x congruent 1447900004 (mod 2068691993)
x congruent 1601401973 (mod 1656909407)
x congruent 1717969277 (mod 1751421943)

Let’s see what that means if that was part of a password:

>>> results = [1936287603, 1701279355, 1447900004, 1601401973, 1717969277]
>>> print(struct.pack("iiiii", *results))
b'sgis{vged3MVuts_}!ff'

That almost looks like a flag, but it might be big endian instead of (default) little endian. Let’s try again:

>>> print(struct.pack(">iiiii", *results))
b'sigsegv{VM3d_stuff!}'

There we go.

All in all this was an excellent typical reverse engineering challenge to practice understanding ctf-style bytecode interpreters.

Comments