keygenme_v7 (Part 2)
14 June, 2020
This post is the second part on the keygenme_v7 crackme. The first part can be found here. It is recommended that you read the first part, since it gently introduces some concepts that are applied in a much more advanced context here.
We left the first part having tackled a large chunk of the executable code in the binary, leaving only:
- The unexplored branches of the window handler function at 0x402121
- The hook applied to `CharUpperW()` at 0x4017BE
- The function at 0x401856
The majority of the branches in the window handler function are uninteresting,
and just set up the buttons and text, or tear down the window when it’s closed.
The branch we are interested in is at 0x402181, which with a debugger we can
observe is followed when we click the ‘Register’ button. In this case, the jump
is not taken, and two addresses are pushed to the stack followed by a `ret`, a
pattern we’ve seen before. The two addresses are 0x40218E, the other side of
this branch, and 0x401856. The return instruction causes execution to jump to
that latter address.
sub_401856
Here, finally, is where the user input is processed. The function immediately
starts with three calls to `GetWindowTextLengthW()`. The first two take the
address 0x403024, the third takes 0x40302C. Using a debugger, we can see that
the former is the ‘Username’ field, and the latter is the ‘Serial’. The return
value of each call is followed by a comparison and a jump, and if the jump is
taken then the function returns. This imposes constraints on the number of
characters in the two fields. These are:
- Username is between 3 and 16 characters, inclusive
- Serial is exactly 32 characters
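As a minimal sketch, the three length checks can be modelled in C as below. The function name `lengths_ok` is hypothetical, and the order of the comparisons is an assumption; only the resulting constraints come from the analysis above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper modelling the three GetWindowTextLengthW() checks.
 * The order of the comparisons is assumed, not read from the binary. */
bool lengths_ok(size_t username_len, size_t serial_len)
{
    if (username_len < 3)      /* username too short */
        return false;
    if (username_len > 16)     /* username too long */
        return false;
    return serial_len == 32;   /* serial must be exactly 32 characters */
}
```

With the test input used in this post, `lengths_ok(3, 32)` holds, so execution would carry on past the early returns.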
If we fill in the fields within those constraints (for now let’s use three As for
the username, and 32 Bs for the serial), we get to the block of code at
0x4018A0. A lot of stack variables and one global are initialised here, most to
zero, but `[ebp-0x30]` is set to -1, and the global byte at 0x403020 is set to 1.
Two buffers are zeroed with `memset()`; the first is 34 bytes, the second is 66.
Each of these buffers is then used as input to the `GetDlgItemTextW()` function,
which pulls the input strings out of the text fields. The username goes into the
smaller buffer, the serial into the larger.
It would therefore be useful to mark in the disassembler that:
- `[ebp+C0]` = username
- `[ebp+104]` = serial
In both cases, the maximum size in characters of the string passed to the
function is the size we established in the constraints above, plus one (so 17
for the username, 33 for the serial). These are both exactly half the buffer
sizes in bytes initialised earlier, because both strings are WCHAR strings, and
so each character uses two bytes. With the username and serial pulled from the
UI, the address of the serial is pushed as the sole argument to `CharUpperW()`,
which really calls the hooked function at 0x4017BE.
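The relationship between the character limits and the buffer sizes can be checked with a little arithmetic. The sketch below uses `uint16_t` as a stand-in for the Win32 `WCHAR` type, and the names `wchar16` and `wide_buffer_bytes` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the two-byte Win32 WCHAR type (an assumption made so the
 * sketch compiles on any platform). */
typedef uint16_t wchar16;

/* Bytes needed to hold n_chars characters plus a terminating null. */
size_t wide_buffer_bytes(size_t n_chars)
{
    return (n_chars + 1) * sizeof(wchar16);
}
```

For the 16-character username this gives 34 bytes, and for the 32-character serial it gives 66, matching the two `memset()` sizes.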
sub_4017BE
Immediately this function checks the global byte at 0x403020. If it is zero, it
takes a jump right to the end of the function, which uses the stub created when
the hook was made to call the real `CharUpperW()` and then exit. Effectively,
the global byte is a switch to turn the additional behaviour of this function
on or off. In this case, however, we know the byte is 1 as it was just set
earlier, so the jump is not taken.
Instead, the register `esi` is loaded with the base address of our string,
and the 16-bit value at `[esi]` is compared to zero. Later in the code, `esi`
is incremented by two, so we know this is a loop that runs over the entire
input string, one WCHAR at a time, until it reaches the null terminator.
While looping through the string, each character is passed through a switch statement that handles values in the range 48 to 57, which are the character codes for the digits 0-9. Each of these cases modifies the value slightly by adding or subtracting a number. This means each digit in the input string - in this case our serial - is swapped for another according to the following rules:
Input | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
Output | 5 | 7 | 4 | 1 | 0 | 9 | 8 | 3 | 2 | 6 |
Once this loop exits, the end of the function is reached - the string is passed
to the real `CharUpperW()` and returned. We therefore know that our input
serial will now only consist of uppercase letters, and digits that have been
swapped around.
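The net effect of the hooked call can be sketched in C as follows. This is a model rather than a decompilation: the lookup table replaces the add/subtract switch cases, the uppercasing only handles ASCII letters (the real `CharUpperW()` covers more), and the names `DIGIT_SWAP`, `transform_serial` and `digit_to_type` are hypothetical. The inverse lookup is included because a keygen would need to know which digit to type to get a particular digit out.

```c
#include <stddef.h>
#include <stdint.h>

/* Output digit for each input digit 0..9, taken from the table above. */
static const uint8_t DIGIT_SWAP[10] = {5, 7, 4, 1, 0, 9, 8, 3, 2, 6};

/* Model of the hooked call on a null-terminated 16-bit string:
 * digits are swapped, ASCII lowercase letters are uppercased. */
void transform_serial(uint16_t *s)
{
    for (; *s != 0; s++) {
        if (*s >= '0' && *s <= '9')
            *s = (uint16_t)('0' + DIGIT_SWAP[*s - '0']);
        else if (*s >= 'a' && *s <= 'z')
            *s = (uint16_t)(*s - 'a' + 'A'); /* ASCII-only stand-in */
    }
}

/* Inverse of the swap: the digit to type so that `wanted` comes out.
 * Non-digit characters are returned unchanged. */
uint16_t digit_to_type(uint16_t wanted)
{
    for (int d = 0; d < 10; d++)
        if ((uint16_t)('0' + DIGIT_SWAP[d]) == wanted)
            return (uint16_t)('0' + d);
    return wanted;
}
```

Because the swap is a permutation of the ten digits, every output digit has exactly one input digit that produces it.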
loc_40193D
Returning from the hooked `CharUpperW()` then goes to a line that initialises
another stack variable to zero, then jumps to 0x40193D. The stack variable is
used as a loop counter for a loop that goes over the entire username, but to
spoil things a bit here, the loop doesn’t really do anything. It uses each
character value to calculate two integers. One is never touched again, and the
other is then used as a constraint in the next loop at 0x4019D4, which does
nothing. You can tell this using a disassembler and checking the cross
references to the stack address - there are no reads that happen to the
address later in the program. This kind of useless code is an obfuscation
technique designed to waste time while reverse engineering.
Skipping that code takes us to address 0x4019E5, which contains yet more useless code - it pushes eax and edx, then zeroes them and performs some arithmetic (that does nothing as the values are all zero), then pops the registers back again. At the end of this block, another stack variable is set to zero - another loop index - and a jump is taken to 0x401A03.
The beginning of this loop immediately checks if the loop variable on the stack
is greater than 32 - the size of the input serial. If it is less, the jump is
not taken. The loop variable is then tested against zero. If it is not zero,
another block executes; if it is, the block is skipped and execution jumps
straight to 0x401A4E. The small block that executes if the loop variable is not
zero uses the `idiv` instruction to get the remainder of the loop variable
divided by 7. If the remainder is zero, a jump is taken again to 0x401A4E.
Essentially then, we can interpret this bit of code as:
    for (int i = 0; i < 32; i++)
    {
        if ((i == 0) || (i % 7 == 0))
            // loc_401A4E
        else
            // loc_401A20
    }
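To see which serial positions actually take the first branch, the condition can be enumerated. The sketch below is illustrative only, and the function names `goes_to_401A4E` and `collect_401A4E_indices` are hypothetical.

```c
#include <stdbool.h>

/* Same condition as the pseudocode: zero, or a multiple of 7. */
bool goes_to_401A4E(int i)
{
    return (i == 0) || (i % 7 == 0);
}

/* Writes every matching index below 32 into out (which must have room
 * for at least 32 ints) and returns how many were written. */
int collect_401A4E_indices(int *out)
{
    int n = 0;
    for (int i = 0; i < 32; i++)
        if (goes_to_401A4E(i))
            out[n++] = i;
    return n;
}
```

Only five positions qualify (0, 7, 14, 21 and 28), so the remaining 27 iterations go through loc_401A20. The explicit `i == 0` test is redundant in C, since `0 % 7` is already zero, but it mirrors the separate zero check in the disassembly.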
We’ll consider the case when either of the jumps is taken (0x401A4E) first,
since it executes first in reality. This bit yet again starts with the exact
useless arithmetic code from before, at 0x4019E5. Perhaps it was a function
that has been inlined. Skipping that takes us to a point where the loop
variable is loaded into `ecx`, incremented twice, and then in a very
roundabout way divided by 33 and the remainder taken. If the remainder is
zero, a jump is taken to 0x401A86. If it isn’t, then the loop variable is
stored in `eax` and AND’d with a bitmask of 0x80000003, then followed by a
`jns` - “jump if not sign”. This means that if the loop variable is negative,
the sign bit will persist (since the most significant bit of the bitmask is
set), and the jump is not taken. The jump goes to 0x401A7E, which checks
whether `eax` is zero after the AND. This will be the case if the loop variable
is both not negative, and its lowest two bits are not set. If it is not zero, a
jump is taken to 0x401CEF, otherwise execution continues to 0x401A86, where our
previous jump could have gone. Returning to our code segment above, this can
then be written out as:
    for (int i = 0; i < 32; i++)
    {
        if ((i == 0) || (i % 7 == 0))
        {
            if (((i + 2) % 33 == 0) || ((i >= 0) && ((i & 3) == 0)))
                // loc_401A86
            else
                // loc_401CEF
        }
        else
            // loc_401A20
    }
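The bitmask test is easy to misread, so here is a small sketch of it that follows the prose description (zero only when the value is non-negative and its lowest two bits are clear). The function names `mask_is_zero` and `goes_to_401A86` are hypothetical, and the code is an interpretation of the description rather than a decompilation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models the AND with 0x80000003: the result is zero only when i is
 * non-negative and its lowest two bits are clear. */
bool mask_is_zero(int32_t i)
{
    return ((uint32_t)i & 0x80000003u) == 0;
}

/* Second-level test as described in the prose:
 * true leads to loc_401A86, false leads to loc_401CEF. */
bool goes_to_401A86(int32_t i)
{
    return ((i + 2) % 33 == 0) || mask_is_zero(i);
}
```

Under this reading, of the five indices that reach this test (0, 7, 14, 21 and 28), only 0 and 28 continue to loc_401A86, while 7, 14 and 21 go to loc_401CEF. The `(i + 2) % 33` condition would only hold for index 31, which never gets here because 31 is not a multiple of 7.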