keygenme_v7 (Part 1)
02 June, 2020
keygenme_v7 is a crackme imported from crackmes.de, with a difficulty of level 3 in C/C++. It is a Windows-specific challenge, and is the first I’ve looked at with a GUI.
This writeup is split into two parts. The first covers the peripheral areas of the program - the startup and callbacks that will take us to the keygen algorithm. The second will cover the keygen itself, which is complex enough to warrant its own post, and builds on concepts used leading up to it.
On starting up the program, we’re immediately met with two windows - a message box that constitutes the nag screen, and a window taking a username and serial. Entering some text into the two boxes and clicking “Register” doesn’t really seem to do anything for now.
Let's take a look at the disassembly, starting from the entry point (at 0x4028E1), which immediately calls a function - 0x4012DA.
sub_4012DA
Looking at this function in a disassembler, we see some constant strings of DLL and function names, as well as calls to GetModuleHandleW() and GetProcAddress(). This is a common technique for loading functions from DLLs at runtime, and is likely used here as an obfuscation technique. The main takeaways from this that should be marked in our disassembler are:
- dword_403008 = wcsncmp()
- dword_403000 = memset()
- dword_403004 = wcsncpy()
- dword_40300C = memcpy()
- dword_403010 = InitCommonControls()
After loading all of these, the function then calls GetCurrentProcess(), then OpenProcessToken() to get its own access token. We then see the constant string “SeDebugPrivilege”, followed by a call to LookupPrivilegeValueW(), and some shuffling of a TOKEN_PRIVILEGES struct before then calling AdjustTokenPrivileges(). Looking at the structure and function call, we see:
- The TokenHandle parameter is the token for this process
- The DisableAllPrivileges parameter is FALSE, so we may be enabling or disabling based on the passed struct
- The NewState parameter is our TOKEN_PRIVILEGES struct
  - This struct has 1 privilege in it
  - The LUID of that privilege is the one we looked up with the SeDebugPrivilege string
  - The attributes value attached to this LUID is 2, which if we look it up in winnt.h is the SE_PRIVILEGE_ENABLED constant.
Therefore, this part is enabling the SeDebugPrivilege privilege of the process. The authors of The Art of Memory Forensics: Detecting Malware and Threats in Windows, Linux and Mac Memory say:
This grants the ability to read from or write to another process’ private memory space. It allows malware to bypass the security boundaries that typically isolate processes. Practically all malware that performs code injection from user mode relies on enabling this privilege.
We can therefore expect that the program will need to read or write to the memory of another process later down the line.
That’s it for this function, and control returns to the entry point function, _start.
sub_4028E1
Immediately after the previous call, the entry function calls the address that stores InitCommonControls(), which was loaded earlier.
Once it comes back from that function, the rest of this entry function is dedicated to setting up a WNDCLASSEXW structure to be passed to RegisterClassExW(). The main point we care about here is the lpfnWndProc member, which holds a WNDPROC: an “application-defined function that processes messages sent to a window” (MSDN).
We will come back to this function later.
If the call to RegisterClassExW() is successful, then we get the following strange sequence:
This is the entry function, so why does it end with a ret? Where would it be going? The answer lies in the pushes beforehand. Since ret pops the address on top of the stack and sets the program counter to it, this is a long-winded way of saying call 004013b0. As we will see later, the other pushes give us both the argument(s) for this function call, and the future control flow for when that call ends. Let's draw out the stack.
Before ret:
PC = 004029F5
+----------+ Top
| 004013b0 |
|----------|
| 004029f6 |
|----------|
| 761019A0 | // CreateWindowExW
|----------|
| 00402551 |
+----------+ Bottom
After ret:
PC = 004013b0
+----------+ Top
| 004029f6 |
|----------|
| 761019A0 | // CreateWindowExW
|----------|
| 00402551 |
+----------+ Bottom
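To convince ourselves that the ret really behaves like a call here, we can model the stack in a few lines of C. This is only a toy sketch: the struct cpu type and the push(), ret() and run_entry_tail() helpers are made up for illustration, and the push order is inferred from the diagrams above rather than copied from the disassembly.

```c
#include <stdint.h>
#include <string.h>

#define STACK_SLOTS 16

/* Toy model of the two pieces of CPU state we care about. */
struct cpu {
    uint32_t stack[STACK_SLOTS];
    int      sp;   /* index of the current top of stack; grows towards 0 */
    uint32_t pc;
};

void push(struct cpu *c, uint32_t value)
{
    c->stack[--c->sp] = value;
}

/* ret: pop the top of the stack into the program counter. */
void ret(struct cpu *c)
{
    c->pc = c->stack[c->sp++];
}

/* Replays the end of the entry function as shown in the diagrams. */
struct cpu run_entry_tail(void)
{
    struct cpu c;
    memset(&c, 0, sizeof c);
    c.sp = STACK_SLOTS;

    push(&c, 0x00402551u);  /* bottom of the diagram */
    push(&c, 0x761019A0u);  /* CreateWindowExW */
    push(&c, 0x004029F6u);  /* address just past the ret */
    push(&c, 0x004013B0u);  /* where we actually want to go */
    ret(&c);
    return c;
}
```

After run_entry_tail(), pc holds 0x004013B0 and the top of the stack is 0x004029F6, which is the same state a call 004013b0 placed just before 0x4029f6 would have produced, with the two remaining values sitting where arguments would be.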
sub_4013B0
As seen in the stack diagram, control flow now continues to 0x4013b0. The point of the push-then-ret strategy above over a simple call is likely obfuscation, and because of that we cannot be sure how many arguments this function takes - some or all of the addresses pushed before 4013b0 might be addresses at which to continue execution, while some might be arguments to this function. We will have to analyse how the stack is used in this function to determine which is which. In addition to the standard function prolog, this function also immediately pushes a lot of registers. At 0x4013BC, our stack will now look like:
PC = 004013BC
+----------+ Top
| 00000008 |
|----------|
| 0000000A |
|----------|
| edi |
|----------|
| esi |
|----------|
| ebx |
|----------|
| ecx |
|----------|
| ecx |
|----------|
| ebp |
|----------|
| 004029f6 |
|----------|
| 761019A0 | // CreateWindowExW
|----------|
| 00402551 |
+----------+ Bottom
This could be saving the state of the registers to restore them at the end of the function; however, the double push of ecx stands out as strange. Next up is a call to GetProcessHeap(), which takes no arguments, and then the returned HANDLE is pushed and HeapAlloc() is called. This function takes three arguments, the first being a handle to the heap we just obtained, and the next two are our top two constant values on the stack. Therefore, the dwFlags argument is 8 (HEAP_ZERO_MEMORY), and dwBytes is 10. Those two values are popped from our stack, meaning the top is now at the pushed value of edi. This function call has allocated 10 zeroed bytes of memory on the process heap.
The address in the heap returned is then passed to VirtualProtect(), along with a dwSize parameter of 10 and an flNewProtect value of 0x40 (PAGE_EXECUTE_READWRITE), marking our allocated memory as executable.
The function is then called again with almost the same parameters, except this time dwSize is 5, and the address being protected comes from the argument [ebp+8]. ebp points to the base of the current stack frame, which is set at the beginning of the function, and is therefore marked on this diagram:
PC = 004013DC
+----------+ Top
| edi |
|----------|
| esi |
|----------|
| ebx |
|----------|
| ecx |
|----------|
| ecx |
|----------|
| ebp | <-- ebp points here
|----------|
| 004029f6 | <-- ebp+4 (return address)
|----------|
| 761019A0 | <-- ebp+8 (arg 1, CreateWindowExW)
|----------|
| 00402551 | <-- ebp+C (arg 2)
+----------+ Bottom
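Therefore, [ebp+8] is the address of the start of the CreateWindowExW() function, and it is the first five bytes of that function that are being marked. Since that code is already executable, the practical effect of PAGE_EXECUTE_READWRITE here is to make it writable. The five bytes being marked disassemble to a standard function prolog.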
This could of course be slightly different depending on your Windows version.
These 5 bytes are then copied to the start of our 10-byte buffer that was allocated at the start of the function.
The memset() function is then used to set the first byte of the external function (in this case the address 0x761019A0, which is CreateWindowExW()) to 0xE9. This is the opcode for the x86 relative JMP instruction, which takes a 32-bit offset, measured from the end of the instruction, to jump to.
The function next pulls [ebp+C], which we can see from our diagram is 0x402551. This is stored in memory, and then the distance from the start of the CreateWindowExW() function to that address is calculated and stored in memory as well. memcpy() is called with that as the source, a size of 4 bytes and the destination as our CreateWindowExW() function plus one (0x761019A1). Now, the first 5 bytes of CreateWindowExW() disassemble to:
The function has rewritten what CreateWindowExW() does to instead jump to a function in the program.
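The patch itself is easy to reproduce. The sketch below encodes a 5-byte jmp rel32, assuming (as is standard for this encoding) that the displacement is measured from the end of the instruction rather than its start. The helper names are my own, and 0x761019A0 is just where CreateWindowExW() was loaded in this session.

```c
#include <stdint.h>

#define JMP_REL32_OPCODE 0xE9
#define JMP_REL32_SIZE   5

/* Encode "jmp to" as it would appear at address `from`. */
void encode_jmp_rel32(uint8_t out[JMP_REL32_SIZE], uint32_t from, uint32_t to)
{
    /* Displacement is relative to the address of the next instruction. */
    uint32_t rel = to - (from + JMP_REL32_SIZE);

    out[0] = JMP_REL32_OPCODE;
    out[1] = (uint8_t)rel;            /* little-endian, low byte first */
    out[2] = (uint8_t)(rel >> 8);
    out[3] = (uint8_t)(rel >> 16);
    out[4] = (uint8_t)(rel >> 24);
}

/* Recover the jump target from an encoded instruction located at `from`. */
uint32_t decode_jmp_rel32(const uint8_t in[JMP_REL32_SIZE], uint32_t from)
{
    uint32_t rel = (uint32_t)in[1]
                 | ((uint32_t)in[2] << 8)
                 | ((uint32_t)in[3] << 16)
                 | ((uint32_t)in[4] << 24);
    return from + JMP_REL32_SIZE + rel;
}
```

For the addresses in this writeup, encode_jmp_rel32(buf, 0x761019A0, 0x00402551) yields the bytes E9 AC 0B 30 8A, and decoding them at the same address gives 0x00402551 back.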
A similar procedure is performed for the last 5 bytes of the buffer that was allocated at the beginning - the relative distance from the buffer to the CreateWindowExW() function (plus five to avoid the new jmp) is calculated, and a relative jmp instruction is written. The 10-byte buffer now disassembles to:
This function therefore takes two arguments - a function to hook and a function to hook that one with. It stores the first 5 bytes of the hooked function in a buffer, then overwrites those first 5 bytes of the function with a relative jump to the hooking function. The remainder of the 10-byte buffer is a relative jump to the rest of the hooked function. This means calling the hooked function directly instead calls the hooking function, and the hooked function is instead accessed through the newly allocated stub code.
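The whole arrangement can be modelled without touching a real process. The following sketch works on plain byte arrays standing in for the hooked function and the 10-byte stub. install_hook() and write_jmp() are hypothetical names, the addresses are passed in separately because the arrays do not live at the real addresses, and the order of the writes is simplified compared to the binary.

```c
#include <stdint.h>
#include <string.h>

enum { PATCH_LEN = 5, STUB_LEN = 10 };

/* Write "jmp target" into `at`, which is assumed to live at address at_addr. */
static void write_jmp(uint8_t *at, uint32_t at_addr, uint32_t target)
{
    uint32_t rel = target - (at_addr + PATCH_LEN);

    at[0] = 0xE9;
    at[1] = (uint8_t)rel;
    at[2] = (uint8_t)(rel >> 8);
    at[3] = (uint8_t)(rel >> 16);
    at[4] = (uint8_t)(rel >> 24);
}

/*
 * hooked / hooked_addr: bytes and address of the function being hooked
 * stub / stub_addr:     the 10-byte buffer and its address
 * hook_addr:            address of the replacement function
 */
void install_hook(uint8_t *hooked, uint32_t hooked_addr,
                  uint8_t stub[STUB_LEN], uint32_t stub_addr,
                  uint32_t hook_addr)
{
    /* 1. Save the bytes that are about to be overwritten. */
    memcpy(stub, hooked, PATCH_LEN);

    /* 2. Second half of the stub jumps past the patch, into the original. */
    write_jmp(stub + PATCH_LEN, stub_addr + PATCH_LEN, hooked_addr + PATCH_LEN);

    /* 3. Redirect the hooked function to the hook. */
    write_jmp(hooked, hooked_addr, hook_addr);
}
```

Calling the stub therefore runs the saved five bytes and then carries on in the original function, while any direct caller of the hooked function lands in the hook instead.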
The address of the stub code is placed in eax as the return value.
At the end of our function, esp is incremented and then the stack is popped to give us a stack that looks like this:
PC = 0040144E
+----------+
| 004029f6 |
|----------|
| 761019A0 | // hooked CreateWindowExW
|----------|
| 00402551 |
+----------+ Bottom
And therefore control flow returns to 0x4029f6.
loc_4029f6
This is a small piece of code that saves the newly allocated stub that calls into the real CreateWindowExW() to the constant address 0x4029FC, and then calls the hooking function again. This time, the CharUpperW() function is hooked with the function at address 0x4017be.
The control flow then returns to 0x402A17.
loc_402A17
Like before, the stub that calls into the real CharUpperW() is saved to a constant location, at 0x402A1D.
The function then starts to load 2-byte values into a space on the stack. Each value is pushed, then immediately popped into eax and saved to the stack buffer. This is likely done as an obfuscation technique. The last value stored is 0, which is easy to miss in a disassembly because it breaks the pattern and just uses xor eax, eax before putting eax into the stack space. The values 0x13 and 0x72014 are then pushed, the address of our stack space is saved to ecx, and the function at 0x402B3C is called. ecx is the register that carries the first argument under __fastcall on Windows x86, so the stack buffer is our first parameter to the function. Looking beyond this call, we see the buffer is then passed as the lpWindowName argument to CreateWindowExW() (which actually calls our hook). It is therefore safe to assume that 0x402B3C is probably a string deobfuscation function.
sub_402b3c
This function uses edi as a loop counter to iterate from 0 to the third argument (0x13 in our case from above). For each element, it adds the 2nd argument to the loop counter, and uses idiv to get the remainder of the total divided by 255 (idiv places the remainder in edx). This remainder is then XOR’d with the character for the given element. We can quite easily turn this back into a C example with this information. Remember the string buffer is made of 2-byte elements, and since we are on Windows this means it is a WCHAR array. We can also see from the assembly that the buffer is written back to, so it is an INOUT argument, rather than the function returning anything.
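A minimal reconstruction of that routine might look like the following. The function and parameter names are my own guesses, uint16_t stands in for WCHAR so that the sketch compiles off Windows, and the signed arithmetic mirrors the use of idiv.

```c
#include <stdint.h>

/*
 * Hypothetical reconstruction of sub_402b3c.
 *
 * buf: 2-byte characters, modified in place (the INOUT argument)
 * key: the second argument, 0x72014 for the window title
 * len: number of characters to process, 0x13 for the window title
 */
void xor_string(uint16_t *buf, int key, int len)
{
    for (int i = 0; i < len; i++) {
        /* Remainder of (key + index) / 255, as left in edx by idiv. */
        int k = (key + i) % 255;
        buf[i] ^= (uint16_t)k;
    }
}
```

Because XOR is its own inverse, running the function twice with the same key and length returns the original buffer, so the same routine can be used to produce obfuscated strings as well as to decode them.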
If we give the sequence of bytes built up from the code starting at 0x402A22, as well as the right size of 0x13 and the key of 0x72014, then we get the string “KeygenMe V7 - Trial” back, which we see from running the program is the window title.
As mentioned, when this function returns from being called at 0x402AB7, its output is then used as input to the hook on CreateWindowExW(), which is called next.
sub_402551 (nWidth != 120)
This function starts by checking the nWidth parameter against 120. When it is called, that is not the case, so it takes the failure branch, which simply pushes all the arguments back and calls the CreateWindowExW() stub, calling the real function. This makes the main window and returns from the function.
loc_402A80
Returning from the call to the real CreateWindowExW(), the return value is stored and then passed to the UpdateWindow() function.
The rest of this function is an infinite loop of:
- GetMessageW()
- TranslateMessage()
- DispatchMessageW()
This loop waits for and acts on window messages for the rest of the program’s runtime. The function responsible for handling those messages is the one at 0x402121, which was passed as lpfnWndProc to RegisterClassExW() way back at the beginning of the entry function.
sub_402121
Since this function must conform to the WNDPROC callback structure, we know the arguments beforehand. This function is a large switch statement that handles each event type it’s interested in. I won’t look at all of them, as some of them do what you’d expect with no surprises.
The WM_CREATE event (the first branch, handled at 0x40235A) is notable because it includes a call to the hooked CreateWindowExW() with an nWidth of 120, triggering the alternate behaviour.
sub_402551 (nWidth == 120)
When the hooked CreateWindowExW() is called with nWidth as 120, the branch at the beginning is not taken. The majority of the resultant code is taken up by initialising and deobfuscating two strings. The first says “Nag Screen”, the second we can assume is the text of the nag screen message.
Before that, a block of memory (168 bytes) pointed to on the stack is initialised to zero, and the first four bytes of the block are set to the address of the MessageBoxW() function. The remaining bytes are filled with the deobfuscated strings, with a four-byte gap between the function address and the strings.
Two addresses are then pushed:
- 0x4028B1 - the other side of the jump in this function, which calls the real CreateWindowExW() function
- 0x401469
ret is then executed, returning control flow to 0x401469.
sub_401469
This function first calculates the difference between the address of the function at 0x401451 and that of a nullsub immediately after it, at 0x401468. This is calculating the size of the function, which is 23 bytes.
The function then calls OpenProcess(), opening itself with 0x43A as the dwDesiredAccess parameter, which decodes to:
- PROCESS_CREATE_THREAD
- PROCESS_VM_OPERATION
- PROCESS_VM_READ
- PROCESS_VM_WRITE
- PROCESS_QUERY_INFORMATION
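The decoding can be checked mechanically. The constants below are the values from winnt.h, redefined locally so the sketch builds without the Windows headers; decode_access() is a hypothetical helper, not something found in the binary.

```c
#include <stddef.h>
#include <stdint.h>

/* Process access rights, values as defined in winnt.h. */
#define PROCESS_CREATE_THREAD     0x0002u
#define PROCESS_VM_OPERATION      0x0008u
#define PROCESS_VM_READ           0x0010u
#define PROCESS_VM_WRITE          0x0020u
#define PROCESS_QUERY_INFORMATION 0x0400u

struct access_flag {
    uint32_t    bit;
    const char *name;
};

static const struct access_flag known_flags[] = {
    { PROCESS_CREATE_THREAD,     "PROCESS_CREATE_THREAD" },
    { PROCESS_VM_OPERATION,      "PROCESS_VM_OPERATION" },
    { PROCESS_VM_READ,           "PROCESS_VM_READ" },
    { PROCESS_VM_WRITE,          "PROCESS_VM_WRITE" },
    { PROCESS_QUERY_INFORMATION, "PROCESS_QUERY_INFORMATION" },
};

/*
 * Stores the names of the known flags set in mask into names[] (at most
 * cap entries) and returns how many were stored. Any bits that were not
 * reported are left in *leftover.
 */
size_t decode_access(uint32_t mask, const char *names[], size_t cap,
                     uint32_t *leftover)
{
    size_t n = 0;

    for (size_t i = 0; i < sizeof known_flags / sizeof known_flags[0]; i++) {
        if ((mask & known_flags[i].bit) && n < cap) {
            names[n++] = known_flags[i].name;
            mask &= ~known_flags[i].bit;
        }
    }
    *leftover = mask;
    return n;
}
```

Feeding it 0x43A reports exactly the five flags listed above and leaves no bits unaccounted for.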
The function then uses VirtualAllocEx() to allocate two pieces of memory with protection PAGE_EXECUTE_READWRITE. The first is the size of the value calculated earlier (23), the latter is 168 bytes. The addresses are determined at runtime by passing NULL as the lpAddress parameter. The addresses are stored in ebx and edi respectively.
The two buffers are then written to with WriteProcessMemory(). The first has the entirety of the function at 0x401451 written to it. The second is filled from [ebp+8], which is the block of memory set up in the previous function, containing a function address, a gap of four null bytes, and then the two decoded strings.
CreateRemoteThread() is then called, with the first buffer as the start address, and the second as the parameter. The function then sleeps for 200 milliseconds.
From what has been seen here, this entire function call could effectively be replaced with a call to 0x401451, passing the block of memory as the parameter. Perhaps this is an obfuscation technique.
The function returns to the next address that was pushed. We have seen this before: it is the other arm of the branch in the function at 0x402551, which just calls into CreateWindowExW(). From there the control flow goes back to the WM_CREATE handler of the window event handler function, which is otherwise uninteresting. We will return to other branches of that function in a second, but first we need to look at the brief function called in a new thread here - 0x401451.
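Putting both branches together, the hook at 0x402551 has roughly the following shape. This is a portable sketch rather than the real thing: the window creation call is reduced to a single nWidth argument, real_create_window stands in for the stub saved at 0x4029FC, show_nag_screen() stands in for the detour through 0x401469, and fake_create_window() exists only so the sketch can be exercised.

```c
#include <stdint.h>

typedef uintptr_t (*create_window_fn)(int nWidth);

/* Stand-in for the saved stub that reaches the real CreateWindowExW(). */
create_window_fn real_create_window;

int nag_screens_shown;  /* how often the nag path was taken */
int real_calls;         /* how often the "real" function was reached */

/* Placeholder for the real API so the sketch runs anywhere. */
uintptr_t fake_create_window(int nWidth)
{
    (void)nWidth;
    real_calls++;
    return 0x1234;  /* arbitrary fake window handle */
}

/* Placeholder for the thread that ends up calling MessageBoxW(). */
void show_nag_screen(void)
{
    nag_screens_shown++;
}

/* The hook: a magic width triggers the nag screen, everything is forwarded. */
uintptr_t hooked_create_window(int nWidth)
{
    if (nWidth == 120)
        show_nag_screen();
    return real_create_window(nWidth);
}
```

The magic nWidth value effectively acts as a hidden parameter: an ordinary-looking call to create a child control is what brings up the nag screen.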
sub_401451
This is a small function that parses the 168-byte block given. We can see it pulls the first 4 bytes - the function address to call - to ecx. It then pulls [ecx+8], then [ecx+1E], then [ecx+4] and pushes each, as well as the constant 0x30. We can therefore assume this data contains three arguments starting at those offsets. We could represent this data as:
// sizeof=0xA8
struct function_parameter_pack {
    FARPROC func;        // +00
    DWORD   arg0;        // +04
    UINT8   arg1[0x16];  // +08
    UINT8   arg2[0x8A];  // +1E
};
And this function’s definition would then be:
void sub_401451(struct function_parameter_pack *f)
{
    f->func(f->arg0, f->arg2, f->arg1, 0x30);
}
Recall that f->func in this case is MessageBoxW(). This function signature exactly matches that, so perhaps this function only exists to obfuscate calls to MessageBoxW(). This call therefore translates to:
- hWnd = NULL (the message box has no owner window)
- lpText = The second, longer deobfuscated string (the nag screen message)
- lpCaption = The first deobfuscated string (the string “Nag Screen”)
- uType = MB_ICONEXCLAMATION
And sure enough, this call gives us the nag screen window.
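As a sanity check on the layout, the offsets can be verified at compile time. The sketch below mirrors the structure with fixed-width fields (a 32-bit integer replaces FARPROC so the sizes hold on a 64-bit compiler); the _pack32 suffix and the two capacity constants are my own additions.

```c
#include <stddef.h>
#include <stdint.h>

/* 32-bit mirror of the parameter block passed to sub_401451. */
struct function_parameter_pack32 {
    uint32_t func;        /* +00: address of the function to call */
    uint32_t arg0;        /* +04: hWnd, left as zero */
    uint8_t  arg1[0x16];  /* +08: caption, as UTF-16 */
    uint8_t  arg2[0x8A];  /* +1E: message text, as UTF-16 */
};

/* How many 2-byte characters fit in each string field. */
enum {
    CAPTION_WCHARS = 0x16 / 2,
    TEXT_WCHARS    = 0x8A / 2
};

_Static_assert(offsetof(struct function_parameter_pack32, arg0) == 0x04,
               "arg0 should sit at +04");
_Static_assert(offsetof(struct function_parameter_pack32, arg1) == 0x08,
               "arg1 should sit at +08");
_Static_assert(offsetof(struct function_parameter_pack32, arg2) == 0x1E,
               "arg2 should sit at +1E");
_Static_assert(sizeof(struct function_parameter_pack32) == 0xA8,
               "the block should be 168 bytes");
```

The numbers line up nicely: arg1 has room for 11 two-byte characters, which is exactly “Nag Screen” plus its terminator, and the total of 0xA8 matches the 168 bytes allocated with VirtualAllocEx().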
Conclusion
Having looked at this function, only 2 remain unexplored - the hook for the CharUpperW() function at 0x4017BE, and the large function at 0x401856. We also have some unexplored branches of the window event handling function. These will all be looked at in part 2.
This initial look at the binary has revealed many obfuscation techniques, such as XOR-encoded strings, runtime function loading, hooked Win32 functions and calling functions by allocating new executable memory for them. These are all designed to make things tricky for static analysis, and as we go into the meat of the keygen algorithm it will be important to keep what has happened in this part in mind.