
Wednesday, August 30, 2023

"Reg Restore" Odyssey: Journey to Persistence And Evasion

The Windows Registry, a fundamental component of the Windows Operating System, lets users fine-tune system policies and change low-level configuration settings. However, these capabilities have also made the registry a target for adversaries, who abuse this database to carry out malicious activity.

As the title suggests, this post describes how I found a way to use 'reg save' and 'reg restore' to establish persistence while evading detection.

In this blog post I will cover the following subjects:

  • How it started? A Brief Overview
    • Challenge
  • Sysmon Registry Event Evasion
  • Proof of Concept (POC) Development
    • "reg save" and "reg restore" mechanism
    • parsing the registry backup file
      • RegF Header
      • HiveBin and Cell Record
  • References


How it started? A Brief Overview

While researching registry persistence, I decided to create a backup of my test registry key so that I would not have to re-enter the same data repeatedly. I did this with the 'reg save' command, which saves a registry hive to a file. When I looked at the contents of the backup file, I noticed that it contained the registry keys, their value names, and the associated value data for the key I was working on. It occurred to me that by modifying this file and then restoring it, I could manipulate the registry using only these two features of the 'reg.exe' tool.

Figure 1 shows the modification I made to the saved registry hive, and Figure 2 shows the result after the restore.


Figure 1 - registry saved hive modification


Figure 2 : after reg restore

Challenge

From this test, we can conclude that 'reg save' and 'reg restore' can be used to establish persistence. The next step is to automate the modification of the saved registry hive so that a persistence entry can be created in the registry Run key. The challenge, then, is to work out how to parse the registry backup file.

Sysmon Registry Event Evasion


Before addressing the challenge presented in the initial sub-heading, I conducted an examination of the Sysmon registry events that are generated when saving and restoring a registry backup file. Surprisingly, no events were captured during these processes. This situation prompted me to investigate how Sysmon monitors registry events.

Sysmon (System Monitor) is a Windows system service and device driver that monitors system activity and logs it to the Windows event log.

The device driver component of Sysmon is located in the .RSRC section of its file, so I extracted it and loaded it into IDAPRO for analysis. The analysis showed that the Sysmon driver calls 'CmRegisterCallback()' to register the callback routine it uses to monitor registry events.

figure 3 : Registry callback function

Examining the callback, which I renamed to "func_ExCallbackFunction()", shows how Sysmon inspects registry operations by their REG_NOTIFY_CLASS type. In the most recent version of the tool, the Sysmon driver filters and monitors the following REG_NOTIFY_CLASS values:
  • RegNtDeleteKey
  • RegNtRenameKey
  • RegNtPostCreateKey
  • RegNtPostDeleteKey 
  • RegNtPostSetValueKey 
figure 4 : REG_NOTIFY_CLASS filtering

In this particular scenario, we can deduce the reason behind Sysmon's inability to capture registry operations like "reg save" and "reg restore." This stems from the fact that the Sysmon driver is not actively monitoring and processing the REG_NOTIFY_CLASS associated with registry restoration and saving.

Now Let's develop our POC... :)

Proof of Concept (POC) Development

Before getting to the use case of my proof of concept (POC), I first made sure I understood the inner workings of two essential components:

"reg save" and "reg restore" Mechanism: My primary focus was to grasp the underlying mechanisms of both "reg save" and "reg restore." This approach allowed me to emulate their functionalities within my code autonomously, eliminating the need for external invocations.

Upon inspecting the code of reg.exe, a notable revelation emerged: the initial step involves the adjustment of the process token's privilege. Initially, the "SeBackupPrivilege" is granted, enabling the preservation of the registry hive through the utilization of the RegSaveKeyExW() API. Subsequently, the focus shifts to the "SeRestorePrivilege" which is activated to facilitate the restoration of the registry backup file, accomplished by invoking the RegRestoreKeyW() API.

Parsing the Registry Backup File: Equally important was the task of developing a method to effectively parse the registry backup file. By knowing its structure, I could ensure seamless extraction and utilization of critical information contained within the file.

In terms of parsing the registry backup file, I didn't find any full documentation from Microsoft to help me parse this file type.  However, fortune favored me as numerous research efforts dedicated to deciphering this file type have been diligently conducted. These valuable insights have been compiled and are readily accessible within the reference sub-heading of this blog.

While an exhaustive exploration of each header within the registry backup structure isn't within the scope of my discussion, I'm inclined to shed light on the pivotal segments that I leveraged for parsing and subsequently modifying. These particular insights hold significance, as they played a crucial role in enabling both persistence and defense evasion strategies within the context of the registry backup.

RegF FileHeader

Also known as the base block, this file header is 4096 bytes long and contains information such as the "regf" signature, major/minor version, file type, file format and root cell offset, all stored in little-endian form.


figure 5 : regf file header
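As a rough illustration of reading the base block, here is a minimal Python sketch. It is not taken from the POC; the field offsets (0x14 onward) come from publicly available descriptions of the regf format rather than from Microsoft documentation, and the function name and dictionary keys are made up for this example.

```python
import struct

REGF_HEADER_SIZE = 4096  # size of the base block, per the text above


def parse_regf_header(data: bytes) -> dict:
    """Parse a few base block fields of a registry hive file (sketch)."""
    if len(data) < REGF_HEADER_SIZE:
        raise ValueError("file is smaller than the 4096-byte base block")
    if data[0:4] != b"regf":
        raise ValueError("missing 'regf' signature")
    # Assumed layout (little endian): six consecutive 32-bit fields at 0x14.
    (major, minor, file_type, file_format,
     root_cell_offset, hive_bins_size) = struct.unpack_from("<6I", data, 0x14)
    return {
        "major_version": major,
        "minor_version": minor,
        "file_type": file_type,
        "file_format": file_format,
        # The root cell offset is relative to the end of the base block.
        "root_cell_offset": root_cell_offset,
        "root_cell_file_offset": REGF_HEADER_SIZE + root_cell_offset,
        "hive_bins_size": hive_bins_size,
    }
```

A caller would pass the full contents of the saved hive file, for example `parse_regf_header(open("save_reg.hive", "rb").read())`.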


HiveBin and Cell

The hive bin starts with an "hbin" header and consists of cells. The hive bin header is 32 bytes long.
After the hive bin header come the cells. A cell is variable in size, depending on the cell data record type listed below:


figure 6: cell record type
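The hive bin and cell layout can be walked with a few lines of Python. This is a minimal sketch and not the POC code: it assumes the commonly described layout in which the bin size sits at offset 8 of the "hbin" header, each cell starts with a signed 32-bit size (negative when the cell is allocated), and the record type such as "nk" or "vk" is the first two bytes of the cell data. The function name is hypothetical.

```python
import struct

HBIN_HEADER_SIZE = 32  # hive bin header length, per the text above


def iter_cells(hbin: bytes):
    """Yield (offset, allocated, signature, payload) for each cell in one hive bin."""
    if hbin[0:4] != b"hbin":
        raise ValueError("missing 'hbin' signature")
    # Assumption: the size of this hive bin is stored at offset 8 of its header.
    bin_size = struct.unpack_from("<I", hbin, 8)[0]
    end = min(bin_size, len(hbin))
    pos = HBIN_HEADER_SIZE
    while pos + 4 <= end:
        raw_size = struct.unpack_from("<i", hbin, pos)[0]
        size = abs(raw_size)  # the size includes the 4-byte size field itself
        if size < 4 or pos + size > end:
            break  # malformed cell, stop instead of guessing
        payload = hbin[pos + 4:pos + size]
        # A negative size marks an allocated cell; a positive one is free.
        yield pos, raw_size < 0, payload[:2], payload
        pos += size
```

Filtering the results on the signatures `b"nk"` and `b"vk"` gives the key node and key value records discussed next.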

For my POC I focused on the registry key node "nk", whose structure fields let me check whether a named registry key exists, how many registry values it has, and whether a sub key entry has already been deleted.

Furthermore, I harnessed the  Registry Key Value "vk" header to extract and interpret individual registry value data. This approach allowed me to precisely correlate each piece of registry value data with the specific registry key under examination, enhancing the accuracy and integrity of the parsing process.

Below is the code snippet of my POC "RegREeper.exe" that enumerates all registry value data in the registry hive file.

figure 7: registry value data enumeration


Now we are all set and ready to prepare our POC use case with the following approach:

  1. Adjust the token privilege to enable SeBackupPrivilege so that it can save the HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run registry key.
  2. Save the registry hive to "save_reg.hive".
  3. Parse the registry hive structure (save_reg.hive) to look for the registry value data string to be modified.
  4. Compute the length of that registry value data string during parsing, then use that length to generate a random file name.
  5. Drop a copy of itself in c:\users\public\{random_filename}.exe.
  6. Create a copy of save_reg.hive -> mod_save_reg.hive.
  7. Replace the current registry value data string of HKCU\\Software\\Microsoft\\Windows\\CurrentVersion\\Run with the file path of its file copy.
  8. Adjust the token privilege to enable SeRestorePrivilege.
  9. Trigger the restore via the RegRestoreKeyW() API.
 
And voilà! We have a working POC that gains persistence by modifying an entry in the registry Run key and that can evade Sysmon registry event monitoring.




The POC code:

short demo:


References:

Friday, August 6, 2021

There Is More Than Meets The Eye - Analyzing Obfuscated WSHRAT Script

Nowadays it is really common for malware to use crypters, packers, obfuscation and encryption to hide its code from analysts, evade AV detection, bypass emulation and so on. Aside from compiled binaries, another interesting kind of file that is easy to obfuscate is script or compiled code written in languages such as JScript, VBScript, AutoIt, PowerShell, Python, Golang and many more.

I know there are many ways to de-obfuscate or analyze a malicious script: using the browser as a debugger, using AMSI, running wscript.exe with the "//X" switch to debug the target script file, or doing a deep-dive analysis with existing tools or your own scripts.

Today I will share some tips on how I analyzed the obfuscation technique of the WSHRAT JS script with the help of a Python script, to make the code clearer and to see how it keeps several payload files in its code body.


Let's Start!! :)

First Stager - The Initial loader

The first stage of the WSHRAT script shows an interesting obfuscation that it also uses in its second stage. As we can see in figure 1, it starts with a "squrr3L" function that de-obfuscates each initialized string in the code; the result is then executed using eval().

The "squrr3L" function replaces each "{n}" placeholder, where n is an index number, with the string passed to it as the parameter at that index. In figure 1, "Array.prototype.tp_l1nk" is a big string containing the placeholders {0}, {1} and {2}, which refer to the parameters "F", "P" and "x" passed to the function, so {0} = "F", {1} = "P" and {2} = "x".


Figure 1 - First Stager The Loader
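The placeholder substitution performed by "squrr3L" can be reproduced outside the script. The following is a minimal Python sketch of that idea, not the malware's own code; the function name `fill_placeholders` and the sample template are made up for illustration.

```python
import re


def fill_placeholders(template: str, *parts: str) -> str:
    """Replace every {n} in template with the n-th extra argument."""
    def substitute(match: re.Match) -> str:
        index = int(match.group(1))
        # Leave unknown indexes untouched instead of failing.
        return parts[index] if index < len(parts) else match.group(0)

    return re.sub(r"\{(\d+)\}", substitute, template)


# Hypothetical obfuscated string using the same {0}/{1}/{2} convention.
print(fill_placeholders("a{0}b{1}c{2}", "F", "P", "x"))
```

Once the placeholders are filled in, the resulting string can be passed to `base64.b64decode` to recover the next stage.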

Once we have replaced all the needed strings, especially in the variable "Array.prototype.tp_l1nk", which is the base64-encoded string of the next stager, we can decode it with our tool of choice.

Second Stager - The Payload Executioner

After decoding the base64-encoded string, the code looks like a real mess at first glance, but don't worry: as you figure out how it works little by little, it is not as hard as expected. At the start of the code you will notice right away a big array of hex string values named "dKEW_qS[]". This big chunk of hex-encoded strings is the gem of this script once it is converted to readable strings and put in the right place. Below is the code snippet of the hex string array.

figure 2 - hex encoded string array

As expected, the array "dKEW_qS" is used as a lookup table: the code indexes into it to fetch the hex-encoded string it needs to initialize, process or execute.

figure 3 - referencing the Hex Encoded String Array

Attacking The Problem - De-Obfuscation

Now that we know the problem and how the obfuscation works, we can start analyzing it, either by manually looking up the hex-encoded strings as we go through the code or, "if you're lazy like me", with a simple script.

So the goal of my script is the following:
1. Read each line of the WSHRAT script and check if it references the "dKEW_qS[]" array.
2. If yes, grab the index number, locate the hex-encoded string and then normalize it.
3. Copy that line into the output file as well as create a comment below the original code line.
4. If not, just copy the original line.
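The four steps above can be sketched in Python as follows. This is an illustrative reconstruction, not the script shown in the figures: it assumes the array entries are written as "\xNN" escape sequences, that indexes appear as decimal numbers, and that the table has already been extracted into a Python list. The names `normalize` and `annotate` are hypothetical.

```python
import re

HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")
ARRAY_REF = re.compile(r"dKEW_qS\[(\d+)\]")


def normalize(hex_string: str) -> str:
    """Turn a string of \\xNN escapes into readable text."""
    return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), hex_string)


def annotate(lines, table):
    """Copy each line; add a comment line after any line that indexes the table."""
    out = []
    for line in lines:
        out.append(line)  # steps 3 and 4: the original line is always copied
        decoded = []
        for ref in ARRAY_REF.findall(line):  # steps 1 and 2
            index = int(ref)
            if index < len(table):
                decoded.append(f"dKEW_qS[{index}] = {normalize(table[index])!r}")
        if decoded:
            out.append("// " + "; ".join(decoded))
    return out


# Hypothetical table entries and script lines.
table = [r"\x57\x53\x63\x72\x69\x70\x74", r"\x53\x68\x65\x6c\x6c"]
for row in annotate(["var a = dKEW_qS[0];", "var b = 1;"], table):
    print(row)
```

Writing the decoded strings as comments instead of rewriting the code in place keeps the original line available for comparison, which matches the before/after view described in the text.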


Below is an example of the console output of the script I created, showing how it replaces each array index with its normalized string.

figure 4 - console out of script

Below is the output file generated by the script, where it creates a comment for each line that references the array, with the normalized string counterpart next to it. Now it is easier to read, right? ;)

figure 5 - before and after the execution of the script

I know the script is not so fancy, but the help it gives for analysis is really worth it. With its output I understood some of the script's functions, like RDP(), keylogger() and reverseproxy(), better. I thought all of those functions would be implemented within the script itself, but nope. For each of those payloads it decodes a base64-encoded .NET executable file and runs it :).

figure 6 - getRDP function before and after the execution of the script

figure 7 - getKeyLogger function before and after the execution of the script

figure 8 - getReverseProxy function before and after the execution of the script

Some noteworthy behavior can also be seen easily now, after normalizing the code.

figure 9 - UAC bypass and defense evasion technique

Conclusion:

In this blog we learned that, if you have the time, it is really worth doing a deep dive into the techniques used by malware, especially obfuscation, because at the end of the day you will learn something new.

Hashes:


Monday, February 22, 2021

Gh0stRat Anti-Debugging : Nested SEH (try - catch) to Decrypt and Load its Payload

SEH tricks are not new anti-debugging tricks. Many malware families already use them to make manual debugging of their code time consuming and confusing. Today I will share how Gh0strat malware makes use of nested SEH (try{} catch{}) as an anti-debugging trick to hide its decryption routine.

This article does not tackle the full C++ exception internals; it shares how IDAPRO really helped me analyze this type of anti-debugging trick statically. :)

So lets start!!!

SEH:

Structured Exception Handling (SEH) is one of the classic anti-debugging tricks used by malware. It abuses the mechanism provided by the operating system for managing exceptional situations, such as a reference to a non-existent address or the execution of code in a read-only page.

This is usually done by locating the pointer to the SEH linked list on the stack, known as the SEH frame. The address of the current SEH frame is located at offset 0x00 relative to the FS segment (32-bit) or the GS segment (64-bit).

figure 1: FS[0] of x32 bit OS

figure 2: The EXCEPTION_REGISTRATION_RECORD in FS[0]

When an exception is triggered, control is transferred to the current SEH handler, which returns one of the _EXCEPTION_DISPOSITION values.

This Gh0strat variant uses nested SEH (try{} catch{}) as an anti-debugging trick to make debugging more confusing or, let's say, more time consuming if the analyst doesn't notice the SEH.

Gh0strat: Nested SEH to decrypt its payload:

The sample we will use here contains a big data section where the encrypted Gh0strat payload is located. We can see this with the DIE tool or PE-bear, which visualize the size of each section: the data section has quite high entropy, similar to the text section.

figure 3: high entropy of data section

We all know there are faster ways to bypass this anti-debugging technique, like dynamically monitoring TIB offset 0x0 for the next SEH or dumping the process. In this case I just want to share how IDA PRO helps a lot in traversing the "FuncInfo" structure, since IDAPRO resolves most of these SEH structures.

When you load the sample in a debugger and set a breakpoint on some API, you may encounter an exception error like the one shown in the figure below. This is also a good hint that it may use an SEH technique.

figure 4: exception during debugging

By further checking its code in IDAPRO, I noticed that it uses nested SEH: yes, nested try{} catch{} exception handlers to decrypt its payload. At the entry point of the malware code you will notice right away the first exception handler function registered at FS:0. The exception is triggered by a "call    __CxxThrowException" instruction.

figure 5: first SEH in malware entrypoint

The ehFuncInfo of the exception handler function registered at FS:0 contains structures that can help us figure out statically which exception handler function may be called when the exception is triggered.

I really recommend reading this great hexblog presentation on exceptions and RTTI:
https://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf

The ehFuncInfo is a structure that can lead you to the "AddressOfHandler", which is the address of the function that will handle the exception raised in the current thread.

IDAPRO really does a good job of giving you hints on how to traverse that structure down to the HandlerType structure member. The FuncInfo structure contains several members, so I will just focus on the ones that helped me decrypt the payload.

Below is a simple structure layout, starting from "FuncInfo", that may help you look for the AddressOfHandler field of the HandlerType structure.

figure 6: Traversing AddressOfhandler


Figure 6 shows that the FuncInfo structure contains a TryBlockMap field. This field points to another structure that contains the HandlerArray field, which in turn holds the AddressOfHandler field. To make use of this structure in our sample, let's traverse the first SEH at the malware entry point.

figure 7: Traversing the AddressOfHandler of SEH in malware entry point


We see that the address that may handle the exception is 0x40243d. This worked in a dynamic test I did with x64dbg, where I put a breakpoint on this address after the exception and then pressed shift+f9 to skip the exception.

figure 8: AddressOfHandler was triggered

If we follow the call to function 0x402200, which takes the address of the string "Shellex" as a parameter, we notice that it uses another SEH to execute a piece of its code. Unlike the first SEH, this one contains 9 try blocks and AddressOfHandler entries, as in the figure below.

figure 9: multiple try Block Map
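The FuncInfo to TryBlockMap to HandlerArray to AddressOfHandler traversal can also be sketched in plain Python, outside IDA. This is only an illustration under stated assumptions: it uses the 32-bit MSVC layouts as described in the Skochinsky presentation (FuncInfo starting with magicNumber, maxState, pUnwindMap, nTryBlocks, pTryBlockMap; 20-byte TryBlockMapEntry; 16-byte HandlerType), and it assumes `image` is a flat copy of the module as mapped in memory so that `va - image_base` is a valid index. The function name is hypothetical.

```python
import struct


def handler_addresses(image: bytes, image_base: int, func_info_va: int):
    """Collect every AddressOfHandler reachable from one x86 FuncInfo."""
    def read_dwords(va: int, count: int):
        return struct.unpack_from(f"<{count}I", image, va - image_base)

    # FuncInfo: magicNumber, maxState, pUnwindMap, nTryBlocks, pTryBlockMap, ...
    _magic, _max_state, _unwind, n_try_blocks, try_block_map = read_dwords(func_info_va, 5)
    handlers = []
    for i in range(n_try_blocks):
        # TryBlockMapEntry: tryLow, tryHigh, catchHigh, nCatches, pHandlerArray
        _low, _high, _catch_high, n_catches, handler_array = read_dwords(
            try_block_map + i * 20, 5)
        for j in range(n_catches):
            # HandlerType: adjectives, pType, dispCatchObj, addressOfHandler
            _adj, _ptype, _disp, address_of_handler = read_dwords(
                handler_array + j * 16, 4)
            handlers.append(address_of_handler)
    return handlers
```

For a function with several try blocks, the returned list has one entry per catch handler, which is the set of candidate addresses to set breakpoints on.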

Parsing ehFuncInfo structure Using Ida python:

In this case I decided to use IDAPython to parse all the FrameHandler ehFuncInfo structures, locate every AddressOfHandler field available across all tryBlockMap entries and add each one as a code reference comment in the IDB. This approach helped me figure out where the decryption routine is, and taught me multi-line comments in IDAPython :).

the script is available here:


figure 10A: before running the script

figure 10B: IDB after running the script

Now with these comments we can check every possible AddressOfHandler in each tryBlockMap entry to locate the decryption routine. As in the figure above, the first AddressOfHandler is a function that expects the decryption key, the size of the encrypted payload and the address of the encrypted payload.

figure 11: decryption routine


Once you decrypt the payload using this simple xor decryption routine, you can see right away some noteworthy Gh0strat strings related to keylogging, creating services, regrun, downloading files, backdoor functionality and so on.

figure 12: strings upon decryption
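A routine that takes a key, a payload size and a payload address and applies xor can be sketched as below. This is an assumption-laden illustration rather than a reimplementation of the sample: the text only says the routine is a simple xor, so the repeating-key indexing used here (`key[i % len(key)]`) and the function name are guesses that should be checked against the disassembly in figure 11.

```python
def xor_decrypt(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key (assumed key schedule)."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# XOR is its own inverse, so the same call encrypts and decrypts.
sample = xor_decrypt(b"DOS mode", b"\x13\x37")
print(xor_decrypt(sample, b"\x13\x37"))
```

If the key indexing turns out to be different, only the expression that selects the key byte needs to change.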

Conclusion:

In this article we focused on some basic internals of the SEH frame handler and on how to find every possible AddressOfHandler that may be executed when the registered SEH is triggered. We also learned that IDAPRO does a really good job of giving you all the needed structures for the try block entries, which you can process with IDAPython to make your static analysis easier. :)

IOC:

yara:


import "pe"

rule gh0st_rat_loader {
    meta:
        author =  "tcontre"
        description = "detecting gh0strat_loader"
        date =  "2021-02-22"
        sha256 = "70ac339c41eb7a3f868736f98afa311674da61ae12164042e44d6e641338ff1f"

    strings:
        $mz = { 4d 5a }

        $code = { 40 33 FF 89 45 E8 57 8A 04 10 8A 14 0E 32 D0 88 14 0E FF 15 ?? ?? ?? ?? 8B C6 B9 ?? 00 00 00 }
        $str1 = "Shellex"
        $str2 = "VirtualProtect"
       
    
    condition:
        ($mz at 0) and $code and all of ($str*)

    }
    
rule gh0st_rat_payload {
    meta:
        author =  "tcontre"
        description = "detecting gh0strat_payload in memory without MZ header in memory"
        date =  "2021-02-22"
        sha256 = "edffd5fc8eb86e2b20dd44e0482b97f74666edc2ec52966be19a6fe43358a5db"

    strings:
        $dos = "DOS mode"
        $av_str1 = "f-secure.exe"
        $av_str2 = "Mcshield.exe"
        $av_str3 = "Sunbelt"
        $av_str4 = "baiduSafeTray.exe"

        $clsid = "{4D36E972-E325-11CE-BFC1-08002BE10318}"
        $s1 = "[WIN]"
        $s2 = "[Print Screen]"
        $s3 = "Shellex"
        $s4 = "HARDWARE\\DESCRIPTION\\System\\CentralProcessor\\0"
        $s5 = "%s\\%d.bak"

    
    condition:
        ($dos at 0x6c) and 2 of ($av_str*) and 4 of ($s*) and $clsid

    }

    

References:
https://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf

Monday, January 18, 2021

Extracting Shellcode in ICEID .PNG Steganography

In the past few days I stumbled on some new and old variants of the ICEID malware that use .png steganography to hide and execute their encrypted shellcode. In this article I will share what the structure of the ICEID PNG payload looks like and how to extract its encrypted shellcode.

Loader Compression:

Like other malware, the ICEID loader changes its crypter to execute its main module in memory. It went from the old variant, where the encrypted code stub and decryption module are placed in the resource section as an RC4-encrypted .rsrc data entry, to aPLib compression as in the figure below.


figure 1: The aplib decompressed module of ICEID

PNG Header:

Before we deal with the ICEID PNG steganography, I think it is a good idea to have a quick look at the PNG file header format. It will give us a clear picture of how ICEID parses the .PNG header and looks for its encrypted shellcode.

The PNG file format starts with the 8-byte signature "89 50 4e 47 0d 0a 1a 0a". After this signature comes a series of chunk structures, which are either "critical chunks" such as "IHDR" and "IDAT", or "ancillary chunks" that carry attributes related to the color, pixels or metadata of the PNG file, such as "sRGB", "gAMA" and many more.

Each PNG chunk consists of 4 parts (4 structure members). The figure below shows the 4 parts in the IHDR chunk type.


figure 2: PNG File header Format

ICEID PNG Module Decryptor:

Now that we have some insight into what the PNG file format looks like, let's dive into the ICEID PNG decryptor module (let's call it the "PNG module") that we extracted from memory earlier. This part is really interesting, especially the header parsing. :)

The PNG module starts by executing a thread that does the following:
1. Decrypt its data section (C&C URL link) using the same approach it uses to decrypt the shellcode in the PNG payload file.
2. Check for the existence of the PNG file in %appdata%<randomname>/.
3. If it does not exist, try to download it from its C&C server.
4. Parse and check the PNG file to extract and execute its shellcode.

The first task is decrypting the C&C server URL link, which is placed in the data section and encrypted with the RC4 algorithm. The data structure used for decrypting this data section can be seen just before the call to the function that does the decryption.


figure 3: Encrypted Data Structure


The first 8 bytes of the encrypted data section are the RC4 key and the rest is the encrypted data.

figure 4: decrypting C&C server URL link
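The "key first, ciphertext after" layout can be handled with a short, self-contained RC4 routine. The sketch below implements standard RC4; the helper name `decrypt_blob` and its `key_size` parameter are made up for this example, with the 8-byte key length taken from the text.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Standard RC4: key scheduling followed by keystream XOR."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


def decrypt_blob(blob: bytes, key_size: int = 8) -> bytes:
    """Split a blob into its leading RC4 key and the encrypted remainder."""
    key, encrypted = blob[:key_size], blob[key_size:]
    return rc4(key, encrypted)
```

Because RC4 is symmetric, `rc4(key, rc4(key, data))` returns the original data, which is a quick sanity check when testing against a dumped data section.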

Next it creates a randomly named folder in %appdata%, using the "username" of the infected machine together with the RDTSC instruction to generate random characters. If the module doesn't find the PNG payload in that folder, it tries to contact the C&C server to download it.

figure 5: looking for the PNG payload file

Parsing PNG Header:

After reading the PNG file and saving it in memory, it starts checking at offset 0x8 (skipping the PNG signature), which is the "chunk_data_length" of the first chunk in the file, in our case "IHDR". To look for the "IDAT" chunk, it moves from one chunk structure to the next by adding the size of the current chunk to the current offset:

next_chunk_type_struct = current_chunk_type_struct + chunk_data_length (4 bytes) + chunk_type (4 bytes) + size(chunk_data) + chunk_crc32 (4 bytes)

For the first chunk structure, the starting offset is the PNG signature size, which is 8 bytes.

For this topic, I created a simple Python script that parses this header and gives you the basic information about it. It also parses the chunk_data, but I place that in the debug.log of the script.


figure 6: parsing the header
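As a rough illustration of the chunk walk described above (this is not the script mentioned in the text), the offset arithmetic can be written as a small generator. It relies on the PNG specification's layout of a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC; the function name is hypothetical and the CRC is not verified.

```python
import struct

PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")


def iter_chunks(png: bytes):
    """Yield (offset, chunk_type, chunk_data) for each chunk in a PNG buffer."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8  # the first chunk starts right after the 8-byte signature
    while pos + 12 <= len(png):
        length, chunk_type = struct.unpack_from(">I4s", png, pos)
        data_start = pos + 8
        data_end = data_start + length
        if data_end + 4 > len(png):
            break  # truncated chunk
        yield pos, chunk_type, png[data_start:data_end]
        # next chunk = length field + type field + data + crc32
        pos = data_end + 4
```

Selecting the chunks whose type is `b"IDAT"` gives the data that the module goes on to inspect.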


Byte Flag and Decryption Key:

As soon as it finds the IDAT chunk structure, it first checks that chunk_data_length > 5. It then skips the chunk_data_length, chunk_type and chunk_crc32 by adding 0x0c to the current pointer to the "IDAT" chunk header. The byte at this position looks like a validity flag for the PNG. If this byte is zero, the PNG module checks another value, one of the parameters of the function that parses the PNG header; that value is also zero, so in this case it exits.
figure 7: byte flag



After this byte comes the 8-byte RC4 decryption key, followed by the shellcode encrypted with this RC4 key. The Python script I mentioned earlier parses the RC4 key, extracts the shellcode and checks the shellcode header and entry point by disassembling it using the capstone Python library.

figure 8: script tool parser

Conclusion:

In this analysis we learned how a PNG file can be used as a weapon to hide malicious code and how malware authors keep updating their tools to bypass detection.

Also thanks to the community for sharing the samples :)

Samples:

sha1: 9a07f8513844e7d3f96c99024fffe6e891338424
sha1: 1ab6006621c354649217a011cd7ca8eb357c3db4
sha1: c1faa9cb4aa7779028008375e7932051ee786a52
sha1: 481bc0cbdcae1cd40b70b93388bf4086781b44b4

https://www.virustotal.com/gui/file/45520a22cdf580f091ae46c45be318c3bb4d3e41d161ba8326a2e29f30c025d4/details

https://www.virustotal.com/gui/file/e6e0adcc94c3c4979ea1659c7125a11aa7cdabe24a36f63bfe1f2aeee2c5d3a1/detection

https://www.virustotal.com/gui/file/cc1030c4c7486f5295444acb205fa9c9947ad41427b6b181d74e7e5fe4e6f8a9/details

https://www.virustotal.com/gui/file/f6ea81aaf9a07e24a82b07254a8ed4fcf63d5a8e6ea7b57062f4c5baf9ef8bf2/detection

References:

https://en.wikipedia.org/wiki/Portable_Network_Graphics
http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html
https://blog.malwarebytes.com/threat-analysis/2019/12/new-version-of-icedid-trojan-uses-steganographic-payloads/
https://www.malware-traffic-analysis.net/2020/12/11/index.html



Thursday, November 5, 2020

Interesting FormBook Crypter - unconventional way to store encrypted data

This FORMBOOK CRYPTER loader contains a lot of interesting features to bypass sandboxes, obfuscate its code and more. It also shows a unique way to store and parse the encrypted data it executes.

so let's start :).

FORMBOOK CRYPTER LOADER (ANTI-VM):

After decrypting some shellcode in memory, it uses several techniques to check whether its code is running in a virtual machine. The screenshot below shows the 3 techniques it uses.
 
  • ANTI-VM I: it uses cpuid with EAX=0x40000000 as input to retrieve the hypervisor brand name and check whether it is running in a virtualized environment.
  • ANTI-VM II: it uses cpuid with EAX=1 as input and checks whether bit 31 of the value returned in ECX is set. If it is set, it is in a VM.
  • ANTI-VM III: it checks for the existence of known driver components of the virtual machine. In this example it checks for the vmmouse driver on the machine.

figure 1: Different ANTI-VM technique checks


FORMBOOK CRYPTER LOADER (ANTI-SANDBOX):

It also checks whether its code is running in a sandbox, using the 2 techniques shown below.
  • ANTI-SANDBOX 1: it gets the file path of its running code with the GetModuleFileName API and checks whether it contains "sample", "sandox" or "malware". If yes, it exits the process.
  • ANTI-SANDBOX 2: it checks for the existence of sbiedll.dll, which is a component of a known sandbox.
figure 2: Anti-Sandbox technique

FORMBOOK CRYPTER LOADER (PROCESS CHECK):

It also enumerates all the processes running on the machine and checks for the processes of known debugging tools; if one exists, it exits the process. For AV-related processes and services, it keeps a counter of how many AV products it sees on the machine, up to a maximum of 2 (it seems to be checking for a testing machine that has several AV products installed).

Below is the list of processes it checks, related to malware analysis tools and AV products:

figure 3: Process checking to evade malware lab environment

DECRYPTING THE FORMBOOK IN RSRC:

The next thing it does is decrypt the encrypted Formbook malware in its resource section. This is done using 2 entries in the rsrc section. The first entry has rsrc ID "14d" and rsrc type 17 ("RT_DLGINCLUDE") and contains the 16-byte RC4 key, which is itself decrypted first and then used to decrypt FORMBOOK.
 
figure 4: decrypting the rc4 key for FORMBOOK
 
 
Once the RC4 key has been obtained, it decrypts the encrypted Formbook malware: it loads another resource entry with rsrc ID "3e8" and type "2", then removes 3 dummy bytes from the data blob before decrypting it using the RC4 algorithm.
 
figure 5: decrypting FORMBOOK

 FORMBOOK MZ HEADER:

One of the interesting parts of this Formbook variant is that the MZ header is used as shellcode to jump to the entry point of the executable. This technique is also seen in the CobaltStrike variant shown in my previous blog: https://tccontre.blogspot.com/2019/11/cobaltstrike-beacondll-your-not.html
 
 
figure 6: MZ header shellcode

INTERESTING STORING AND PARSING ENCRYPTED DATA:

Formbook obfuscates its code. One interesting feature is how it stores and parses the bytes it needs to decrypt or hash in order to perform its tasks. Malware commonly uses the "stack string" technique to initialize its strings or data on the stack or in allocated memory, as in the screenshots we saw in the anti-VM and anti-sandbox sections of this post.

This variant uses another technique: it saves the bytes it needs in a code-like structure, then parses each instruction and checks whether its opcode meets the requirements below. If it does, it grabs the operand or opcode that is part of the bytes it needs to decrypt or hash.
 
Requirements:
  I. If the opcode is 0x40-0x5f, just grab the opcode itself.
 II. If the opcode is 0x70-0x7f, which is mostly conditional jump mnemonics, skip that instruction.
III. If (opcode - 0x40 > 0x1f) and (opcode - 0x70 > 0x0f), it checks which opcode it is (the opcode ranges from 0x00 to 0xFF) to know which other opcode to grab or how big the operand to parse is.
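Rules I and II can be sketched in a few lines of Python. This is a partial illustration only, not the malware's parser: rule III depends on a per-opcode table that is not spelled out in the text, so the sketch raises an error there instead of guessing. The 2-byte skip for 0x70-0x7f assumes these are short conditional jumps (opcode plus an 8-bit displacement), and the function name is hypothetical.

```python
def grab_bytes(code: bytes, wanted: int) -> bytes:
    """Collect up to `wanted` bytes from a code-like blob using rules I and II."""
    out = bytearray()
    pos = 0
    while pos < len(code) and len(out) < wanted:
        opcode = code[pos]
        if 0x40 <= opcode <= 0x5F:
            # Rule I: single-byte instruction, the opcode itself is the data.
            out.append(opcode)
            pos += 1
        elif 0x70 <= opcode <= 0x7F:
            # Rule II: short conditional jump, skip opcode and rel8 operand.
            pos += 2
        else:
            # Rule III needs the opcode-specific handling not described here.
            raise NotImplementedError(f"opcode 0x{opcode:02x} at offset {pos}")
    return bytes(out)


# Hypothetical blob: inc ecx / jz +5 / pop edi / push eax
print(grab_bytes(bytes([0x41, 0x74, 0x05, 0x5F, 0x50]), 3).hex())
```

Hiding data inside valid-looking instructions means a plain strings search over the binary finds nothing, which is why the parser has to be understood before the data can be recovered.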

figure 7.A: initial opcode it tries to grab and opcode it skip

figure 7.B: FormBook opcode condition for parsing its data


figure 8: the parse stored data that either to be decrypt or hash it.

Also, not all of the stored data it parses will be decrypted. Some of it is used to compute a SHA1 hash that serves as the decryption key (RC4 algorithm) to decrypt another blob of code.


 _BYTE *__cdecl Func_DecryptBytesGrabbed(int DestBuff)
{
  int VA_41C3A6; // eax
  int VA_41C50B; // eax
  int VA_41BEF1; // eax
  char dest_buff; // [esp+Ch] [ebp-140h] BYREF
  char v6[215]; // [esp+Dh] [ebp-13Fh] BYREF
  _DWORD sha1_ctx[26]; // [esp+E4h] [ebp-68h] BYREF

  dest_buff = 0;
  Func_MemSet(v6, 0, 0xD4u);
  VA_41C3A6 = sub_41C3A1();
  Func_GrabEncryptedData(&dest_buff, VA_41C3A6 + 2, 0xD3u);
  VA_41C50B = sub_41C506();
  Func_GrabEncryptedData(DestBuff + 0x444, VA_41C50B + 2, 0x2F0u);
  VA_41BEF1 = sub_41BEEC();
  Func_GrabEncryptedData(DestBuff + 0x7B8, VA_41BEF1 + 2, 0x14u);
  Func_SHA1_Init(sha1_ctx);
  Func_Sha1_Update(sha1_ctx, &dest_buff, 0xD3);
  Func_Sha1_Final(sha1_ctx);
  Func_GrabNeededOpcode(DestBuff + 0x7A4, sha1_ctx, 20);
  Func_DecryptWithRc4((DestBuff + 0x444), 0x2F0u, DestBuff + 0x7A4);
  Func_SHA1_Init(sha1_ctx);
  Func_Sha1_Update(sha1_ctx, (DestBuff + 0x7B8), 0x14);
  Func_Sha1_Final(sha1_ctx);
  Func_DecryptWithRc4((DestBuff + 0x444), 0x2F0u, sha1_ctx);
  Func_SHA1_Init(sha1_ctx);
  Func_Sha1_Update(sha1_ctx, (DestBuff + 0x444), 752);
  Func_Sha1_Final(sha1_ctx);
  return Func_DecryptWithRc4((DestBuff + 0x7B8), 0x14u, sha1_ctx);
}


SAMPLES:

filename: Formbook_loader.bin
md5: 65880d23eb6051a1604707371ebb6d2c
sha1: 3f5d0833adbd39715f1d45f1a3c8982c52519bc1
sha256: ac2e9615b368e00fb4bf4d5180bbfc0d6fb7bbce3fa1af603d346d7a8f2450e5
 
 

filename: formbook.bin
md5: df93eecd1799f9c9c674b8cdb2f1dad1
sha1: e66c893f39c7553f59a5381d23a5c65e5c2e84f7
sha256: 5d7eba73b4d29ee17529511bb8b0745e658bf2adfcae57bdfa8d0870f4732a18

YARA RULES:


import "pe"

rule formbook_loader_crypter {
    meta:
        author =  "tcontre"
        description = "detecting formbook-loader-crypter malware"
        date =  "2020-11-05"
        sha256 = "ac2e9615b368e00fb4bf4d5180bbfc0d6fb7bbce3fa1af603d346d7a8f2450e5"

    strings:
        $mz = { 4d 5a }
 
        $dec = { 03 CE 8A 03 88 45 F9 8B C6 51 B9 03 00 00 00 33 D2 F7 F1 59 85 D2 75 14 8A 45 F9 32 45 FA 88 01 8A 55 FB 8B C1 E8 39 01 00 00 EB 05 8A 45 F9 88}
        $rc4_key = {12 2D 13 EF 23 E2 7F 4B 70 19 C7 F0 4B 68 75 50}
     
    condition:
        ($mz at 0) and ($dec ) or ($rc4_key)
 
    }
    
rule formbook_crypter {
    meta:
        author =  "tcontre"
        description = "detecting formbook-crypter malware"
        date =  "2020-11-05"
        sha256 = "5d7eba73b4d29ee17529511bb8b0745e658bf2adfcae57bdfa8d0870f4732a18"

    strings:
        $mz = { 4d 5a }
 
        $shell = { 4D 5A 45 52 E8 00 00 00 00 58 83 E8 09 8B C8 83 C0 3C 8B 00 03 C1 83 C0 28 03 08 FF E1 90 00 00}
        $opcode_check = {8B 4D FC 8A 04 39 03 CF 88 45 F4 8D 50 C0 80 FA 1F 77 18 6A 01 51 8D 04 1E 50 E8 ?? ?? ?? ?? 46 83 C4 0C FF 45 FC 89 75 F8 EB 25 2C 70 3C 0F 77 }
     
    condition:
        ($mz at 0) and ($shell at 0) or ($opcode_check)
 
    }