Recently, at the SAS conference I talked about "Funky malware formats"—atypical executable formats used by malware that are only loaded by proprietary loaders. Malware authors use them in order to make static detection more difficult, because custom formats are not recognized as executable by AV scanners.

Using atypical formats may also slow down the analysis process because the file can't be parsed out of the box by typical tools. Instead, we need to write custom loaders in order to analyze them freely.

Last year, we described one such format in a post about Hidden Bee. This time, we want to introduce you to another case that we discussed at the SAS Conference. It is a sample of Ocean Lotus, also known as APT 32, a threat group associated with Vietnam.


49a2505d54c83a65bb4d716a27438ed8f065c709 - the main executable

Special thanks to Minh-Triet Pham Tran for providing the material.


The sample comes with two elements, BLOB and CAB, that are both executables in the same unknown format. The custom format is produced by converting a PE file (we can guess this from artifacts typical of PE files, e.g. the manifest). However, the header is fully custom, and the way it is loaded bears no resemblance to PE loading. Some of the information from a typical PE (for example, the layout of the sections) is not preserved: sections are shuffled.


This sample is from June 10, 2017, from the following email:

Content of the phishing email, along with its attachment

The title "Sổ tay vấn đề pháp lý cho các nhà hoạt động nhân quyền" translates to: "Handbook of legal issues for human rights activists." It's a subject line for a spear phishing campaign targeting Vietnamese activists.

The malicious sample was delivered as an attachment to the email: a zipped executable. The icon tried to imitate a PDF (FoxitPDF reader).

An executable with a FoxitPDF icon

Behavioral analysis

After being run, the sample copies itself into %TEMP%, unpacks, and launches the decoy PDF.

The main executable and the decoy copied to the Temp folder

While the user is busy reading the launched document, the dropper unpacks the real payload. It is dropped into C:\ProgramData\Microsoft Help:

All the elements of the malware unpacked

The dropper executable is deleted afterwards.

The malware manages to bypass UAC at default level. We can see the application sporder.exe running with elevated privileges.
Persistence is provided by a simple Run key, leading to the dropped script:

Added run key (view from Sysinternals Autoruns)

Interestingly, the sample has an "expiry date" after which the installer no longer runs.


The main executable sporder.exe is packed with UPX. It imports the DLL SPORDER.dll:

Import table of SPORDER.exe (view from PE-bear)

SPORDER.dll imports another of the dropped DLLs, hp6000.dll:

Import table of SPORDER.dll (view from PE-bear)

The key malware functionality is, however, not provided by any of the dropped PE files. They are just used as loaders.

As it turns out, the core is hidden in two unknown files: BLOB and CAB.

Custom formats

The files with the extensions BLOB and CAB are obfuscated with XOR. After decoding them, we notice some readable strings of code. However, neither of them is a valid PE file, and we cannot find any of the typical headers.


The BLOB file is obfuscated by XOR. We can see the repeating pattern and use it as an XOR key:

SPORDER.blob (original version), the repeating pattern is selected

As a result, we get the following clear version: 2e68afae82c1c299e886ab0b6b185658
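The decoding step itself is simple. Below is a minimal sketch in C, assuming the key is the repeating byte pattern read off the hex view. The function name is ours, and no real key bytes from the sample are hardcoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR a buffer in place with a repeating key.
 * The key is whatever pattern repeats in the obfuscated file;
 * it has to be supplied by the analyst. */
void xor_repeating(uint8_t *buf, size_t len,
                   const uint8_t *key, size_t key_len)
{
    assert(key != NULL && key_len > 0);
    for (size_t i = 0; i < len; i++)
        buf[i] ^= key[i % key_len];
}
```

XOR is its own inverse, so running the same function twice restores the original bytes. It also explains why the key is visible in the first place: wherever the clear file contains a run of zero bytes, the obfuscated file shows the key itself.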

BLOB's header:

The BLOB file looks like a processed PE file. However, its sections appear to be in swapped order: the first section seems to be .data instead of .text.

We can see visible artifacts from the BZIP library and C++ standard library.


The CAB file is obfuscated with XOR in a similar way, but with a different key:

When we apply the key, we get an analogous clear version: b3f9a8adf0929b2a37db7b396d231110

This sample also has a custom header, which does not resemble the PE header. However, we found sections inside that are typical for PE files, for example, a manifest.


As it turned out, both files are loaded by hp6000.dll: 67b8d21e79018f1ab1b31e1aba16d201

The loading function is executed in an obfuscated way: when the DllMain is executed, it patches the main executable that loaded the DLL.

First, the file name of the current module is retrieved. Then, the file is read and the address of the entry point is fetched. Then, the memory protection of the corresponding module loaded in memory is changed so that it can be patched:

Using VirtualProtect to make the main module writable

Finally, the bytes are patched so that the entry point will redirect back to the appropriate function in the loading DLL:

Patching the entry point of the main module, byte by byte

This is how the entry point of the main module looks after the patch is applied:

The Entry Point of the main module (sporder.exe) after patching

We see that the Virtual Address (RVA 0x1210 + DLL loading base) of the function within the DLL is moved to EAX, and then the EAX is used as a jump target.
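To make the patch concrete, here is a hedged sketch of how such a redirect could be built. It assumes the standard 32-bit x86 encodings for `mov eax, imm32` (B8 followed by the immediate) and `jmp eax` (FF E0). The exact bytes used by the sample are the ones visible in the screenshot; the function name and the DLL base are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOADER_RVA 0x1210u  /* RVA of the loading function in hp6000.dll */
#define STUB_SIZE  7u       /* 5 bytes for mov eax, imm32 + 2 for jmp eax */

/* Emit:  B8 <imm32>   mov eax, target
 *        FF E0        jmp eax
 * dll_base is the address at which the DLL happened to be loaded. */
size_t build_entry_redirect(uint8_t out[STUB_SIZE], uint32_t dll_base)
{
    uint32_t target = dll_base + LOADER_RVA;
    assert(out != NULL);
    out[0] = 0xB8;
    out[1] = (uint8_t)(target & 0xFF);          /* immediate is little-endian */
    out[2] = (uint8_t)((target >> 8) & 0xFF);
    out[3] = (uint8_t)((target >> 16) & 0xFF);
    out[4] = (uint8_t)((target >> 24) & 0xFF);
    out[5] = 0xFF;
    out[6] = 0xE0;
    return STUB_SIZE;
}
```

For a DLL loaded at the made-up base 0x10000000, the target is 0x10001210 and the emitted bytes are B8 10 12 00 10 FF E0.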

The function that starts at RVA 0x1210 is a loader for BLOB and CAB:

Beginning of the loading function

This redirection works because, when an executable is loaded into memory, all the DLLs listed in its Import Table are loaded and the DllMain of each is called before the Entry Point of the main module is hit. Only after the DLLs are loaded does the execution of the main executable start. In our case, the patched entry point redirects back to the DLL.

Inside the function loading BLOB and CAB:

The function loading BLOB and CAB

As you can see, the CAB file is loaded first:

Executing the function loading CAB file (unconditional)

Further, we see this function retrieving an environment variable. This variable is used to store the state of the application, and is shared between consecutive executions. Depending on this state, one of multiple execution paths can be taken.

The name of the variable is created by concatenating:

  1. hardcoded string: L"Local\\{076B1DB0-2C01-45A5-BD0A-0CF5D6410DCB}"
  2. the name of the executable
  3. a local username

Setting the variable name

The content of the variable may be one of the following: '@', '*', ':'. If it is empty, the first value, '@', is set. Those values are translated to particular states that control the flow.

  • '@' -> state 1
  • '*' -> state 2
  • ':' -> state 3
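The mapping above can be sketched as a pair of small helpers. This is only an illustration: the function names are ours, and we use narrow `char` strings for brevity, while the variable name in the sample is a wide string.

```c
#include <assert.h>
#include <stddef.h>

enum { STATE_UNKNOWN = 0, STATE_1 = 1, STATE_2 = 2, STATE_3 = 3 };

/* Translate the content of the variable into a state number.
 * An empty or missing value is treated like '@', i.e. state 1. */
int state_from_marker(const char *value)
{
    if (value == NULL || value[0] == '\0')
        return STATE_1;
    switch (value[0]) {
    case '@': return STATE_1;
    case '*': return STATE_2;
    case ':': return STATE_3;
    default:  return STATE_UNKNOWN;
    }
}

/* Inverse mapping: the character to store for a given state. */
char marker_from_state(int state)
{
    static const char markers[] = { '@', '*', ':' };
    assert(state >= STATE_1 && state <= STATE_3);
    return markers[state - 1];
}
```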

The main process is restarted on each state change. Finally, state 3 creates a mutex and loads the file with the BLOB extension.

Final state: setting the mutex and loading the BLOB

The mutex name is the same as the variable name, but with a suffix "_M" added:

Setting the mutex
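Both names can be reconstructed with a few lines of C. The sketch below assumes the three parts are joined directly, without any separator, which the screenshots suggest but which we did not verify for every case. The function name and the inputs used to exercise it (`sporder.exe`, `tester`) are only examples.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

#define STATE_PREFIX L"Local\\{076B1DB0-2C01-45A5-BD0A-0CF5D6410DCB}"

/* Build the variable name (prefix + executable name + user name) and
 * the mutex name (variable name + "_M"). Both output buffers hold
 * cap wide characters. Returns 0 on success, -1 if a name does not fit. */
int build_state_names(const wchar_t *exe_name, const wchar_t *user_name,
                      wchar_t *var_name, wchar_t *mutex_name, size_t cap)
{
    assert(exe_name != NULL && user_name != NULL);
    assert(var_name != NULL && mutex_name != NULL);
    if (swprintf(var_name, cap, L"%ls%ls%ls",
                 STATE_PREFIX, exe_name, user_name) < 0)
        return -1;
    if (swprintf(mutex_name, cap, L"%ls_M", var_name) < 0)
        return -1;
    return 0;
}
```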

While the application runs, we can see the BLOB being loaded in executable form inside the main module's memory:

Memory of the sporder.exe, view from Process Hacker

By comparing the format that is loaded in the memory with the format that is stored on the disk, we can see that the beginning and the end of the BLOB are skipped in the loading process. So, we can guess that those parts are headers that contain the information necessary for loading, but not for execution. The header at the beginning of the file will be referred to as Header1, and the one at the end (the footer) as Header2.

The Header2 file in the memory vs. its equivalent on the disk:

Comparing the memory dump with the raw file

We also found that some of the addresses were relocated (the new Image Base was added).

Reversing the reversed PE

The files with both extensions CAB and BLOB are loaded by the same function:

View from IFL (Interactive Functions List)

The core of the loader is in the following function:

The loading function

This is the function that we need to analyze in order to make sense out of the custom format.

Let's take a look at the loading process itself.

The first DWORD of Header1 is a checksum (it will be used later to validate the decoded module). Then, we have two DWORDs that are used as an XOR key. Once they are fetched, the rest of the header is decoded.

Example: decoding the CAB file

After applying the key, we get the content of the file in its clear form. But before loading continues, the checksum from the header is compared with the actual one, calculated with the help of a custom formula:

Checksum calculation algorithm

The next value from the headers is used in the formula calculating the size for loading the executable part of the module. In the currently analyzed case (the CAB file), it is 0x17000:

Header 1 at the beginning of the CAB file, decoded

So, 0x17000 + 0x2000 is the size of the memory that will be allocated for the payload.


Example (from CAB file):


Then, 0x17000 bytes of the payload are copied, but the beginning containing Header1 (the first 16 bytes) is skipped.
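Putting the pieces of Header1 together, the first stage of loading can be sketched as follows. This is a simplified model under stated assumptions: the header has already been decoded, the four DWORDs are laid out back to back (checksum, two key DWORDs, size value), and `calloc` stands in for the executable memory that the real loader allocates. All the names are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HEADER1_SIZE 16u      /* bytes skipped when copying the payload */
#define EXTRA_ALLOC  0x2000u  /* added to the size value by the loader */

typedef struct {
    uint32_t checksum;      /* validated after decoding */
    uint32_t xor_key[2];    /* two DWORDs used as the XOR key */
    uint32_t payload_size;  /* 0x17000 in the CAB example */
} header1_t;

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse an already decoded Header1. Returns 0, or -1 if the file is too short. */
int parse_header1(const uint8_t *file, size_t file_len, header1_t *h)
{
    assert(file != NULL && h != NULL);
    if (file_len < HEADER1_SIZE)
        return -1;
    h->checksum     = read_le32(file);
    h->xor_key[0]   = read_le32(file + 4);
    h->xor_key[1]   = read_le32(file + 8);
    h->payload_size = read_le32(file + 12);
    return 0;
}

/* Size of the region allocated for the payload. */
size_t image_alloc_size(const header1_t *h)
{
    return (size_t)h->payload_size + EXTRA_ALLOC;
}

/* Allocate the region and copy the payload, skipping Header1.
 * Returns NULL on error; the caller frees the result. */
uint8_t *map_payload(const uint8_t *file, size_t file_len, const header1_t *h)
{
    if (file_len < HEADER1_SIZE || h->payload_size > file_len - HEADER1_SIZE)
        return NULL;
    uint8_t *image = calloc(image_alloc_size(h), 1);
    if (image != NULL)
        memcpy(image, file + HEADER1_SIZE, h->payload_size);
    return image;
}
```

For the CAB file, `image_alloc_size` gives 0x17000 + 0x2000 = 0x19000 bytes.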

After the module content is copied, Header2 is used to continue loading.

Looking at Header2, we can see some similarities with Header1.

The initial DWORD is the Entry Point of the module (DllMain). It will be called soon after the full module is loaded:

The address in the loader where the Entry Point of the loaded module is called

Then, we have a value that is used in a formula calculating the size of the memory to be allocated. The new memory region that is being allocated this time is used for the imports that are going to be loaded (the full process will be explained further).

Conceptually, we can divide Header 2 into two parts.

First comes a prolog that contains two DWORD values. Example from the currently-analyzed CAB file:

Header2 (at the end of the CAB file) - prolog is highlighted

  • val[0] = 0x21A0 -> RVA of the DllMain
  • val[1] = 0x013D -> val[1]*8+0x400 -> size of the next area to allocate

Then there is a list of records of a custom type. Each record represents a different piece of information that is necessary for loading the module. They are identified by the type ID that is represented by a DWORD at the beginning of the record.

Header2 (at the end of the CAB file) - records are highlighted


Type 1 stands for relocation. It has one DWORD as an argument. It is an address that needs to be relocated.

typedef struct {
	DWORD reloc_field;
} reloc_t;

Parsing of the type 1

We can see how the field is used to relocate the address. Example: filling the address at 0x8590:

The address pointed by the relocation record is relocated to the base at which the module was loaded
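Applying a single relocation record can be sketched like this. We assume that `reloc_field` is an RVA into the mapped module and that the DWORD stored there is relative to base 0, so the loader only has to add the new Image Base. The function name is ours, and the base used in the usage note is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add load_base to the 32-bit little-endian value stored at reloc_rva.
 * Returns 0 on success, -1 if the record points outside the image. */
int apply_relocation(uint8_t *image, size_t image_size,
                     uint32_t reloc_rva, uint32_t load_base)
{
    assert(image != NULL);
    if (image_size < 4 || reloc_rva > image_size - 4)
        return -1;
    uint8_t *p = image + reloc_rva;
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    v += load_base;
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
    return 0;
}
```

For example, a field holding 0x00001030 becomes 0x00241030 once a module is mapped at 0x00240000.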


Type 2 stands for an exported function. The address it points to is stored on a list in order to be called later, after loading has finished. This record has three DWORD parameters.

typedef struct {
	DWORD count;
	DWORD entry_rva;
	DWORD name_rva;
} entry_point_t;

Example of the record of type 2:

Parsing of the type 2

Address to be stored: params[1] = 0x00001030

Record of the type 2 in the original file

By observing the execution flow, we can confirm that indeed the stored function is being called after the loading process finishes:

The address in the loader where the CAB module's export is being called

Exported functions may be stored along with their names.


Type 3 stands for imports. It has four DWORD parameters.

typedef struct {
	DWORD type;
	DWORD dll_rva;
	DWORD func_rva;
	DWORD iat_rva;
} import_t;

Parsing of the type 3
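With all three record types known, walking the list in Header2 is straightforward: read the type ID, then skip the number of DWORD parameters that belongs to it. The sketch below only counts the records. How the real loader detects the end of the list is not covered here, so stopping at the first unknown type ID is our assumption, and the names are ours as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum record_type { REC_RELOC = 1, REC_EXPORT = 2, REC_IMPORT = 3 };

typedef struct {
    size_t relocs;
    size_t exports;
    size_t imports;
} record_stats_t;

/* Number of DWORD parameters after the type ID, or -1 for an unknown type. */
int record_param_count(uint32_t type)
{
    switch (type) {
    case REC_RELOC:  return 1;  /* reloc_t */
    case REC_EXPORT: return 3;  /* entry_point_t */
    case REC_IMPORT: return 4;  /* import_t */
    default:         return -1;
    }
}

/* Walk the decoded records stored as an array of DWORDs.
 * Stops at an unknown type ID or at a truncated record.
 * Returns the number of DWORDs consumed. */
size_t walk_records(const uint32_t *dw, size_t count, record_stats_t *st)
{
    assert(dw != NULL && st != NULL);
    size_t pos = 0;
    st->relocs = st->exports = st->imports = 0;
    while (pos < count) {
        int n = record_param_count(dw[pos]);
        if (n < 0 || (size_t)n > count - pos - 1)
            break;
        switch (dw[pos]) {
        case REC_RELOC:  st->relocs++;  break;
        case REC_EXPORT: st->exports++; break;
        case REC_IMPORT: st->imports++; break;
        }
        pos += 1 + (size_t)n;
    }
    return pos;
}
```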

Example of a chunk responsible for encoding imports:

Record of the type 3 in the original file

Type: params[0] = 0x00000002 - means the function will be imported by name (the meaning of all the possible values of this field is explained later).

Address of the DLL: params[1] = 0x0107DA

DLL name

Address of the import: params[2] = 0x010774

Import name

In contrast to PE format, the address of the imported function is not loaded into the main module. Instead, it is written into the separate executable area (in the given example it is written at VA: 0x00240001):

Filled import

Then, the address at which the import was stored is written back into the main module. The location in the main module that needs to be filled is specified by the last parameter of this record. In the given example, the location given by params[3] = 0x0000E014 is filled with 0x00240001:

Filling the address redirecting to the import

Atypical IAT

The functions from the embedded list are resolved by the loader. However, as mentioned earlier, the addresses are not filled into a normal IAT, typical for the PE format. Rather, all of them are filled into a list of jumps stored in a newly allocated memory page.

Generated list of jumps, leading to the imported functions

The import loading function not only fills the address, but also emits the necessary code for the jump:

Address of the imported function is retrieved and written into the emitted jump

Meaning of the type field

The import record has a type field that can have one of the following values: 1, 2, 3, 4.

Values 1 and 2 are the most important: they are used for loading the imports. 1 stands for loading by ordinal, 2 for loading by name. The remaining values, 3 and 4, are used for cleaning up fields that are no longer needed: 3 erases the import name, 4 erases the DLL name.

Fragment of code responsible for loading the functions

When an import record with the type field set to 3 or 4 occurs, the pointer in the IAT area is still incremented, so as a result we can see some gaps between the function records:

Gaps between the jumps
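The origin of the gaps can be illustrated with a simplified model. It tracks only which slots of the jump list receive a jump, not the emitted code itself. The enum and function names are ours, and the model assumes that every import record advances the pointer by exactly one slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum import_kind {
    IMPORT_BY_ORDINAL = 1,  /* load the function by ordinal */
    IMPORT_BY_NAME    = 2,  /* load the function by name */
    ERASE_FUNC_NAME   = 3,  /* cleanup: erase the import name */
    ERASE_DLL_NAME    = 4   /* cleanup: erase the DLL name */
};

/* For each import record, decide whether its slot gets a jump (1)
 * or stays empty (0). The slot index advances for every record,
 * including the cleanup ones, which is what produces the gaps.
 * Stops at an unknown kind; returns the number of slots consumed. */
size_t layout_import_slots(const uint32_t *kinds, size_t n, uint8_t *has_jump)
{
    assert(kinds != NULL && has_jump != NULL);
    size_t slot = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t k = kinds[i];
        if (k < (uint32_t)IMPORT_BY_ORDINAL || k > (uint32_t)ERASE_DLL_NAME)
            break;
        has_jump[slot] = (k == (uint32_t)IMPORT_BY_ORDINAL ||
                          k == (uint32_t)IMPORT_BY_NAME) ? 1 : 0;
        slot++;
    }
    return slot;
}
```

For a sequence of kinds 2, 3, 4, 1, the first and the last slot hold jumps and the two slots in between stay empty.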

Functionality of the custom files

The CAB file is another installer that provides persistence to the whole package by creating a service:

"C:\Windows\system32\wscript.exe" /B /nologo "C:\Users\tester\Desktop\mod\sporder.vbs"

Created service

It also generates the VBS script that is dropped:

Dropped script

The CAB file is loaded first, just to install the malware, and then deleted.

All the espionage-related features are performed by the BLOB that is loaded later and kept persistent in the memory of the loader.

In addition to being in a custom format, BLOB is also heavily obfuscated.

We can observe its attempts to connect to one of the CnCs:

Attempt of connecting to the CnC : 443 :3389 : 44818 : 80 : 44818 : 44818 : 9091 : 9091 : 3389

Some of those domains are known from previous reports on Ocean Lotus, e.g. the Cylance white paper.

Ocean Lotus: a creative APT

Ocean Lotus often surprises researchers with its creative obfuscation techniques. Recently, a different sample of Ocean Lotus was found using steganography to hide its executables (you can read more about it in the ThreatVector report). The format that we described is just one of many unusual forms that its implants can take.


Parser for the described format:
Presentation from the SAS conference: