Is .XLSX Phishing Making a Comeback?

April 6, 2018

On March 22nd, Cofense came across an unusual malware sample with a very low detection rate. At the time of analysis, only 5 of 61 AV engines detected the file, and the detection rate did not reach 30% until at least a week later, as per VirusTotal: 38015eb1699b7596e8c95fed7f0bc32d1492b371bd4d7953019f69dcf40ff1fd.

What caught our attention wasn’t the phishing email (Figure 1) or a new exploit, but an old familiar attachment type making a comeback. Malicious RTF files are a dime a dozen these days, but this sample brings back a familiar face, .xlsx, and demonstrates a unique way to exploit CVE-2017-11882. In this blog post we review our analysis and present key findings for this unique malicious downloader.

Figure 1 – The Phishing Email

The RTF Downloader

Before analyzing this new sample, let’s review the structure and attack vector of a traditional RTF exploit document. The Proof of Concept (PoC) and most of the samples seen in the wild contain an embedded object with an Equation Native stream.

Figure 2 – Layout and Hexadecimal Dump of the Embedded Equation Object.

The embedded Equation object header mostly conforms to the standard with the exception that some of the reserved fields are populated (these exceptions are highlighted in Figure 3).

Field	Bytes	Description
nCBHdr	2	Header Length (0x001C)
nVersion	4	Version (0x0000 0x0002)
nCf	2	Clipboard Format
nCBObject	4	MTEF data length
nReserved1	4	Unused
nReserved2	4	Unused
nReserved3	4	Unused
nReserved4	4	Unused

Figure 3 – The Equation Object Header and Default Values
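As a rough illustration of the header layout in Figure 3, the following Python sketch unpacks the 28-byte (0x1C) structure and reports which reserved fields are populated. It assumes little-endian byte order and that "unused" means zero in a benign document; the function and class names are hypothetical and not part of any tool mentioned here.

```python
import struct
from typing import NamedTuple

# Field sizes from the table above: 2 + 4 + 2 + 4 + (4 * 4) = 28 bytes (0x1C).
# Little-endian byte order is an assumption of this sketch.
EQN_HEADER = struct.Struct("<HIHIIIII")


class EquationHeader(NamedTuple):
    cb_hdr: int            # nCBHdr: header length
    version: int           # nVersion
    clipboard_format: int  # nCf
    cb_object: int         # nCBObject: MTEF data length
    reserved: tuple        # nReserved1..nReserved4


def parse_equation_header(data: bytes) -> EquationHeader:
    """Unpack the Equation object header from the start of a buffer."""
    if len(data) < EQN_HEADER.size:
        raise ValueError("buffer too short for an Equation object header")
    cb_hdr, version, cf, cb_object, *reserved = EQN_HEADER.unpack_from(data)
    return EquationHeader(cb_hdr, version, cf, cb_object, tuple(reserved))


def populated_reserved_fields(header: EquationHeader) -> list:
    """Return the 1-based indexes of reserved fields that are not zero."""
    return [i for i, value in enumerate(header.reserved, start=1) if value]
```

A non-empty result from populated_reserved_fields would flag the kind of deviation described above, although it is not proof of malicious intent on its own.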

Following the Equation object header, we see a normal MathType Equation Format (MTEF) header followed by a SIZE (0x0A) and LINE (0x01) record and finally the contents of the LINE record. This again follows the standard overall structure of the Equation 3.X binary format.

Field	Bytes	Value (Equation Editor 3.x)
MTEF Version	1	0x03
Generating Platform	1	0x01
Generating Product	1	0x01
Product Version	1	0x03
Product Subversion	1	Dependent on subversion

Figure 4 – MTEF Binary Format Fields
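The five MTEF header bytes in Figure 4 can be checked with a few lines of Python. This is only a sketch: the names are hypothetical, and the comparison covers just the four fixed values from the table while leaving the subversion byte unconstrained.

```python
from typing import NamedTuple


class MtefHeader(NamedTuple):
    mtef_version: int
    platform: int
    product: int
    product_version: int
    product_subversion: int


def parse_mtef_header(data: bytes) -> MtefHeader:
    """Interpret the first five bytes of a buffer as an MTEF header."""
    if len(data) < 5:
        raise ValueError("buffer too short for an MTEF header")
    return MtefHeader(*data[:5])


def matches_equation_editor_3(header: MtefHeader) -> bool:
    """Compare the fixed header values against the Equation Editor 3.x defaults."""
    fixed = (header.mtef_version, header.platform,
             header.product, header.product_version)
    return fixed == (0x03, 0x01, 0x01, 0x03)
```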

The contents of the LINE record are where the exploit happens. The first record defined after the LINE record is the vulnerable FONT record, which utilizes a function with a buffer overflow vulnerability. Most of the exploit documents seen in the wild overflow the font name with a simple command to download and execute their malicious payload and then overwrite the return address with the address of a WinExec API call (0x00430C12), using WinExec to execute their command. Figure 5 is a hexadecimal view of the PoC, which simply opens calc.exe, and demonstrates that the commands are clearly viewable when examining the document.

Figure 5 – Hexadecimal of the RTF PoC FONT Record

The New XLSX Downloader

This is where things change. This XLSX sample differs from the RTF example above in a number of ways. The first difference we noticed during analysis was the embedded object: it contains an OLENativeStream instead of the commonly used Equation Native stream.

Figure 6 – Layout and Hexadecimal Dump of the Embedded OLENativeStream Object

The OLENativeStream is an OLE2.0 stream object contained within an OLE Compound File Storage (MS-CFB) object and contains only one header field, a 4-byte NativeDataSize field. At first glance, the data after this field does not follow the standard MTEF format. It is also not obvious how Equation Editor (EQNEDT32.EXE) is being called. To solve this mystery, we need to better understand the MS-CFB format.
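Because the stream carries a single 4-byte NativeDataSize field, separating the size from the native data is straightforward. The sketch below assumes the field is a little-endian unsigned integer and uses a hypothetical function name.

```python
import struct


def split_ole_native_stream(stream: bytes):
    """Return (declared_size, native_data) from a raw OLENativeStream.

    Assumes NativeDataSize is a little-endian 32-bit unsigned integer.
    If the stream is shorter than declared, native_data is simply truncated.
    """
    if len(stream) < 4:
        raise ValueError("stream too short to hold NativeDataSize")
    (declared_size,) = struct.unpack_from("<I", stream, 0)
    native_data = stream[4:4 + declared_size]
    return declared_size, native_data
```

Comparing declared_size with len(native_data) is a cheap way to spot a truncated or padded stream before looking at the bytes that follow.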

Since this is not an analysis of MS-CFB objects, we will only provide highlights and supply details for the relevant section. MS-CFB objects contain a header and at least 4 sectors – File Allocation Table (FAT) sector, Directory sector, MiniFAT sector, and the Mini Stream sector. The object header is easily identifiable by its unique header signature – 0xD0CF11E0 0xA1B11AE1. The first sector, following the header, is the FAT sector and mirrors any standard file system table by defining multiple sector chains with linked lists of sectors in each chain. The next sector is the Directory sector and it provides details about the stream and storage objects contained within the MS-CFB object. It is identifiable by the root entry name of “Root Entry” in UTF-16 format. The MiniFAT sector is simply an allocation table for the data in the Mini Stream, and the Mini Stream contains the user-defined data.

Figure 7 – Hexadecimal Dump of MS-CFB Object Layout
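To make the layout concrete, here is a minimal Python sketch that validates the header signature and converts a sector number into a file offset. The field offsets follow the published MS-CFB header layout; the function names and the returned dictionary keys are hypothetical.

```python
import struct

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def parse_cfb_header(data: bytes) -> dict:
    """Read a few fields from an MS-CFB header (offsets per the MS-CFB spec)."""
    if len(data) < 76 or data[:8] != CFB_SIGNATURE:
        raise ValueError("not an MS-CFB object")
    minor, major, byte_order, sector_shift, mini_shift = struct.unpack_from(
        "<5H", data, 24)
    fat_sectors, first_dir_sector = struct.unpack_from("<II", data, 44)
    (first_minifat_sector,) = struct.unpack_from("<I", data, 60)
    return {
        "major_version": major,
        "sector_size": 1 << sector_shift,
        "mini_sector_size": 1 << mini_shift,
        "fat_sectors": fat_sectors,
        "first_directory_sector": first_dir_sector,
        "first_minifat_sector": first_minifat_sector,
    }


def sector_offset(sector_number: int, sector_size: int) -> int:
    """Sector 0 begins right after the header, which occupies one sector."""
    return (sector_number + 1) * sector_size
```

With 512-byte sectors, a Directory sector stored as sector 1 would sit at sector_offset(1, 512), which is 0x400.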

The Directory sector is where the details about the embedded stream object are defined, a native stream in the case of this sample. The Directory sector contains four entries – Root Directory Entry, Storage #1, Stream #1, and an unused entry. Most of these entries have little value for our analysis. The stream entry does define the length of the stream data, which will be helpful when analyzing the attack vector. The Entry Name field of the storage entry also confirms that this is an OLENativeStream. But it’s in the root entry where our first mystery is solved: it contains the Class Identifier (CLSID) field. This field is used for Component Object Model (COM) activation of the embedded document’s application. The CLSID for this sample was {0002CE02-0000-0000-C000-000000000046}, which referred to C:\Program Files\Common Files\Microsoft Shared\EQUATION\EQNEDT32[.]EXE on our analysis system. Now we know how EQNEDT32.EXE is called to process the MTEF data. (OK, this might not excite you as much as it does us, but we love phishing research!!!)

Byte Offset	Field Names
0x400	Directory Entry Name
0x440	Directory Entry Name Length
0x442	Object Type
0x443	Color Flag
0x444	Left Sibling ID
0x448	Right Sibling ID
0x44C	Child ID
0x450	CLSID
0x460	State Bits
0x464	Create Time
0x46C	Modification Time
0x474	Starting Sector Location
0x478	Stream Size

Figure 8 – Root Directory Header
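The field offsets in Figure 8 map directly onto a 128-byte structure. The following Python sketch unpacks one directory entry and formats its CLSID for comparison with the Equation Editor CLSID quoted above. The default offset of 0x400 is taken from the table and will differ in other files; the function name and dictionary keys are hypothetical.

```python
import struct
import uuid

# 64 + 2 + 1 + 1 + 4*3 + 16 + 4 + 8 + 8 + 4 + 8 = 128 bytes per directory entry.
DIR_ENTRY = struct.Struct("<64sHBBIII16sIQQIQ")
EQUATION_CLSID = uuid.UUID("0002CE02-0000-0000-C000-000000000046")


def parse_directory_entry(data: bytes, offset: int = 0x400) -> dict:
    """Unpack one MS-CFB directory entry starting at the given file offset."""
    (raw_name, name_len, obj_type, color, left, right, child, clsid,
     state_bits, created, modified, start_sector, size) = DIR_ENTRY.unpack_from(
        data, offset)
    # The name length counts bytes and includes the UTF-16 null terminator.
    name = raw_name[:max(name_len - 2, 0)].decode("utf-16-le", errors="replace")
    return {
        "name": name,
        "object_type": obj_type,
        "child_id": child,
        # CLSIDs are stored in mixed-endian (GUID) byte order.
        "clsid": uuid.UUID(bytes_le=clsid),
        "start_sector": start_sector,
        "stream_size": size,
    }


def is_equation_editor(entry: dict) -> bool:
    """True when the entry's CLSID is the one reported for this sample."""
    return entry["clsid"] == EQUATION_CLSID
```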

The Native Data

Just how different is the native data in this sample from standard MTEF binary data? A quick comparison of the first several bytes, skipping the 4-byte NativeDataSize field, suggests that only the MTEF version and product version fields are required for EQNEDT32.EXE to successfully open the binary equation data. This sample exploits the same font name buffer overflow within the FONT record but uses a different typeface number and style for the FONT record. The original PoC used 0x58 for both values, whereas this sample uses 0xF1 and 0x41 respectively.

And finally, traditional RTF exploit documents seen in the past inserted their commands in clear text and directly overwrote the return address with a WinExec API call. This sample is more advanced. First, it overwrites the return address with the address of a RET instruction so that it can jump back to the beginning of the record data. Because the Font Name is null terminated and the new return address contains a null byte, the stack overflow does not include the entire record, so the shellcode next locates the complete record data on the heap. Then, it jumps into the middle of the user-controlled data, most of which is junk bytes intended to make static analysis difficult. Finally, it starts decoding and executing the second stage shellcode used to download and launch the final malicious payload.

The Shellcode

The decoder stub contains quite a bit of junk code, as seen in Figure 9. We also see the downloader URL beginning to form in the Hex dump. The decoder is summarized in Figure 10.

Figure 9 – Debugger View of Decoder Stub

    CALL 0x5
    POP EBP
    ADD EBP, 129
    LEA EDI,
    IMUL ECX,ECX,0
decoder:
    IMUL ECX,ECX,6E3A6F9B
    ADD ECX,0EFF7D5
    XOR DWORD PTR SS:,ECX
    ADD EBP,4
    CMP EBP,EDI
    JB decoder
    LEA ECX,EDI-3A5
    CALL 0x121

Figure 10 – Decoder Stub
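The loop in Figure 10 behaves like a rolling XOR whose key is advanced by a multiply-and-add step on every dword. A minimal Python sketch of that logic follows. It assumes the XOR operand, which is missing from the listing, is the dword that EBP points to, and it expects the caller to pass only the encoded region because the exact start and end offsets cannot be recovered from the listing. The function name is hypothetical.

```python
import struct

MULTIPLIER = 0x6E3A6F9B  # IMUL ECX,ECX,6E3A6F9B
INCREMENT = 0x00EFF7D5   # ADD ECX,0EFF7D5
MASK32 = 0xFFFFFFFF


def decode_stage2(encoded: bytes) -> bytes:
    """XOR each little-endian dword with a key that is updated every round.

    Assumption: the XOR target is the dword at [EBP], advanced by 4 per round.
    Trailing bytes that do not fill a whole dword are copied unchanged.
    """
    key = 0  # IMUL ECX,ECX,0 clears the key before the loop starts
    out = bytearray()
    for offset in range(0, len(encoded) - len(encoded) % 4, 4):
        key = (key * MULTIPLIER + INCREMENT) & MASK32
        (dword,) = struct.unpack_from("<I", encoded, offset)
        out += struct.pack("<I", dword ^ key)
    out += encoded[len(out):]
    return bytes(out)
```

Since XOR is its own inverse and the key sequence does not depend on the data, running the function twice on the same buffer returns the original bytes, which is a quick sanity check when testing against a real sample.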

Once the decoding is complete, we see the downloader URL, the saved file path, and the Windows API calls used by the second stage shellcode. The second stage shellcode performs the following steps to download and execute the final payload:

Determine where kernel32.dll is loaded in memory
Enumerate until LoadLibraryA is located
Enumerate until GetProcAddress is located
Use GetProcAddress to locate ExpandEnvironmentStringsW
Use LoadLibraryA to load urlmon.dll
Use GetProcAddress to locate URLDownloadToFileW in urlmon.dll
Call URLDownloadToFileW with URL and destination file path
Use GetProcAddress to locate and call WideCharToMultiByte
Use GetProcAddress to locate and call WinExec to execute the downloaded payload
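Because the steps above rely on the wide-character API URLDownloadToFileW, the download URL is likely stored as a UTF-16LE string in the decoded shellcode. The Python sketch below searches a decoded buffer for such strings. That storage format is an assumption rather than something confirmed for this sample, and the function name is hypothetical.

```python
import re

# "http://" or "https://" in UTF-16LE, followed by printable ASCII characters
# that are each padded with a null byte. Matching stops at the terminator.
WIDE_URL = re.compile(
    rb"h\x00t\x00t\x00p\x00(?:s\x00)?:\x00/\x00/\x00(?:[\x21-\x7e]\x00)+",
    re.IGNORECASE,
)


def extract_wide_urls(decoded: bytes) -> list:
    """Return every UTF-16LE HTTP(S) URL found in a decoded shellcode buffer."""
    return [match.group().decode("utf-16-le")
            for match in WIDE_URL.finditer(decoded)]
```

The same approach could be extended to the destination file path, for example by searching for a leading environment variable such as %APPDATA% in UTF-16LE.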

Conclusion

These new updates demonstrate that adversaries are aware of defenders’ security controls and are consistently improving their tools to bypass detection. The shift to an OLENativeStream makes detection more difficult because many benign documents use this OLE object. The junk data in the MTEF header and FONT record makes it harder to write a reliable static signature. Lastly, the use of encoded shellcode obfuscates the downloader Uniform Resource Locator (URL) and makes automated extraction of any Indicator of Compromise (IoC) more difficult.

We are happy to say that there are several existing detection rules in place for Cofense Triage™ customers for this specific type of threat. Additionally, two new Cofense Triage detection rules have been created and deployed to all existing Cofense Triage customers.

Existing Cofense Triage detection rules:

Office_Embedded_oleObject
PM_Office_Embedded_Object

Newly created Cofense Triage detection rules:

PM_EqnEd32_Shellcode_OLE_Doc
PM_EqnEd32_Shellcode_RTF_Doc

Included below are two Yara rules that could be used for potential threat detection.

rule Cofense_CVE_2017_11882_office_doc {
    meta:
        disclaimer = "Cofense will not be liable for any damages of any nature or kind, directly or indirectly, resulting from usage of this rule."
        date = "04/04/2018"
        version = "1.0"
    strings:
        // MS-CFB header signature (first four bytes)
        $docfile = {D0 CF 11 E0}
        // Equation Editor CLSID as stored in the root directory entry
        $eqn_clsid = {02 CE 02 00 00 00 00 00 C0 00 00 00 00 00 00 46}
        $root_entry = "Root Entry" nocase wide
        // JMP EAX
        $jmp_eax = {FF E0}
    condition:
        $docfile and $eqn_clsid and $root_entry and ($jmp_eax in (0x82C..0x838))
}

rule Cofense_CVE_2017_11882_rtf {
    meta:
        disclaimer = "Cofense will not be liable for any damages of any nature or kind, directly or indirectly, resulting from usage of this rule."
        date = "04/04/2018"
        version = "1.0"
    strings:
        // Hex-encoded MS-CFB signature and Equation Editor CLSID inside the RTF object data
        $docfile = "d0cf11e" nocase
        $eqn_clsid = "02ce020000000000c000000000000046" nocase
        $rtf_obj = "objdata" nocase
        $rtf_header = "{\\rt" nocase
        // Hex-encoded JMP EAX
        $jmp_eax = "ffe0" nocase
    condition:
        $docfile and $eqn_clsid and $rtf_obj and $jmp_eax and ($rtf_header at 0)
}

Learn more about Cofense Triage, our automated incident response and phishing defense platform.

References

https://portal.msrc.microsoft.com/en-US/security-guidance/advisory/CVE-2017-11882

https://github.com/embedi/CVE-2017-11882

https://docs.libreoffice.org/starmath/html/eqnolefilehdr_8hxx_source.html

http://rtf2latex2e.sourceforge.net/MTEF3.html
