On March 22nd, Cofense came across an unusual malware sample with a very low detection rate. At the time of analysis, only 5 of 61 AV engines detected the file. The detection rate did not reach 30% until at least a week later, as per VirusTotal: 38015eb1699b7596e8c95fed7f0bc32d1492b371bd4d7953019f69dcf40ff1fd.
What caught our attention wasn’t the phishing email (Figure 1) or a new exploit, but an old familiar attachment type making a comeback. RTF files are a dime a dozen these days, but this one brings back a familiar face, .xlsx, and demonstrates a unique way to exploit CVE-2017-11882. In this blog post, we review our analysis and present key findings for this unusual malicious downloader.
Figure 1 – The Phishing Email
The RTF Downloader
Before analyzing this new sample, let’s review the structure and attack vector of a traditional RTF exploit document. The Proof of Concept (PoC) and most of the samples seen in the wild contain an embedded object with an Equation Native stream.
Figure 2 – Layout and Hexadecimal Dump of the Embedded Equation Object.
The embedded Equation object header mostly conforms to the standard with the exception that some of the reserved fields are populated (these exceptions are highlighted in Figure 3).
Field | Bytes | Description |
nCBHdr | 2 | Header Length (0x001C) |
nVersion | 4 | Version (0x0000 0x0002) |
nCf | 2 | Clipboard Format |
nCBObject | 4 | MTEF data length |
nReserved1 | 4 | Unused |
nReserved2 | 4 | Unused |
nReserved3 | 4 | Unused |
nReserved4 | 4 | Unused |
Figure 3 – The Equation Object Header and Default Values
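As a minimal sketch of how the header in Figure 3 could be unpacked, the Python below reads the 28 bytes (0x1C) as little-endian fields and reports which reserved fields are populated. The names EquationHeader, parse_equation_header, and populated_reserved_fields are illustrative and not part of any tool used in the analysis.

```python
import struct
from typing import List, NamedTuple

# Field widths follow Figure 3: 2 + 4 + 2 + 4 + (4 * 4) = 28 bytes, little-endian.
EQN_HDR_FORMAT = "<HIHIIIII"
EQN_HDR_SIZE = struct.calcsize(EQN_HDR_FORMAT)


class EquationHeader(NamedTuple):
    cb_hdr: int      # nCBHdr: header length
    version: int     # nVersion
    cf: int          # nCf: clipboard format
    cb_object: int   # nCBObject: MTEF data length
    reserved1: int
    reserved2: int
    reserved3: int
    reserved4: int


def parse_equation_header(data: bytes) -> EquationHeader:
    """Unpack the Equation object header from the start of an Equation Native stream."""
    if len(data) < EQN_HDR_SIZE:
        raise ValueError("stream is shorter than the Equation object header")
    return EquationHeader(*struct.unpack_from(EQN_HDR_FORMAT, data, 0))


def populated_reserved_fields(hdr: EquationHeader) -> List[str]:
    """Return the names of reserved fields that are not zero."""
    names = ("reserved1", "reserved2", "reserved3", "reserved4")
    return [name for name in names if getattr(hdr, name) != 0]
```

A header whose reserved fields are non-zero is not malicious by itself, but it is the kind of deviation from the default values that the figure highlights.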
Following the Equation object header, we see a normal MathType Equation Format (MTEF) header followed by a SIZE (0x0A) and LINE (0x01) record and finally the contents of the LINE record. This again follows the standard overall structure of the Equation 3.X binary format.
Field | Bytes | Value (Equation Editor 3.x) |
MTEF Version | 1 | 0x03 |
Generating Platform | 1 | 0x01 |
Generating Product | 1 | 0x01 |
Product Version | 1 | 0x03 |
Product Subversion | 1 | Dependent on subversion |
Figure 4 – MTEF Binary Format Fields
The contents of the LINE record are where the exploit happens. The first record defined after the LINE record is the vulnerable FONT record, which utilizes a function with a buffer overflow vulnerability. Most of the exploit documents seen in the wild overflow the font name with a simple command to download and execute their malicious payload and then overwrite the return address with the address of a WinExec API call (0x00430C12), using WinExec to execute their command. Figure 5 is a hexadecimal view of the PoC, which simply opens calc.exe, and demonstrates that the commands are clearly viewable when examining the document.
Figure 5 – Hexadecimal of the RTF PoC FONT Record
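To make the FONT record layout concrete, here is a hedged Python sketch that splits a record into its typeface number, style, and null-terminated font name, as described in the MTEF 3 reference. It assumes the record type sits in the low four bits of the tag byte. The max_name_len threshold and the function names are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple

FONT_TAG = 0x08  # MTEF FONT record type


class FontRecord(NamedTuple):
    typeface: int
    style: int
    name: bytes       # bytes up to, but not including, the first null
    terminated: bool  # False if no null terminator was found


def parse_font_record(buf: bytes, offset: int = 0) -> FontRecord:
    """Parse a FONT record: tag, typeface, style, null-terminated name."""
    if len(buf) - offset < 3:
        raise ValueError("truncated FONT record")
    if buf[offset] & 0x0F != FONT_TAG:
        raise ValueError("not a FONT record")
    typeface, style = buf[offset + 1], buf[offset + 2]
    start = offset + 3
    end = buf.find(b"\x00", start)
    terminated = end != -1
    if not terminated:
        end = len(buf)
    return FontRecord(typeface, style, buf[start:end], terminated)


def is_suspicious(record: FontRecord, max_name_len: int = 32) -> bool:
    """Flag font names longer than an (assumed) plausible maximum."""
    return len(record.name) > max_name_len
```

For example, a record built from a command string, padding, and the little-endian WinExec address 0x00430C12 parses as one long font name, because the high null byte of the address doubles as the string terminator. The padding length below is illustrative, not taken from the PoC:

```python
poc_like = bytes([0x08, 0x58, 0x58]) + b"cmd.exe /c calc.exe" + b"A" * 25 + b"\x12\x0c\x43\x00"
record = parse_font_record(poc_like)
print(len(record.name), is_suspicious(record))
```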
The New XLSX Downloader
This is where things change. This XLSX sample differs from the RTF example above in a number of ways. The first difference we noticed during analysis was the embedded object: it contains an OLENativeStream instead of the commonly used Equation Native stream.
Figure 6 – Layout and Hexadecimal Dump of the Embedded OLENativeStream Object
The OLENativeStream is an OLE2.0 stream object contained within an OLE Compound File Storage (MS-CFB) object and contains only one header field, a 4-byte NativeDataSize field. At first glance, the data after this field does not follow the standard MTEF format. It is also not obvious how Equation Editor (EQNEDT32.EXE) is being called. To solve this mystery, we need to better understand the MS-CFB format.
Since this is not an analysis of MS-CFB objects, we will only provide highlights and supply details for the relevant section. MS-CFB objects contain a header and at least 4 sectors – File Allocation Table (FAT) sector, Directory sector, MiniFAT sector, and the Mini Stream sector. The object header is easily identifiable by its unique header signature – 0xD0CF11E0 0xA1B11AE1. The first sector, following the header, is the FAT sector and mirrors any standard file system table by defining multiple sector chains with linked lists of sectors in each chain. The next sector is the Directory sector and it provides details about the stream and storage objects contained within the MS-CFB object. It is identifiable by the root entry name of “Root Entry” in UTF-16 format. The MiniFAT sector is simply an allocation table for the data in the mini stream, and the Mini Stream sector contains the user-defined data.
Figure 7 – Hexadecimal Dump of MS-CFB Object Layout
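The header check and the location of the Directory sector can be sketched in a few lines of Python. The field offsets used here (sector shift at 0x1E, first directory sector at 0x30) come from the MS-CFB specification. The function name and the returned dictionary keys are illustrative.

```python
import struct

CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")
CFB_HEADER_SIZE = 512


def parse_cfb_header(data: bytes) -> dict:
    """Validate the MS-CFB signature and locate the first Directory sector."""
    if len(data) < CFB_HEADER_SIZE:
        raise ValueError("file is shorter than an MS-CFB header")
    if data[:8] != CFB_SIGNATURE:
        raise ValueError("MS-CFB signature not found")
    sector_shift, mini_sector_shift = struct.unpack_from("<HH", data, 0x1E)
    (first_dir_sector,) = struct.unpack_from("<I", data, 0x30)
    sector_size = 1 << sector_shift
    return {
        "sector_size": sector_size,
        "mini_sector_size": 1 << mini_sector_shift,
        "first_dir_sector": first_dir_sector,
        # Sector n starts at (n + 1) * sector_size because the header
        # occupies the space of the first sector.
        "directory_offset": (first_dir_sector + 1) * sector_size,
    }
```

With 512-byte sectors and the Directory in sector 1, the computed offset is 0x400, which matches the root entry offsets listed in Figure 8.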
The Directory sector defines the details of the embedded stream object, which is a native stream in this sample. It contains four entries – Root Directory Entry, Storage #1, Stream #1, and an unused entry. Most of these entries have little value for our analysis. The stream entry does define the length of the stream data, which will be helpful when analyzing the attack vector, and the Entry Name field of the storage entry confirms that this is an OLENativeStream. But it’s the root entry that solves our first mystery: it contains the Class Identifier (CLSID) field, which is used for Component Object Model (COM) activation of the embedded document’s application. The CLSID for this sample was {0002CE02-0000-0000-C000-000000000046}, which referred to C:\Program Files\Common Files\Microsoft Shared\EQUATION\EQNEDT32[.]EXE on our analysis system. Now we know how EQNEDT32.EXE is called to process the MTEF data. (OK, this might not excite you as much as it does us, but we love phishing research!!!)
Byte Offset | Field Names |
0x400 | Directory Entry Name |
0x440 | Directory Entry Name Length |
0x442 | Object Type |
0x443 | Color Flag |
0x444 | Left Sibling ID |
0x448 | Right Sibling ID |
0x44C | Child ID |
0x450 | CLSID |
0x460 | State Bits |
0x464 | Create Time |
0x46C | Modification Time |
0x474 | Starting Sector Location |
0x478 | Stream Size |
Figure 8 – Root Directory Header
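A 128-byte directory entry with the layout in Figure 8 can be unpacked as follows. This is a sketch, not the tooling used for the analysis; the names DirectoryEntry, parse_directory_entry, and EQUATION_CLSID are introduced here for illustration. The CLSID is stored in mixed-endian GUID form, which uuid.UUID(bytes_le=...) decodes.

```python
import struct
import uuid
from typing import NamedTuple

# Offsets relative to the entry start match Figure 8 minus the 0x400 base.
DIR_ENTRY_FORMAT = "<64sHBBIII16sIQQIQ"
DIR_ENTRY_SIZE = struct.calcsize(DIR_ENTRY_FORMAT)  # 128 bytes

EQUATION_CLSID = uuid.UUID("0002CE02-0000-0000-C000-000000000046")


class DirectoryEntry(NamedTuple):
    name: str
    object_type: int
    child_id: int
    clsid: uuid.UUID
    start_sector: int
    stream_size: int


def parse_directory_entry(data: bytes, offset: int = 0) -> DirectoryEntry:
    """Unpack one directory entry starting at the given offset."""
    (raw_name, name_len, object_type, _color, _left, _right, child_id,
     raw_clsid, _state, _ctime, _mtime, start_sector, stream_size) = struct.unpack_from(
        DIR_ENTRY_FORMAT, data, offset)
    # The name length is in bytes and includes the UTF-16 null terminator.
    name = raw_name[:max(name_len - 2, 0)].decode("utf-16-le", errors="replace")
    return DirectoryEntry(name, object_type, child_id,
                          uuid.UUID(bytes_le=raw_clsid), start_sector, stream_size)


def activates_equation_editor(entry: DirectoryEntry) -> bool:
    """True when the entry carries the Equation Editor CLSID seen in this sample."""
    return entry.clsid == EQUATION_CLSID
```

Called with offset 0x400 on the sample layout described above, this would return the root entry, and activates_equation_editor would show whether COM activation leads to EQNEDT32.EXE.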
The Native Data
Just how different is the native data in this sample from standard MTEF binary data? A quick comparison of the first several bytes, skipping the 4-byte NativeDataSize field, suggests that only the MTEF version and product version fields are required for EQNEDT32.EXE to successfully open the binary equation data. This sample exploits the same font name buffer overflow within the FONT record but uses a different typeface number and style for the FONT record. The original PoC used 0x58 for both values, whereas this sample uses 0xF1 and 0x41 respectively.
And finally, traditional RTF exploit documents seen in the past inserted their commands in clear text and directly overwrote the return address with a WinExec API call. However, this sample is more advanced. First, it overwrites the return address with the address of a RET instruction so that execution can jump back to the beginning of the record data. Because the font name is null-terminated and the new return address contains a null byte, the stack overflow cannot carry the entire record, so the shellcode next locates the entire record data on the heap. Then, it jumps into the middle of the user-controlled data, most of which is junk bytes intended to make static analysis difficult. Finally, it starts decoding and executing the second stage shellcode used to download and launch the final malicious payload.
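The byte comparison described above could be scripted roughly as follows. The sketch strips the 4-byte NativeDataSize field and compares the next four bytes against the Equation Editor 3.x defaults from Figure 4. The function names are hypothetical, and the bytes in the usage example are made up for illustration rather than copied from the sample.

```python
import struct
from typing import Dict, Tuple

# Default Equation Editor 3.x values for the first four MTEF header bytes.
MTEF_DEFAULTS = (
    ("MTEF Version", 0x03),
    ("Generating Platform", 0x01),
    ("Generating Product", 0x01),
    ("Product Version", 0x03),
)


def native_data(stream: bytes) -> bytes:
    """Return the data that follows the 4-byte NativeDataSize field."""
    if len(stream) < 4:
        raise ValueError("stream is shorter than the NativeDataSize field")
    (size,) = struct.unpack_from("<I", stream, 0)
    return stream[4:4 + size]


def compare_mtef_header(data: bytes) -> Dict[str, Tuple[int, int, bool]]:
    """Map each header field to (actual value, default value, values match)."""
    if len(data) < len(MTEF_DEFAULTS):
        raise ValueError("native data is shorter than the MTEF header")
    return {
        name: (data[i], expected, data[i] == expected)
        for i, (name, expected) in enumerate(MTEF_DEFAULTS)
    }
```

```python
payload = bytes([0x03, 0x7F, 0x2A, 0x03, 0x00])  # illustrative bytes only
stream = struct.pack("<I", len(payload)) + payload
for field, (actual, default, same) in compare_mtef_header(native_data(stream)).items():
    print(f"{field}: 0x{actual:02X} (default 0x{default:02X}) match={same}")
```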
The Shellcode
The decoder stub contains quite a bit of junk code, as seen in Figure 9. We also see the downloader URL beginning to form in the Hex dump. The decoder is summarized in Figure 10.
Figure 9 – Debugger View of Decoder Stub
CALL 0x5
POP EBP
ADD EBP, 129
LEA EDI,
IMUL ECX,ECX,0
decoder:
IMUL ECX,ECX,6E3A6F9B
ADD ECX,0EFF7D5
XOR DWORD PTR SS:,ECX
ADD EBP,4
CMP EBP,EDI
JB decoder
LEA ECX,EDI-3A5
CALL 0x121
Figure 10 – Decoder Stub
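The loop in Figure 10 amounts to XORing each 32-bit word with a key that is updated on every iteration as key = key * 0x6E3A6F9B + 0x0EFF7D5 (mod 2^32), starting from zero. The Python sketch below reproduces that key schedule. It assumes the XOR target is the dword that EBP points to, since the operand is missing from the listing, and it assumes the encoded region is already isolated; the start offset and length are not modeled here.

```python
import struct
from typing import Iterator

MULTIPLIER = 0x6E3A6F9B  # IMUL ECX,ECX,6E3A6F9B
INCREMENT = 0x0EFF7D5    # ADD ECX,0EFF7D5
MASK32 = 0xFFFFFFFF


def keystream(count: int) -> Iterator[int]:
    """Yield the 32-bit XOR keys in the order the decoder loop produces them."""
    key = 0  # IMUL ECX,ECX,0 clears ECX before the loop
    for _ in range(count):
        key = (key * MULTIPLIER + INCREMENT) & MASK32
        yield key


def xor_decode(blob: bytes) -> bytes:
    """XOR each little-endian dword with the next key; trailing bytes are left as-is."""
    n = len(blob) // 4
    words = struct.unpack_from(f"<{n}I", blob)
    decoded = [word ^ key for word, key in zip(words, keystream(n))]
    return struct.pack(f"<{n}I", *decoded) + blob[n * 4:]
```

Because XOR is its own inverse, running xor_decode twice over the same bytes returns the original input, which is a convenient sanity check before applying it to an extracted blob.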
Once the decoding is complete, we see the downloader URL, the saved file path, and the Windows API calls used by the second stage shellcode. The second stage shellcode performs the following steps to download and execute the final payload:
- Determine where kernel32.dll is loaded in memory
- Enumerate until LoadLibraryA is located
- Enumerate until GetProcAddress is located
- Use GetProcAddress to locate ExpandEnvironmentStringsW
- Use LoadLibraryA to load urlmon.dll
- Use GetProcAddress to locate URLDownloadToFileW in urlmon.dll
- Call URLDownloadToFileW with URL and destination file path
- Use GetProcAddress to locate and call WideCharToMultiByte
- Use GetProcAddress to locate and call WinExec to execute the downloaded payload
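Since URLDownloadToFileW takes wide-character arguments, the URL in the decoded shellcode can be expected to appear as a UTF-16LE string. The following is a minimal sketch for pulling such strings out of a decoded buffer. The regular expression and the function name are assumptions made for illustration, and the URL in the usage example is a placeholder, not an indicator from this sample.

```python
import re
from typing import List

# "http" or "https", then "://", then printable ASCII, all as UTF-16LE code units.
WIDE_URL = re.compile(
    rb"h\x00t\x00t\x00p\x00(?:s\x00)?:\x00/\x00/\x00(?:[\x21-\x7e]\x00)+"
)


def extract_wide_urls(blob: bytes) -> List[str]:
    """Return every UTF-16LE encoded http(s) URL found in the buffer."""
    return [match.group().decode("utf-16-le") for match in WIDE_URL.finditer(blob)]
```

```python
decoded = b"\x90\x90" + "http://example.invalid/payload.exe".encode("utf-16-le") + b"\x00\x00"
print(extract_wide_urls(decoded))
```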
Conclusion
These new updates demonstrate that adversaries are aware of defenders’ security controls and are consistently improving their tools to bypass detection. The shift to an OLENativeStream makes detection more difficult, as many benign documents use this OLE object. The junk data in the MTEF header and FONT record makes static signatures harder to write. Lastly, the use of encoded shellcode obfuscates the downloader Uniform Resource Locator (URL) and makes automated extraction of any Indicator of Compromise (IoC) more difficult.
We are happy to say that there are several existing detection rules in place for Cofense Triage™ customers for this specific type of threat. Additionally, two new Cofense Triage detection rules have been created and deployed to all existing Cofense Triage customers.
Existing Cofense Triage detection rules:
- Office_Embedded_oleObject
- PM_Office_Embedded_Object
Newly created Cofense Triage detection rules:
- PM_EqnEd32_Shellcode_OLE_Doc
- PM_EqnEd32_Shellcode_RTF_Doc
Included below are two YARA rules that could be used for potential threat detection.
rule Cofense_CVE_2017_11882_office_doc {
    meta:
        copyright = "Copyright © Cofense Inc. 2018"
        disclaimer = "Cofense will not be liable for any damages of any nature or kind, directly or indirectly, resulting from usage of this rule."
        date = "04/04/2018"
        version = "1.0"
    strings:
        $docfile = {D0 CF 11 E0}
        $eqn_clsid = {02 CE 02 00 00 00 00 00 C0 00 00 00 00 00 00 46}
        $root_entry = "Root Entry" nocase wide
        $jmp_eax = {FF E0}
    condition:
        $docfile and $eqn_clsid and $root_entry and ($jmp_eax in (0x82C..0x838))
}
rule Cofense_CVE_2017_11882_rtf {
    meta:
        copyright = "Copyright © Cofense Inc. 2018"
        disclaimer = "Cofense will not be liable for any damages of any nature or kind, directly or indirectly, resulting from usage of this rule."
        date = "04/04/2018"
        version = "1.0"
    strings:
        $docfile = "d0cf11e" nocase
        $eqn_clsid = "02ce020000000000c000000000000046" nocase
        $rtf_obj = "objdata" nocase
        $rtf_header = "{\\rt" nocase
        $jmp_eax = "ffe0" nocase
    condition:
        $docfile and $eqn_clsid and $rtf_obj and $jmp_eax and ($rtf_header at 0)
}
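For readers without a YARA engine at hand, the condition of the first rule (the office document rule) can be approximated in plain Python. This is only a rough equivalent for experimentation, not a replacement for YARA; the function name is hypothetical, and the nocase match is emulated by lowercasing the buffer.

```python
CFB_MAGIC = bytes.fromhex("D0CF11E0")
EQN_CLSID = bytes.fromhex("02CE020000000000C000000000000046")
ROOT_ENTRY_WIDE = "root entry".encode("utf-16-le")  # lowercased for the nocase check
JMP_EAX = b"\xFF\xE0"
JMP_OFFSETS = range(0x82C, 0x838 + 1)  # the rule's range is inclusive


def matches_office_doc_rule(data: bytes) -> bool:
    """Approximate the four string checks of the office document rule."""
    jmp_in_range = any(data.startswith(JMP_EAX, offset) for offset in JMP_OFFSETS)
    return (
        CFB_MAGIC in data
        and EQN_CLSID in data
        and ROOT_ENTRY_WIDE in data.lower()
        and jmp_in_range
    )
```

The offset window for the JMP EAX bytes is what ties the rule to this sample's layout, so a sample that shifts its native data by even a few bytes would no longer match.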
Learn more about Cofense Triage, our automated incident response and phishing defense platform.
References
https://portal.msrc.microsoft.com/en-US/security-guidance/advisory/CVE-2017-11882
https://github.com/embedi/CVE-2017-11882
https://docs.libreoffice.org/starmath/html/eqnolefilehdr_8hxx_source.html
http://rtf2latex2e.sourceforge.net/MTEF3.html