While working on some wrapper scripts for dumping OLE VBA macros and attempting to deobfuscate them in search of downloader links, I came across an annoying, but not new, edge case – VBA macros using Excel cells to store additional code. In the past I used Philippe Lagadec’s excellent ViperMonkey. But I had recently rebuilt my development environment and had not installed LibreOffice. I decided to take this opportunity to investigate additional, and possibly more lightweight, techniques of dumping Excel cells.
Enter xlrd
The sample I was analyzing (SHA256 – bdfe2847ce26caadddd779b4690763600f67f2f6b95dd69b0f8997fcc0aab84c) was a legacy XLS file, so this analysis and the accompanying script use xlrd. Although the author of xlrd has archived the project and encourages the use of openpyxl, openpyxl does not support XLS files, whereas xlrd supports both XLS and the newer XLSX format (XLSX support was removed in xlrd 2.0, so this applies only to earlier releases).
xlrd provides a very clean and simple interface for extracting cells from an Excel document, which means that we can create a Python tool or a simple function to extract a specific range of cells and then use the extracted data to continue our VBA analysis. This sample contained the following VBA macro:
Figure 1 – VBA Macro
As you can see, a basic deobfuscation function is defined and then called with a range of cells (K110:K118). A system call is also obfuscated within B101 (the columns start at offset 1 for the Cells VBA function). Let’s see if we can use xlrd to extract the data in B101.
Figure 2 – Dumping cells with xlrd
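As a rough illustration of the lookup in Figure 2, the sketch below converts an A1-style reference into the zero-based row and column indexes that xlrd expects, then reads a single cell. The helper names (`a1_to_rowcol`, `read_cell`), the `sample.xls` path, and the choice of the first worksheet are assumptions made for illustration; this is not the code shown in the figure.

```python
import re


def a1_to_rowcol(ref):
    """Convert an A1-style reference such as 'B101' to zero-based (row, col)."""
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters are a base-26 number where A=1 ... Z=26.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def read_cell(path, ref, sheet_index=0):
    """Return the value of one cell from an XLS workbook using xlrd."""
    # xlrd is third-party; importing it here keeps a1_to_rowcol usable without it.
    import xlrd

    book = xlrd.open_workbook(path)
    sheet = book.sheet_by_index(sheet_index)  # assumption: data is on the first sheet
    row, col = a1_to_rowcol(ref)
    return sheet.cell_value(row, col)


# Hypothetical usage against a local copy of the sample:
# print(read_cell("sample.xls", "B101"))
```

Note that VBA's Cells function is one-based while xlrd is zero-based, so B101 becomes row 100, column 1.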
Next, I developed a Python tool to dump Excel cells to the terminal or to a CSV file. The tool also exposes a function that can be imported into any script to extract cells for further analysis. Let’s see it in action.
Excellent. And we can see that the deobfuscation function simply extracts every fourth character starting at offset 3. And what is hidden at K110:K118?
Figure 5 – Deobfuscate cells
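The deobfuscation routine described above can be sketched in a couple of lines of Python. This is a minimal sketch under two assumptions that the macro itself would have to confirm: that "offset 3" is zero-based, and that the cell values are concatenated before decoding. The function names and the sample strings are made up for illustration.

```python
def every_nth(text, start=3, step=4):
    """Keep every `step`-th character, beginning at zero-based index `start`."""
    return text[start::step]


def decode_cells(values, start=3, step=4):
    """Join cell values (assumed to be one continuous string) and decode them."""
    joined = "".join(str(value) for value in values)
    return every_nth(joined, start, step)


# Synthetic example: three filler characters precede each real one.
print(every_nth("abcWdefMghiI"))                 # -> WMI
print(decode_cells(["abcW", "defM", "ghiI"]))    # -> WMI
```

If the macro counts from one, as VBA's Mid function does, the equivalent call would be `every_nth(text, start=2)`.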
Living off the land
Well, look what we have here: WMIC being used to execute a PowerShell script. And seeing as my development environment is *nix, this concludes our analysis… JK. Microsoft was so impressed with PowerShell that they have installers for many OS flavors, including macOS and RHEL. I may skip over some of the manual processes, like the trial and error of getting the obfuscated WMIC command to work in pwsh, but I definitely encourage the reader to download this sample and attempt these deobfuscation steps for themselves.
After scanning the PowerShell script, we can determine a few important features: two variables are used to handle strings and lists (GAB represents a double quote and PJ represents a comma), string formatting is heavily used for obfuscation, and additional data is potentially compressed and base64 encoded.
Figure 6 – Obfuscated code
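To illustrate the string-formatting obfuscation mentioned above: PowerShell's -f operator fills numbered placeholders with the arguments that follow it, so a command name can be stored as shuffled fragments. For plain `{n}` placeholders, Python's `str.format` behaves the same way, which makes it handy for checking a fragment by hand. The fragments below are invented for illustration and are not taken from the sample.

```python
def resolve_format(template, *parts):
    """Mimic PowerShell's -f operator for plain {n} placeholders only."""
    # Placeholders with alignment or format specifiers are not handled here.
    return template.format(*parts)


# PowerShell equivalent: "{1}{0}{2}" -f 'ke-','Invo','Expression'
print(resolve_format("{1}{0}{2}", "ke-", "Invo", "Expression"))  # -> Invoke-Expression
```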
We can replace those variables with their mapped characters to begin cleaning up the code. And we don’t need to escape the quotes so we can remove all backslashes. If we assume that the compressed data is additional PowerShell code then it will be executed by Invoke-Expression (iex). But it was not immediately apparent whether this was the case, until I stumbled across a sneaky obfuscation technique.
( `${pSh`omE}[4]+`${p`shoMe}[34]+\'X\')
${PSHOME} is a PowerShell automatic variable that holds the PowerShell home directory. Offset 4 is I and offset 34 is E for both 32-bit and 64-bit PowerShell installs (c:\Windows\SysWOW64\WindowsPowerShell\v1.0\ and c:\Windows\System32\WindowsPowerShell\v1.0\ respectively), so the expression builds the string IEX, which PowerShell resolves case-insensitively to the iex alias. If we replace this code block with an echo then we can let pwsh deobfuscate this data for us.
Are you surprised? I’m not. The compressed data is another layer of obfuscated code. At least it’s not that different than the first layer. This layer uses a function to decode, decompress, and execute the obfuscated code. The Invoke-Expression is mapped to Ox using Set-Alias (sal).
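The indexing trick is easy to verify outside of PowerShell. The short check below uses the two install paths quoted above (the trailing backslash is dropped because a Python raw string cannot end in one; the offsets are unaffected). The function name is made up for illustration.

```python
PSHOME_CANDIDATES = [
    r"c:\Windows\SysWOW64\WindowsPowerShell\v1.0",   # 32-bit PowerShell
    r"c:\Windows\System32\WindowsPowerShell\v1.0",   # 64-bit PowerShell
]


def hidden_command(pshome):
    """Rebuild the string produced by ${PSHOME}[4] + ${PSHOME}[34] + 'X'."""
    return pshome[4] + pshome[34] + "X"


for path in PSHOME_CANDIDATES:
    # Both paths yield "ieX"; PowerShell command names are case-insensitive.
    print(path, "->", hidden_command(path))
```

The trick works for both paths because SysWOW64 and System32 have the same length, so every later character sits at the same offset.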
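For readers who would rather not run each layer in pwsh, the decode-and-decompress step can be approximated in Python. This is only a sketch: it assumes the payload is base64-encoded raw Deflate data (the format produced by .NET's DeflateStream) containing ASCII text, which has to be confirmed against the actual script. A GZip-wrapped payload would need different zlib settings. The function name is hypothetical.

```python
import base64
import zlib


def decode_layer(blob, encoding="ascii"):
    """Base64-decode `blob`, inflate it as raw Deflate, and return the text."""
    raw = base64.b64decode(blob)
    # Negative wbits tells zlib to expect a raw Deflate stream with no header.
    return zlib.decompress(raw, -zlib.MAX_WBITS).decode(encoding)


# Round-trip demonstration with a harmless, made-up payload.
compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
packed = compressor.compress(b"Write-Output 'next layer'") + compressor.flush()
print(decode_layer(base64.b64encode(packed)))
```

Unlike the original function, this sketch never executes the result, which makes it a safer way to peel a layer.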
Figure 9 – And then…
This is becoming annoying. Oh wait, that was their point. At least this layer calls the same function from the previous layer. A simple copy pasta back into pwsh will give us the final layer.
Figure 10 – The bottom of the rabbit hole
I’ll spare you the pain of manually cleaning up that mess. Most VBA macros use PowerShell to hide their download-and-execute code, and the link for that malicious payload is all that my artifact extractor scripts need. A quick scan of this code indicates that it is downloading a payload from a remote resource. The link is mostly constructed from obfuscated and static strings, with the exception of pulling the currently logged-in user. Therefore, we can again use pwsh to find that malicious link.
Figure 11 – Malicious link
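Once pwsh has produced the link in plain text, pulling it out of the surrounding output is a simple pattern match. The sketch below shows one way an extractor script might do that. The pattern and function name are assumptions for illustration, and the pattern is deliberately loose rather than a full URL grammar.

```python
import re

# Match http/https links up to whitespace, quotes, angle brackets, or ")".
URL_PATTERN = re.compile(r"https?://[^\s'\"<>)]+", re.IGNORECASE)


def extract_urls(text):
    """Return the unique URLs found in `text`, in order of first appearance."""
    seen = []
    for url in URL_PATTERN.findall(text):
        if url not in seen:
            seen.append(url)
    return seen


# Made-up deobfuscated output; example.com stands in for the real host.
sample = "DownloadFile('http://example.com/payload.bin', $path)"
print(extract_urls(sample))  # -> ['http://example.com/payload.bin']
```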
The email that started it all:
Figure 12 – Original email
And the script mentioned in this post is available on the public CofenseLabs tools repository – https://github.com/CofenseLabs/tools