While working on some wrapper scripts for dumping OLE VBA macros and attempting to deobfuscate them in search of downloader links, I came across an annoying, but not new, edge case – VBA macros using Excel cells to store additional code. In the past I used Philippe Lagadec’s excellent ViperMonkey. But I had recently rebuilt my development environment and had not installed LibreOffice. I decided to take this opportunity to investigate additional, and possibly more lightweight, techniques of dumping Excel cells.

Enter xlrd

The sample I was analyzing (SHA256 – bdfe2847ce26caadddd779b4690763600f67f2f6b95dd69b0f8997fcc0aab84c) was a legacy XLS file, so this analysis and the accompanying script use xlrd. Although the author of xlrd has archived the project and encourages the use of openpyxl, openpyxl does not support XLS files. xlrd releases before 2.0 read both XLS and the newer XLSX format; from 2.0 onward xlrd reads XLS only, which is still all we need here.

xlrd provides a very clean and simple interface for extracting cells from an Excel document, which means that we can create a Python tool or a simple function to extract a specific range of cells and then use the extracted data to continue our VBA analysis. This sample contained the following VBA macro:

Figure 1 – VBA Macro

As you can see, a basic deobfuscation function is defined and then called with a range of cells (K110:K118). A system call is also obfuscated within B101 (the columns start at offset 1 for the Cells VBA function). Let’s see if we can use xlrd to extract the data in B101.

Figure 2 – Dumping cells with xlrd
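
For readers following along without the figure, a single-cell lookup like the one described above might be sketched as follows. The helper names `a1_to_rowcol` and `read_cell` are hypothetical and are not taken from the original tool; the only xlrd calls used are `open_workbook`, `sheet_by_index`, and `cell_value`, and reading the first sheet is an assumption.

```python
import re


def a1_to_rowcol(ref):
    """Convert an A1-style reference such as 'B101' to zero-based (row, col).

    xlrd indexes rows and columns from 0, while the VBA Cells function
    counts from 1, so B101 is Cells(101, 2) in VBA and (100, 1) here.
    """
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", ref.strip())
    if not match:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        # Column letters are a base-26 number without a zero digit.
        col = col * 26 + (ord(ch) - ord("A") + 1)
    row = int(digits)
    if row < 1:
        raise ValueError(f"row numbers start at 1: {ref!r}")
    return row - 1, col - 1


def read_cell(path, ref, sheet_index=0):
    """Return the value of one cell from an XLS workbook (hypothetical helper)."""
    import xlrd  # third-party; imported lazily so the converter works without it

    book = xlrd.open_workbook(path)
    sheet = book.sheet_by_index(sheet_index)  # assumption: data is on the first sheet
    row, col = a1_to_rowcol(ref)
    return sheet.cell_value(row, col)
```

With the sample saved locally, `read_cell("sample.xls", "B101")` would return the obfuscated string stored in that cell; the file name here is only a placeholder.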

Next, I developed a Python tool to dump Excel cells to the terminal or to a CSV file. The tool also exposes a function that any script can import to extract cells for further analysis. Let’s see it in action.

Figure 3 –

Figure 4 – get_cells()
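
The tool's own source is not reproduced here, so the following is only a sketch of how a range dump along these lines could work, not the actual get_cells() implementation. `range_values`, `rows_to_csv`, and `FakeSheet` are hypothetical names; `FakeSheet` stands in for an xlrd sheet (anything with a `cell_value(rowx, colx)` method) so the example runs without a workbook.

```python
import csv
import io
import re


def _a1(ref):
    """Zero-based (row, col) for an A1-style reference such as 'K110'."""
    match = re.fullmatch(r"([A-Za-z]+)([0-9]+)", ref.strip())
    if not match:
        raise ValueError(f"bad cell reference: {ref!r}")
    letters, digits = match.groups()
    col = 0
    for ch in letters.upper():
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return int(digits) - 1, col - 1


def range_values(sheet, cell_range):
    """Return the cells of a range such as 'K110:K118' as a list of rows."""
    start, _, end = cell_range.partition(":")
    r1, c1 = _a1(start)
    r2, c2 = _a1(end or start)  # a single cell is a 1x1 range
    rows = []
    for r in range(min(r1, r2), max(r1, r2) + 1):
        # A real xlrd sheet raises IndexError for cells outside its used area.
        rows.append([sheet.cell_value(r, c)
                     for c in range(min(c1, c2), max(c1, c2) + 1)])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text using the csv module's default dialect."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    return buf.getvalue()


class FakeSheet:
    """Stand-in for an xlrd sheet: every cell just reports its coordinates."""

    def cell_value(self, rowx, colx):
        return f"r{rowx}c{colx}"


if __name__ == "__main__":
    print(rows_to_csv(range_values(FakeSheet(), "K110:K118")), end="")
```

Passing a sheet obtained from `xlrd.open_workbook(path).sheet_by_index(0)` instead of `FakeSheet()` would dump the real cells.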

Excellent. We can also see that the deobfuscation function simply extracts every fourth character starting at offset 3. So what is hidden at K110:K118?

Figure 5 – Deobfuscate cells
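
The "every fourth character starting at offset 3" routine described above maps neatly onto a Python slice. This is a sketch rather than a port of the macro: `every_nth` is a hypothetical name, and it assumes the offset is zero-based. VBA's Mid function counts from 1, so if the macro's offset is one-based the equivalent call would use `start=2`.

```python
def every_nth(text, start=3, step=4):
    """Keep every `step`-th character of `text`, beginning at index `start`.

    Assumption: `start` is a zero-based index. Adjust it if the macro's
    offset turns out to be one-based.
    """
    return text[start::step]


if __name__ == "__main__":
    # Synthetic input: three filler characters before each real one.
    print(every_nth("xxxHxxxixxx!"))
```

Whether the macro joins the cell values before decoding or decodes each cell separately depends on the macro itself, so that step is left to the caller.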

Living off the land

Well, look what we have here: WMIC being used to execute a PowerShell script. And seeing as my development environment is *nix, this concludes our analysis… JK. Microsoft was so impressed with PowerShell that they have installers for many OS flavors, including macOS and RHEL. I may skip over some of the manual processes, like the trial and error of getting the obfuscated WMIC command to work in pwsh, but I definitely encourage the reader to download this sample and attempt these deobfuscation steps for themselves.

After scanning the PowerShell script, we can determine a few important features: two variables (GAB represents a double quote and PJ represents a comma) are used to handle strings and lists, string formatting is heavily used for obfuscation, and additional data is potentially compressed and base64 encoded.

Figure 6 – Obfuscated code
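
The two placeholder variables and the format-string trick noted above can be undone mechanically before any manual cleanup. The sketch below assumes the variables appear as `$GAB` / `${GAB}` and `$PJ` / `${PJ}` (the exact spelling in the sample may differ) and handles only plain `{0}`-style indexes of the PowerShell `-f` operator, not alignment or format specifiers. Both function names are hypothetical.

```python
import re

# Mapping taken from the analysis: GAB is a double quote, PJ is a comma.
PLACEHOLDERS = {"GAB": '"', "PJ": ","}


def substitute_placeholders(script, mapping=PLACEHOLDERS):
    """Replace $NAME and ${NAME} (case-insensitive) with the mapped character."""
    for name, char in mapping.items():
        n = re.escape(name)
        # Match ${NAME}, or $NAME when it is not followed by another name character.
        pattern = re.compile(
            r"\$(?:\{" + n + r"\}|" + n + r"(?![A-Za-z0-9_]))",
            re.IGNORECASE,
        )
        # A function replacement avoids backslash handling in the replacement text.
        script = pattern.sub(lambda _m, c=char: c, script)
    return script


def ps_format(template, *args):
    """Resolve a simple PowerShell -f template such as '{2}{0}{1}'."""
    return re.sub(r"\{(\d+)\}", lambda m: args[int(m.group(1))], template)


if __name__ == "__main__":
    print(substitute_placeholders("a${GAB}b$pj c"))
    print(ps_format("{2}{0}{1}", "w-Ob", "ject", "Ne"))
```

Running the substitution first keeps the later steps readable, since the quotes and commas it restores are what delimit the format arguments.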

We can replace those variables with their mapped characters to begin cleaning up the code. Since we don’t need to escape the quotes, we can also remove all backslashes. If the compressed data is additional PowerShell code, then it will be executed by Invoke-Expression (iex). It was not immediately apparent whether this was the case, until I stumbled across a sneaky obfuscation technique.

					( `${pSh`omE}[4]+`${p`shoMe}[34]+\'X\')

${PSHOME} is a PowerShell automatic variable that holds the PowerShell home (installation) directory. Offset 4 is I and offset 34 is E for both 32-bit and 64-bit PowerShell installs (c:\Windows\SysWOW64\WindowsPowerShell\v1.0\ and c:\Windows\System32\WindowsPowerShell\v1.0\ respectively), so the expression evaluates to IEX. If we replace this code block with an echo, then we can let pwsh deobfuscate this data for us.
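
The index arithmetic is easy to confirm outside of PowerShell. The snippet below simply indexes the two Windows paths quoted above (the trailing backslash is dropped, which does not affect offsets 4 and 34); `pshome_trick` is a hypothetical name for the expression the sample builds.

```python
# The two install paths quoted in the text, without the trailing backslash.
PSHOME_CANDIDATES = [
    r"c:\Windows\SysWOW64\WindowsPowerShell\v1.0",   # 32-bit PowerShell
    r"c:\Windows\System32\WindowsPowerShell\v1.0",   # 64-bit PowerShell
]


def pshome_trick(pshome):
    """Mimic $PSHOME[4] + $PSHOME[34] + 'X' from the obfuscated script."""
    return pshome[4] + pshome[34] + "X"


if __name__ == "__main__":
    for path in PSHOME_CANDIDATES:
        print(path, "->", pshome_trick(path))
```

Both paths give the same three letters because "SysWOW64" and "System32" have the same length, and PowerShell command names are case-insensitive, so the result resolves to the iex alias.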

Figure 7 – Deobfuscating with echo

Figure 8 – And then

Are you surprised? I’m not. The compressed data is another layer of obfuscated code. At least it’s not that different than the first layer. This layer uses a function to decode, decompress, and execute the obfuscated code. Invoke-Expression is aliased to Ox using Set-Alias (sal).
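
For anyone who would rather peel this layer off without pwsh, the decode-and-decompress step can be approximated in Python. This is a sketch under assumptions: the text above does not name the compression format, so the code first tries gzip or zlib framing and then falls back to raw DEFLATE (what .NET's DeflateStream produces), and it assumes the result is UTF-8 compatible text. `decode_layer` is a hypothetical name, not the function from the sample.

```python
import base64
import zlib


def decode_layer(blob, encoding="utf-8"):
    """Base64-decode `blob`, then decompress it and return the text.

    Assumption: the payload is gzip, zlib, or raw DEFLATE data.
    """
    raw = base64.b64decode(blob)
    # MAX_WBITS | 32 auto-detects gzip/zlib headers; negative wbits means raw DEFLATE.
    for wbits in (zlib.MAX_WBITS | 32, -zlib.MAX_WBITS):
        try:
            data = zlib.decompress(raw, wbits)
        except zlib.error:
            continue
        return data.decode(encoding, errors="replace")
    raise ValueError("payload is not gzip, zlib, or raw DEFLATE data")


if __name__ == "__main__":
    # Round-trip a harmless string through raw DEFLATE + base64.
    comp = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    sample = base64.b64encode(comp.compress(b"Write-Host 'layer two'") + comp.flush())
    print(decode_layer(sample))
```

Unlike the original function, this never executes the result; it only returns the next layer as text for inspection.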

Figure 9 – And then…

This is becoming annoying. Oh wait, that was their point. At least this layer calls the same function from the previous layer. A simple copy pasta back into pwsh will give us the final layer.

Figure 10 – The bottom of the rabbit hole

I’ll spare you the pain of manually cleaning up that mess. Most VBA macros use PowerShell to hide their download-and-execute code, and the link to that malicious payload is all that my artifact extractor scripts need. A quick scan of this code indicates that it downloads a payload from a remote resource. The link is mostly constructed from obfuscated and static strings, the one exception being a lookup of the currently logged-in user. Therefore, we can again use pwsh to find that malicious link.

Figure 11 – Malicious link
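
Once a layer has been reduced to plain text, pulling candidate links out of it is a small regex job. The sketch below is not the artifact extractor mentioned above; `extract_urls` is a hypothetical name, the pattern is deliberately loose, and the URLs in the demo are placeholders rather than the sample's real link.

```python
import re

# Loose pattern: scheme, then everything up to whitespace, a quote, or a bracket.
URL_RE = re.compile(r"https?://[^\s'\"<>()]+", re.IGNORECASE)


def extract_urls(text):
    """Return the unique http(s) URLs found in `text`, in order of appearance."""
    found = []
    for match in URL_RE.findall(text):
        url = match.rstrip(".,;")  # trailing punctuation is rarely part of the link
        if url not in found:
            found.append(url)
    return found


if __name__ == "__main__":
    demo = "(New-Object Net.WebClient).DownloadFile('http://example.com/a.exe', $p)"
    print(extract_urls(demo))
```

This only helps after the string-building has been resolved, since a link assembled at run time from fragments will not match until pwsh (or a manual pass) has joined the pieces.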

The email that started it all:

Figure 12 – Original email
