The Geodo malware is a banking trojan that presents significant challenges. For starters, it conducts financial theft on a vast scale and enables other financially driven trojans. Also known as Emotet, Geodo has a rich history, with five distinct variants, three of which are currently active according to Feodo Tracker. Geodo’s lineage is incredibly convoluted and intertwined with malware such as Cridex as well as the later iterations known as Dridex.
This blog is the first of a CofenseTM three-part series on Geodo. Our analysis of Geodo focuses not on code analysis, rather on observed behaviours, infrastructure choices and proliferation. We note there has been an upward trend of education and government-based mail account credentials being compromised and used to further distribute Geodo. Further, we investigate message content and its focus on financial themes and narratives.
Future blogs will dive into the technical details of the URL structures prevalent in Geodo campaigns and will feature an in-depth analysis and deobfuscation techniques for the multi-layered macro code found within these documents.
Geodo has been steadily building momentum during 2018; after a quiet first quarter, campaigns involving Geodo have increased significantly both in frequency and density. Cofense Intelligence™ is seeing more consecutive days of campaigns, as well as more campaigns per day. Chart 1 details the year-to-date trends of Geodo as tracked by Cofense Intelligence.
Chart 1: The yearly trends of campaigns involving Geodo or its derivatives.
A very recent change in Geodo’s behavior has seen the banking Trojan move away from its stealer roots and move towards the loader space. Recent campaigns have seen Geodo conditionally deliver either TrickBot or Zeus Panda, both of which would be considered competitors to Geodo’s banking functionality. The actors behind Geodo had been testing the water of competitor delivery as far back as March 26th, 2018, where a campaign delivering Geodo via weaponised Microsoft Office documents led to a further infection of Zeus Panda (See TID 11199). The authors of banking trojans are continually pushed to combat and overcome evolving financial security measures, such as Multi-Factor Authentication (MFA) and software-based security solutions. This arms-race could well be a motivator for the actors behind Geodo’s distribution having moved to the long-term revenue strategy of leasing out their botnet as a loader platform.
Geodo overwhelmingly favours an infection chain of:
Malicious URL → Downloaded Office Document → Macro → Geodo.
Geodo heavily favours both package delivery notices and financial institution-themed campaigns. Figure 1 is a world cloud based upon all Geodo campaigns observed by Cofense IntelligenceTM since tracking began. Figure 2 details campaigns observed strictly in 2018.
Figure 1: A word cloud generated from subject lines captured since tracking began.
Figure 2: Geodo campaign subject lines identified via tracking and botnet injection
Over time, Geodo has expanded from a propensity towards delivery-themed campaigns (spoofing companies like DHL, FedEx, and UPS), to Banking and financial narratives. However, this new focus does not preclude the tendency to spoof legitimate institutions, such as Bank of America and Chase bank. Chart 2 details the breakdown of campaigns throughout 2018, by brand.
Chart 2: A breakdown of brands being spoofed by Geodo in 2018. Note: Generic Malware Threat is assigned to campaigns that do not imitate a legitimate entity or organisation. Note: the redacted entry is a large banking entity.
Geodo is a self-perpetuating bot. Once running on a machine, it actively begins to spam copies of itself to a victim list retrieved during one of its many check-ins to a plethora of C2 nodes, as well as addresses harvested directly from local contact lists. Typically, the messages sent by an infected host will contain either a URL from which a potential victim can download a weaponised Office document, or it will have that type of document attached directly to the message.
Geodo uses a subtle marker to track which bots are delivering messages on behalf of the actor(s) behind the campaigns. The Message-ID field of each message contains an identifier which can potentially be used to identify which bot sent a particular message. At this moment, the structure of the message ID is:
<20 numeric characters>.<16 hex characters>@<recipient domain>
A more literal example could be:
The identifier is a unique number assigned to each message as it is generated and sent by a bot. We have observed the identifier change as a bot progresses through its assigned list of recipients, then subsequent campaigns, as the bot becomes active again. Despite not changing linearly or sequentially, the general trend of these identifiers has seen the character count increase from 15 to 19-20.
There are several key pieces of data that can guide us toward some likely reasons for this behavior:
- The identifier ranges are not unique to each bot – multiple bots can have overlap within the same range.
- Identifiers do not always increment sequentially. This is true across multiple bots.
- Since tracking began, the identifier size has risen consistently from 15 bytes, up to 19-20 bytes.
- There are never any identifier collisions, even across different infections.
These four points lend credence to the supposition that these identifiers not only serve the functional purpose of a Message-ID (to act as a globally unique identifier of a message), but also allow the actors behind Geodo to track which bot is sending which message. By seeding recipient lists with attacker-controlled email addresses, it is possible to programmatically identify which bots are not sending messages as expected, and could be compromised, offline or otherwise in an undesirable state. With this information, attackers may be able to figure out which bots are legitimate infections, and which are researcher-controlled, thus giving them the capability to selectively send bogus templates or data to these compromised nodes.
The second part of the Message-ID structure is a 16 character hex string. As with the identifier, each hex string is unique to the message, meaning it is most likely a hash of some kind.
The final part of the Message-ID is simply the recipient’s own domain.
URL-based campaigns – that is, campaigns that deliver messages which contain URLs to download weaponised Office documents – are by far the most prevalent payload mechanism employed by Geodo. Indeed, analysis of ~612K messages shows just 7300 have attachments; a trifling 1.2% of the total. The structure of the URLs falls into two distinct classes. Cofense Intelligence analysed a corpus of 90,000 URLs and identified 165 unique URL paths.
There are two distinct classes of URLs employed by Geodo. A detailed breakdown of these URL structures will be discussed in an upcoming blog.
Chart 3: A breakdown of the top 10 URL tokens extracted from the 1000 most recently observed URLs.
A typical email from a URL-based campaign can be seen in Figure 3. Heavily contrasting TrickBot’s focus on social engineering, Geodo campaigns are fairly often lacking in any genuine attempt at brand imitation, beyond merely stating a name and perhaps a disclaimer.
Figure 3: An example of a Geodo email delivering a URL.
Figure 4 details the type of network activity that might occur, should a victim click on a link in one of these messages. When clicked, the user’s default browser is opened, and the download occurs directly. In the case of Google Chrome, the user typically will receive multiple warnings that the file being downloaded is hostile and requires multiple steps to allow the download to finish. Figures 5 and 6 details this process.
Figure 4: A Wireshark capture of the HTTP conversation after a live link is clicked.
Figure 5: A warning bar at the base of the Google Chrome browser warns the user the file is dangerous.
Figure 6: The user is required to click “Keep Dangerous File” followed by “Keep anyway” before Chrome will release the quarantined file.
Despite Chrome doing an admirable job of identifying some of the malicious documents, the permutations employed by the Geodo actors allows a significant number of documents to pass by unnoticed. Further stymying the malicious actors’ efforts: the downloaded documents are tagged with a “MotW” — or “Mark of the Web” – which, as seen in Figure 7, can potentially require further engagement by the recipient to finally get the file opened. A ZoneID of 3 indicates that the file is from the Internet Zone.
Figure 7: The downloaded documents are tagged with a Mark of the Web.
Although comparatively rare, Geodo campaigns occasionally deliver attachments instead of malicious URLs, but the narratives and themes used for these campaigns do not noticeably differ. Figure 8 shows an example of a message from an attachment-based campaign. This campaign used a generic theme with no identifiable company or entity being imitated.
Figure 8: An example message from an attachment-based, Geodo campaign.
Digging into a corpus of ~7500 filenames (examples of which are presented in Table 1) shows a very distinct set of naming conventions. These can mostly be described by a regular expression, with a few caveats.
Table 1: Example filenames used during very recent Geodo campaigns.
The naming structure bears very close resemblance to certain segments of URLs, described in detail in the next blog in this series. Although drawing any conclusions from this would be fallacious, it could potentially be used to predict the structure of a successor campaign.
Weaponised Office Documents
Regardless of which vehicle was used as the transport medium, the documents are invariably, intuitively similar. Each document comes weaponised with a hostile macro. The macros are always heavily obfuscated, with junk functions and string substitutions prevalent throughout the code. The obfuscation uses three languages or dialects as part of the obfuscation process: Visual Basic, PowerShell, and Batch.
An upcoming blog will provide an in-depth analysis of the deobfuscation techniques for the multi-layered macro code found within these documents.
The general behaviour of Geodo has been covered in extreme depth both by Cofense and the greater InfoSec community, so we will not rehash those analyses here. Rather, we will focus on Geodo’s ubiquitous spamming capabilities and the methods it uses to facilitate such behaviour.
Geodo is a modular trojan, which means most of its functionality is abstracted away from the core code and placed in external files that can be selectively imported and executed. One such example is the “spam” module. This module facilitates not only the distribution of spam, but also the validation of stolen credentials.
Geodo has two primary means of obtaining credentials. One way is retrieving a list along with the spam module. The other harvests accounts from the local machine, using a variety of external utilities. When new accounts are discovered, their credentials are validated before any attempt is made to communicate them. Figure 9 shows the credential validation phase of the spam process.
Figure 9: The credential validation phase. Each set of credentials is validated before it is used to send spam messages.
If a set of credentials is validated, spam messaging begins in earnest. Figures 10 and 11 show a Wireshark capture of a bot testing credentials before delivering messages to multiple recipients. These recipients are chosen from a large pool of email addresses containing hundreds of thousands, perhaps millions of addresses. It is unlikely that any bot ever receives a complete list of recipient addresses, meaning the sheer number available to Geodo is staggering.
Figure 10: A Wireshark capture of Geodo testing a set of credentials, before using them to authenticate and begin sending the current template.
Figure 11: Geodo iterates through its recipient list and continues to send phishing messages, using the same session.
Geodo is in constant contact with its C2 hosts. Geodo comes hardcoded with anywhere from 30-45 IP addresses, each pointing to a compromised (or, in some cases, outright malicious) web server. Most of these use Nginx as a reverse proxy to forward connections onto the actual command and control hosts. Figure 12 shows an approximate interpretation of this infrastructure.
Figure 12: An approximate representation of the Geodo infrastructure. It should be noted that there’s a high chance the proxies are tiered or layered; this representation defines a single-layer proxy configuration.
As part of its communications with the C2 infrastructure, Geodo is constantly polling for updates, commands, or instructions. Threat actors behind Geodo frequently deploy new email templates, updated C2 lists, and other module specific instructions or data. In the case of the spam module, we have actively observed Geodo launching spam campaigns against yet unseen victims in addition to new, stolen credentials. This type of information exchange is very unlikely to be unidirectional. To keep the recipient and credential lists fresh and relevant, Geodo must communicate dead recipients, bad credentials, or bad hosts. Geodo has also been directly observed updating passwords for usernames as they become available. This type of information exchange allows the Geodo actors to automatically adjust their lists in as near real-time as is feasible, but it does open the botnet up to vulnerabilities.
It is plausible that researchers could poison the entire botnet from just a few hosts. Researchers could monitor the credentials being used by each bot, then create an account on the infected device that matches the username but contains a bad password. When the bot attempts to verify the authenticity of the new password and connects to a researcher-controlled SMTP host to accomplish this, the researcher’s host responds that authentication has been successful. Geodo will not only go ahead and begin spamming out phishing emails (as demonstrated in Figures 10 and 11), but it will also report the updated credentials to the C2 infrastructure. These bad credentials will propagate throughout the botnet and, potentially, cause large scale interruptions to its activity.
At the time of analysis, Cofense has tracked ~31,000 credential sets in a very short time. Charts 4 through 6 show multiple interpretations and permutations of this data.
Chart 4: Compromised credentials by Top-Level-Domain (TLD).
Chart 5: Compromised Credentials by Second-Level-Domain (SLD).
Chart 6: Compromised credentials by domain.
Beyond being interesting purely as data points, tracking the domains to which the compromised credentials belong allows us to actively see where outbreaks are succeeding. Spikes for certain TLDs (such as .edu) might indicate the actors are targeting students and educators. A rise in occurrences of .gov.uk SLDs (Second-Level Domains) could indicate the targeting of UK-based government agencies.
For many reasons, Geodo is a hugely problematic trojan. Its primary distribution method contributes an enormous amount of daily spam and phishing volumes. Not only does it engage in financial theft, but also enables additional finance-driven trojans. It can spread laterally across a network and steal credentials from a large array of software – further perpetuating the spam problem. Staying on top of these threats means employing timely, pertinent, and high-fidelity training to help users become familiar with this prolific threat. Security in depth means the ability to know not only “what”, but also “who.”
For a look behind and a look ahead at major malware trends, view the 2018 Cofense Malware Review.
All third-party trademarks referenced by Cofense whether in logo form, name form or product form, or otherwise, remain the property of their respective holders, and use of these trademarks in no way indicates any relationship between Cofense and the holders of the trademarks.