Bolek: Leaked Carberp KBot Source Code Complicit in New Phishing Campaigns
Reuse of infrastructure supporting malware distribution is a well-documented characteristic of online crime and a key way to track and classify threat actors. While it may seem simplistic for monitoring threat actor activities, the IP addresses, domains, hostnames, and URLs contacted by malware tools betray a significant amount of information about threat actor groups. For some malware attacks, it’s possible to determine the threat actor’s identity based on the infrastructure used, but, other times, the lines are blurred because some organizations harbor cyber criminals.
Why would threat actors leverage locations that they could safely presume to have been previously identified in threat intelligence as hostile? The answer to this question is closely related to the nature of online crime as a business venture. To some degree, economic rational choice theory holds true for online crime as much as it does for other types of businesses.
Each element of malware support requires an investment on the part of the threat actor. This may be the monetary investment of purchasing resources, or it might simply be the time and effort needed to compromise vulnerable hosts for malicious repurposing. In either case, the threat actor is forced to choose between expending one commodity in exchange for another. It therefore benefits threat actors to extract the maximum utility from any resource before writing it off as lost. Even if the resource has been identified by threat intelligence feeds or in block lists available to many, not all endpoints will have access to that information.
Other influential factors that may lead to infrastructure reuse include the likelihood that the resource will be dismantled or impaired by researchers or seized by law enforcement. The “bulletproof” nature of certain hosting locations is a strong motivator for reusing infrastructure. Furthermore, threat actors may be reluctant to frequently dismantle and relocate sophisticated support infrastructure due to the hassle such a process entails.
In addition, when delivering a new botnet malware that may very likely still be under development, a threat actor may consider older infrastructure as a staging ground for early stages of deployment. Later, the threat actor might gravitate toward other, stealthier or unidentified infrastructure.
The malware analyzed on April 25, 2016, for PhishMe Intelligence Threat ID 5907 presented novel malware samples but also served as a prime example of infrastructure reuse. The emails analyzed for this report were messages in Polish, as seen in Figure 1, claiming to deliver an order confirmation. Attached to each message was a ZIP archive containing a single Windows executable.
This executable was a sample of the Godzilla Loader, a recently-developed malware downloader notable for its distinctive command and control panel login page, shown in Figure 2 below. At runtime, this downloader was tasked with providing a sample of a sophisticated botnet malware that is discussed at length later in this piece. The first surprise encountered during this analysis was the IP address hosting the command and control resources for the Godzilla Loader malware. The IP address 141.105.69[.]251 (Mir Telematiki—Moscow), which hosted the Godzilla Loader command and control host photosalemonday[.]com, was also host to Dyre command and control resources for thirteen days in November 2015.
Dyre was known to utilize dozens of servers at any time, as part of its widely-distributed botnet support infrastructure. Many of these were used for up to a month before being retired, while others were used for far longer. The server at 141.105.69[.]251 made a complete exit from the Dyre resources following November 18, 2015, and was not seen in use by threat actors again until April 25, 2016, when used to support this Godzilla Loader instance. Figure 2 profiles the PhishMe Active Threat Reports with which this IP address was related.
This prolonged interlude between uses as a malware resource related to phishing likely indicates that, while originally used by Dyre threat actors, this resource is now in use by a different group. Perhaps they are a subset of the original actors and have maintained access to the server, or this new usage represents the transfer of older infrastructure to new parties. In either case, the original categorization of this location as hostile due to its extensive use by Dyre set a precedent for being categorized as hostile once again.
The payload obtained from this shared location is a sophisticated botnet malware exhibiting advanced persistence mechanisms and manipulation techniques that ensure its ability to run within the infected environment for extended periods of time. Internally, we referred to this botnet malware as “DoppelZilla” as a nod to both its unique persistence and continued execution mechanism and its delivery using the Godzilla malware downloader. However, collaboration with Prebytes and @maciekkotowicz from Cert.pl indicated that this malware is better known as Bolek, a sophisticated malware based on repurposed KBot source code from Carberp. Empowered by this very capable codebase, the Bolek malware represents a robust criminal tool and evidences skilled malware development efforts.
The developers of the Bolek malware use many tricks to frustrate detection and analysis efforts. For example, the attackers split out some library imports to obscure the sum total functionality of the malware. In Figure 3, disassembled code from Bolek reveals a push of each character in the name of the “OpenEventW” function, followed by loading the effective address (LEA) of the function, pushing the necessary arguments, and finally calling the function to execute it. The malware developer uses this staggered library import technique to keep the function names out of the import listing, where they would be readily available to researchers.
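Strings assembled one character at a time in this way are often called stack strings. As a rough illustration of how an analyst might recover such a name from values read off a disassembly listing, here is a minimal Python sketch. The helper name `rebuild_stack_string` and the list of values are assumptions made for this example; the values are written in the order they would sit in memory and are not copied from the Bolek sample.

```python
def rebuild_stack_string(immediates):
    """Join character values written one at a time into a string.

    Stops at the first NUL terminator, as a C-style string would.
    """
    chars = []
    for value in immediates:
        if value == 0:
            break
        chars.append(chr(value))
    return "".join(chars)


# Hypothetical character values, listed in memory order for illustration only.
observed = [0x4F, 0x70, 0x65, 0x6E, 0x45, 0x76, 0x65, 0x6E, 0x74, 0x57, 0x00]
name = rebuild_stack_string(observed)
print(name)
```

In a real listing the pushes usually appear in reverse order, or in multi-byte chunks, so the values would need to be reordered before joining.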
In addition to this malware’s clever import functionality, it also leverages a complex methodology for ensuring its persistence within an infected environment. Each time Bolek is executed on a new victim’s computer, a randomly-named file directory will be created in the Windows System32 directory containing three files. These three files are a randomly-named .exe file, a .dll, and another file with a different, also random, file extension.
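The three-file layout described above lends itself to a simple hunting heuristic. The Python sketch below is only an assumption about how one might look for it: the function names are hypothetical, the check will also match benign directories, and it has not been validated against real Bolek infections.

```python
from pathlib import Path


def matches_three_file_layout(directory):
    """True if the directory holds exactly one .exe, one .dll, and one other file."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    if len(files) != 3:
        return False
    suffixes = [p.suffix.lower() for p in files]
    return suffixes.count(".exe") == 1 and suffixes.count(".dll") == 1


def candidate_dirs(root):
    """List subdirectories of root that match the three-file layout."""
    return sorted(
        str(d)
        for d in Path(root).iterdir()
        if d.is_dir() and matches_three_file_layout(d)
    )
```

Running `candidate_dirs` against a copy of System32 taken from a disk image, rather than a live system, avoids permission errors and keeps the evidence untouched.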
The .exe executable is actually a copy of a Windows system application pulled from the C:\windows\system32 directory and moved to this random directory. The .dll is one imported by the first .exe file at runtime and retains its original file name when copied from System32. This pairing of legitimate executable and its imported .dll provides the underpinning for this malware’s novel persistence capabilities. The malware relies on the “known good” nature of these executables to evade detection. In fact, many of the executables selected and copied by this malware are identified in VirusTotal as benign and listed as “safe software”. So everything is good to go and we can move on, right?
Not so fast. Once the safe executables have been copied into this new directory, the malware will modify the .dll by injecting malicious code into DllEntryPoint and other functions. Figure 5 demonstrates the difference in file size after this modification, with the original .dll’s properties displayed on the right and the modified .dll’s properties displayed on the left.
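A size or hash comparison against the pristine copy in System32 is the quickest way to confirm that a copied .dll has been tampered with. The Python sketch below shows one way to do this, using only the standard library; the function names are assumptions made for this example and the paths are supplied by the analyst.

```python
import hashlib
from pathlib import Path


def file_fingerprint(path):
    """Return (size in bytes, SHA-256 hex digest) for a file."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def was_modified(original_path, copy_path):
    """True if the copy differs from the original in size or content."""
    return file_fingerprint(original_path) != file_fingerprint(copy_path)
```

A mismatch only says that the files differ. Locating the injected code still requires a function-level comparison of the two binaries.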
The analysis tool BinDiff helps illustrate this difference further by showing the code differences between the original and modified .dll at a subroutine level. In Figure 6, green rows evidence no code changes in a subroutine while yellow, orange, and red rows represent subroutines that have undergone extensive code modification.
The screenshot on the left of Figure 7 shows where the malicious code is injected into DllEntryPoint, leading to all registers being pushed onto the stack. This is followed by a call to code residing at another address. For comparison, the screenshot on the right shows the same section in the unmodified .dll.
During the infection process, Bolek leverages the legitimate winlogon.exe process to carry out runtime activities. During dynamic analysis, we can see that code is running using this process, including 38 megabytes of data allocated as read/write for one of the files on disk.
Once we started digging through memory, we were able to see more information about what’s really going on. As Figure 9 shows, the malware maintains certain runtime values in memory, including the C2 domains in the malware and some of the bot configurations.
The configuration data appears to be stored in JSON, and, by searching memory for the characteristic JSON double quote and colon, we can see that the malware has the ability to send varied information to the threat actors about the infected system, including versioning, file hashes, and installed applications.
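The quote-and-colon search described above can be approximated with a regular expression run over a raw memory dump. The Python sketch below is a minimal illustration under stated assumptions: it handles only ASCII-encoded keys, and the sample buffer and its key names are made up for the example rather than taken from Bolek’s configuration.

```python
import re

# A double-quoted identifier followed by optional whitespace and a colon.
JSON_KEY = re.compile(rb'"([A-Za-z_][A-Za-z0-9_]{1,63})"\s*:')


def find_json_keys(buffer):
    """Return the distinct JSON-style key names found in a bytes buffer, in order."""
    seen = []
    for match in JSON_KEY.finditer(buffer):
        key = match.group(1).decode("ascii")
        if key not in seen:
            seen.append(key)
    return seen


# Hypothetical memory fragment with binary noise around JSON-like text.
sample = b'\x00\x13{"version":"1.0","file_hash":"abc"}\xfe\x00"version" : 2'
print(find_json_keys(sample))
```

Strings stored as UTF-16 in process memory would not match this pattern and would need a separate pass.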
Some of the functionality revealed by this configuration data is very concerning, as it shows how Bolek can be configured to do many different tasks. For example, if configured to do so, the Bolek malware has the capability to become wormable, potentially traversing from one computer to another. The attackers have also built in the ability to shift tactics mid-campaign by leveraging the ability to push updates with UpdateWormConfig.
The malware also has the capability to determine how it was infected, based on an InfectedByID, which appears to represent this information using a UUID.
In an updated sample analyzed for Threat ID 6027 on May 12th, the malware injects into svchost.exe instead of winlogon.exe, while using a similar, randomly-named file in System32 as a handle.
Examination of this sample’s memory reveals a “BotCommunity” value, potentially used to group infected machines. During our analysis, the infected machine was listed as having “group_102” as its “BotCommunity”.
This analysis also revealed Bolek’s ability to leverage web injects for credential collection as well as a means for filtering out credentials the threat actors do not want.
By looking at command and control domains alone, it is difficult to assess the relationships among infrastructure leveraged by our Bolek samples. This is where Maltego and PassiveTotal come into play as tools that help inform our assessment and analysis. Figure 16 below shows what our network graph looks like at the highest level.
Other than being a pretty graph, these dots mean nothing without context. Let’s dig into them!
At the top, we have two domains represented by blue dots, which were observables in two different Bolek analyses over the past month. Both of these domains resolved back to the same IP address—the yellow dot between them—showing an overlap of hosting infrastructure shared by the two.
Focusing on the left side of the graph, we can see that both mensabuxus[.]com and knutesecos[.]com have resolved to 45[.]30.53.96. This gives us a data point that can be used to identify other domains used in similar campaigns based on the VirusTotal dynamic analysis for another Bolek sample. A snippet of that VirusTotal analysis is shown in Figure 18 below.
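The pivot described above, from domains to a shared IP address, amounts to grouping resolution records by address. The Python sketch below shows that grouping in a minimal form. The two Bolek domains and their IP address come from this analysis; the third record and the function name `shared_hosting` are hypothetical and included only to show a non-overlapping case.

```python
from collections import defaultdict


def shared_hosting(resolutions):
    """Group (domain, ip) pairs by IP and keep IPs that host more than one domain."""
    by_ip = defaultdict(set)
    for domain, ip in resolutions:
        by_ip[ip].add(domain)
    return {
        ip: sorted(domains)
        for ip, domains in by_ip.items()
        if len(domains) > 1
    }


observations = [
    ("mensabuxus[.]com", "45[.]30.53.96"),
    ("knutesecos[.]com", "45[.]30.53.96"),
    ("unrelated.example", "192.0.2.10"),  # hypothetical record for contrast
]
overlaps = shared_hosting(observations)
print(overlaps)
```

Feeding the same function with passive DNS records for each newly discovered IP address repeats the pivot one level further down the graph.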
Passive DNS analysis of the IP address 45[.]30.53.96 has also shown that this location has been host to a number of credential phishing pages targeting CIBC and Tangerine customers. Following the connections further down the left side of our Maltego chart, Figure 19 shows how two of these CIBC and Tangerine phishing domains have also been hosted on the IP address 50[.]125.238.102.
Branching off in similar fashion, the IP address 50[.]125.238.102 relates to rogers-ca[dot]com, another fake domain which was used to harvest credentials. This domain serves as a convergence point in the middle of the chart, joining two separate forks from the top as shown in Figure 20.
Investigation of the IP address 50[.]125.238.102 by PhishMe’s intelligence team shows that it has previously hosted phishing kits and credential harvesting resources targeting customers of CIBC like the one featured in Figure 21 from May 2nd. Attackers also used 93[.]111.155.134 to host similar-themed credential phishing pages on April 18th. However, another interesting link was evidenced during the analysis of other related CIBC phishing domains resolving to 46[.]32.254.136, an IP address implicated not only in the support of phishing pages but also in the distribution and support of Android mobile device malware. These links are shown in the bottom-left portion of our Maltego chart highlighted in Figure 22.
In-depth investigation of the infrastructure supporting the Bolek malware reveals three well-delineated activities. The first is Bolek, a new variety of malware with many tricks up its sleeves and inspiration and utilities borrowed from the KBot/Carberp source code. The second is a substantial credential phishing campaign impacting Canadian telecommunications and online banking customers. The third is the distribution and support of malware affecting users of the Android mobile device platform. Despite the noteworthy overlap of infrastructure among these three, it is difficult to determine whether the same actors are behind these interrelated attacks.
Nonetheless, the extensive re-use of infrastructure depicted in our chart of network relationships reinforces the need for actionable and high-fidelity threat intelligence. Without threat intelligence providing the context and detail needed to make informed decisions, these relationships are nothing more than disjoint data points. Understanding the nature of these relationships and their place in a coherent threat landscape puts security professionals in the best position to defend their organizations.