AI Alone is Not Bulletproof: Weaknesses in AI/ML Email Security

January 15, 2025

Author: Jacob Malimban, Cofense Intelligence Team

Despite modern secure email gateways (SEGs) embracing AI capabilities, many phishing emails still reach users' inboxes. Therefore, employees need proper training to be able to identify attacks. This is necessary because artificial intelligence and machine learning (AI/ML) models are trained on past data, which may not relate to future threats. Also, most threat actors are creative and able to identify working strategies to bypass SEG security, and they are using AI offensively. Defensive AI does have its place, though, as it can identify rote patterns and filter bulk data. This allows teams to focus on ambiguous emails that cannot be clearly identified as either malicious or legitimate, which can then be escalated to a human for review. By combining the speed of AI with human ingenuity, Cofense gains the benefits of both.

Key Points

  • AI can detect phishing emails based on known templates, but not all of them.
  • SEG bypass techniques continue to circumvent AI models.
  • Offensive AI used by attackers will be better developed than defensive AI because attackers are not bound by legal, copyright, or ethical constraints.
  • See Cofense’s SEG-Miss Database for examples of SEGs, including AI-based models, being bypassed.

What Phishing Can Most AI Detect?

Bad grammar, manipulative language, and unsolicited communication are typical signs of phishing emails. Just as humans can learn these identifiers, AI can be trained to learn suspicious features and build a general “model.” Such trained AI can then categorize new emails as either malicious or benign based on how similar they are to the learned model. The effectiveness of such models can be seen with spam filters. Before spam filters, inboxes were cluttered with unsolicited emails. Nowadays, most emails caught by the AI model are sent to junk, although legitimate emails are sometimes marked as junk too if they look sufficiently suspicious.

Emails with these obvious tells are likely to decrease as attackers start using AI too. Proofreading AI programs can take grammatically incorrect sentences and make them less suspicious. In the past, attackers have been limited to creating professional phishing emails in their proficient languages. As translation AI improves, language may become less of a barrier and their potential targets will likely increase. AI can also be trained on emails of specific industries, so natural jargon and technical terms are likely to increase in phishing emails going forward.

AI technology can recognize patterns inherent to suspicious URLs if properly trained. Typosquatting is one such pattern. AI can learn which letters are easy for humans to mistype or misread. Sites such as vvindows[.]com and wallrnart[.]com mimic known brands, but AI trained on real typosquatting domains should correctly categorize such sites as suspicious. Another pattern to consider is the website content. For example, the website title, input fields, and graphics may closely match those of a known legitimate site even though the domain does not belong to the legitimate company. AI can also consider other domain factors. Newly registered sites are seen in phishing attacks today, although compromised sites were the type most commonly used according to our historical analysis. AI can use both the age and category (social media, gaming, news, etc.) of the site to calculate how likely it is that the site is malicious. AI that has been properly trained with images can also recognize malicious sites based on their appearance and resemblance to pages that are known to be frequently spoofed.
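To make the typosquatting pattern concrete, here is a minimal rule-based sketch in Python. It is not a trained model and not how any particular SEG works; the brand list, the look-alike substitutions, and the one-edit threshold are all assumptions chosen for illustration.

```python
# Hypothetical brand list and look-alike pairs. A real system would learn
# these from labeled typosquatting data instead of hard-coding them.
KNOWN_BRANDS = {"windows.com", "walmart.com"}
LOOKALIKES = [("vv", "w"), ("rn", "m")]


def normalize(domain: str) -> str:
    """Replace visually confusable sequences with the letter they imitate."""
    for fake, real in LOOKALIKES:
        domain = domain.replace(fake, real)
    return domain


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def looks_like_typosquat(domain: str) -> bool:
    """True when a domain is not a known brand but lands within one edit of one."""
    plain = domain.lower().replace("[.]", ".").strip(".")  # accept defanged input
    if plain in KNOWN_BRANDS:
        return False
    candidate = normalize(plain)
    return any(edit_distance(candidate, brand) <= 1 for brand in KNOWN_BRANDS)
```

With these assumptions, both example domains from the paragraph above are flagged, while the real brand domains are not. A fixed rule like this will also flag some legitimate domains that happen to sit one character away from a brand, which is one reason a learned model and human review are still needed.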

Bypassing AI Security

Although AI can help, it can also hurt organizations. Even defensive AI can cause unintended harm if it is not properly trained and monitored. SEGs using defensive AI may block urgent emails that ask a user to reset their account if the email resembles phishing. Attackers using AI can cause far more damage, though. Phishing emails can easily become more professional and industry-specific using large language models (LLMs) with minimal effort or time investment on the part of the threat actors. Automated tools can find open-source information on the targeted employee or company to add detailed personalization. Attackers can compromise accounts and then train AI to copy the victim’s writing style. When this is combined with using the compromised account to reply to preexisting email threads, new targets may be less vigilant than usual. A small study by Singapore's Government Technology Agency showed how AI-enhanced phishing tended to be more effective at enticing users to click URLs and enter credentials.

Other concerns include more spearphishing and better deepfakes. Spearphishing can become significantly less costly and more tailored when threat actors take advantage of AI and ML. This can be achieved by feeding the position, tools, and industry of the target to an LLM. Deepfakes can mimic both the speech and appearance of trusted contacts. Threat actors are already taking advantage of the relative ease of utilizing deepfakes, with one attack costing a company $25 million. In this case, it appears the attackers mimicked the CFO and other employees in a video conference call to convince the victim to transfer the funds. When presented with content virtually identical to daily business communications, it can become harder for both employees and AI to find the usual indicators in malicious emails.

User Interaction

Figure 1: Notification-themed email from August 2024 with malicious URLs embedded in QR codes to bypass AI SEGs.

Techniques that require action from the user can be harder for AI to automatically analyze. One such tactic is the use of QR codes. This attack type requires the employee to scan the QR code with their phone, which removes the protections typically provided by Endpoint Detection and Response (EDR) or other security tools on a computer. Compared to phones, devices like enterprise laptops usually have more mature controls like EDR, Access Control Lists (ACL), and firewall deep packet inspection. Avoiding these protections is one reason why QR codes are used in phishing. Although there are ways to automatically extract the URL encoded in a QR code for analysis, QR codes still appear effective as SEG bypass tools. This is why Cofense offers security awareness training and automatic response tools to remediate QR code threats. Such solutions allow for end-to-end security and help develop defense-in-depth capabilities.

Figure 2: Legitimate URL used as part of a credential phishing delivery chain, embedded in finance-themed emails in October 2024 to bypass SEGs.

Another technique that bypasses SEG defenses is the practice of embedding a malicious link or QR code in an attached PDF or Microsoft Office document file. As these files are normal parts of business communication, it may be hard to separate normal PDFs from ones that lead to malicious content. Employees can be tricked into clicking an image in a PDF that redirects to download the “unencrypted” invoice. Office documents can also be misused this way.

Attackers continue to use CAPTCHAs to make automated analysis difficult. Users may be annoyed at having to click and prove they are human, but they may also feel a false sense of security from familiar CAPTCHAs. Thus, threat actors updated their tools to include CAPTCHAs. Phishing kits for impersonating legitimate login pages typically make it easy to add CAPTCHAs for the same reason companies do: to ensure only humans can access the content and not bots. Although a CAPTCHA alone may be enough to stop automated analysis, attackers usually combine it with other techniques. This means the infection chain is easy for humans to follow but hard for AI to access. One example is a QR code in a PDF that leads to a site protected with a CAPTCHA.

URL Obfuscation

There are many ways attackers perform URL obfuscation or misdirection. Cofense Intelligence currently sees four of them being used to bypass SEGs: open URL redirects, using SEG-encoded URLs to bypass other SEGs, exploiting trusted email marketing services, and attaching malicious HTML files. Although these URL techniques are not new, SEGs still fail to block every malicious link that uses them.

Open URL Redirects

Figure 3: Google AMP link that redirects to a credential phishing site used in notification-themed emails.

URL redirection for phishing typically involves the attacker hiding their malicious content behind a trusted service. The first example is Google AMP. Here, attackers use the real Google domain such as hxxps[://]www[.]google[.]com/amp/ but append a malicious site for Google to automatically redirect to. When a victim clicks on the carefully crafted Google link, they likely don’t expect to be sent to a credential phishing site. Attackers continue to use this technique because it still bypasses current SEG security.
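One way an analyst or tool might handle such links is to pull the embedded destination out of the Google URL and evaluate that site instead of the trusted wrapper. The sketch below is illustrative only. It assumes the commonly seen layout in which the destination follows `/amp/`, with an `s/` prefix indicating HTTPS, and the function name `amp_destination` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

GOOGLE_HOSTS = {"google.com", "www.google.com"}


def amp_destination(url: str) -> Optional[str]:
    """Return the site a Google AMP link points to, or None if it is not one."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in GOOGLE_HOSTS or not parts.path.startswith("/amp/"):
        return None
    rest = parts.path[len("/amp/"):]
    # Assumption: a leading "s/" means the destination is served over HTTPS.
    if rest.startswith("s/"):
        scheme, rest = "https", rest[2:]
    else:
        scheme = "http"
    if not rest:
        return None
    query = f"?{parts.query}" if parts.query else ""
    return f"{scheme}://{rest}{query}"
```

The returned destination, not the google.com hostname, is what should be scored for reputation. Threat actors can chain several redirects, so a single extraction step like this is a starting point for analysis and not a complete defense.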

SEG-encoded URLs

Figure 4: Credential phishing URL encoded by VIPRE Email Security’s url[.]emailprotection[.]link.

The second example is using SEG-encoded links. SEGs typically rewrite URLs in emails sent to their customers. This allows the SEG to first redirect the user to a holding page, then block links they scan as malicious. This is useful for restricting access later to a malicious site if the scan returns a false negative result. If a SEG receives an email containing a URL that is encoded by another SEG, it appears the AI model usually determines the URL to be safe—even if the final URL is malicious.
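The weakness described here is inherited trust: a link wrapped by another vendor is treated as already vetted. A defensive counter-rule can be sketched as follows. The only rewriter host listed is the one shown in Figure 4; the set name, the function names, and the verdict strings are hypothetical, and a real deployment would maintain a much longer list.

```python
from urllib.parse import urlsplit

# Hosts known to wrap URLs on behalf of a SEG. Only the host from Figure 4
# is listed; the rest of the list is left for the reader to populate.
REWRITER_HOSTS = {"url.emailprotection.link"}


def is_seg_wrapped(url: str) -> bool:
    """True when the URL's host is a known SEG rewriting service."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in REWRITER_HOSTS)


def initial_verdict(url: str) -> str:
    """Never mark a wrapped link safe based on the wrapper's reputation alone."""
    if is_seg_wrapped(url):
        return "resolve-final-destination-before-scoring"
    return "score-normally"
```

The design choice is to treat a wrapper as a reason for more scrutiny and not less, since the final destination is what the user will actually reach.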

Email Marketing Services

Figure 5: Marketing URL that bypasses AI SEG security and delivers a credential phishing page.

The third example is via email marketing services. Attackers tend to use services such as MailChimp, HubSpot, SendGrid, and others because legitimate organizations also use these services to manage their email lists. Email marketing can come with built-in tracking, whereby recipients who click on links are highlighted among the others. For threat actors, this can be another method for choosing targets without having to add tracking functionality themselves.

Malicious HTML Files

Figure 6: HTML file masquerading as an mp4 invoice. Encoded JavaScript redirects to a credential phishing page.

The fourth example is via attached HTML files. Threat actors may attach an .html file directly or place it inside an archive. Attackers can have the victim be redirected to a credential phishing site when they open the HTML file. This is usually because the .htm(l) file contains an encoded <script>. Executable code like JavaScript is typically used to make interactive web pages, but attackers can code something malicious instead. Alternatively, the threat actors don’t need to redirect the victim. They can instead have the attached .html file be the credential phishing page itself. When opened, this HTML file will show a login page crafted by the attacker. The file is already set to exfiltrate user credentials when they are entered.
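Both variants leave traces that can be checked statically. The following sketch, built on Python's standard `html.parser`, looks for Base64 strings inside script blocks that decode to a URL and for password input fields. It assumes Base64 is the encoding in use, which is only one of many options available to attackers, and the class and function names are hypothetical.

```python
import base64
import re
from html.parser import HTMLParser
from typing import List

B64_RE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


class AttachmentScanner(HTMLParser):
    """Collects script bodies and notes whether a password field is present."""

    def __init__(self) -> None:
        super().__init__()
        self._in_script = False
        self.scripts: List[str] = []
        self.has_password_field = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script = True
            self.scripts.append("")
        elif tag == "input" and (dict(attrs).get("type") or "").lower() == "password":
            self.has_password_field = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            self.scripts[-1] += data


def decoded_urls(script: str) -> List[str]:
    """Return URLs hidden in the script as Base64 text."""
    found = []
    for blob in B64_RE.findall(script):
        try:
            text = base64.b64decode(blob, validate=True).decode("utf-8")
        except ValueError:  # covers bad padding and non-UTF-8 bytes
            continue
        if text.startswith(("http://", "https://")):
            found.append(text)
    return found


def scan_html_attachment(html: str) -> dict:
    scanner = AttachmentScanner()
    scanner.feed(html)
    scanner.close()
    urls = [u for s in scanner.scripts for u in decoded_urls(s)]
    return {"redirect_urls": urls, "credential_form": scanner.has_password_field}
```

A non-empty `redirect_urls` list suggests the redirect variant, and `credential_form` being true in a file that arrived as an email attachment suggests the self-contained login page variant. Layered or custom encodings would evade this check, so it should be read as a sketch of the idea.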

Other Techniques

In general, other SEG bypass techniques used by attackers can either involve manipulating legitimate services or using a novel technique unknown to the AI SEG. The websites of legitimate services cannot be unilaterally blocked because they are likely used throughout the organization. Blocking a critical external service will disrupt business processes unnecessarily. Novel SEG bypass techniques are new ways to have malicious content be accepted and delivered by the AI SEG. This can be considered the email equivalent of a zero-day computer vulnerability. As the AI has not learned to block those kinds of threats, employees play a key part in a defense-in-depth strategy.

Cloud Storage: Google Drive, OneDrive

Free cloud services like Google Drive and OneDrive are abused by threat actors. These tools are often used in normal business functions for file sharing and document collaboration, but attackers abuse them to distribute malicious content such as malware or credential phishing. An example of cloud-distributed malware is the recent Poco RAT campaign targeting Spanish victims. Redirect files like .lnk may appear to be an invoice but will redirect to a credential phishing page when interacted with. Both the AI SEG and the employee may trust the cloud storage provider to only provide legitimate content. Attackers may be keen to compromise a legitimate business account and send fake, malicious Requests for Proposals.

Content Hosting: AWS, WordPress

Other legitimate platforms abused by threat actors to bypass SEGs include AWS and WordPress, to name a few. This is because the platforms are relatively inexpensive to use and the malicious site is unlikely to be immediately blocked by AI SEG security. AWS, WordPress, and other platforms are all susceptible to abuse. These platforms can host credential phishing sites, serve as exfiltration endpoints, and be part of a threat actor’s attack infrastructure. As AWS is widely used for cloud services (even by one’s own organization), SEGs may be less willing to block content hosted on AWS. Traffic to these sites may be harder to identify as malicious because it appears just like similar legitimate traffic. Threat actors can create websites on these platforms that redirect to other sites and collect credentials with a form, all on a free plan.

GitHub

Figure 7: Malware uploaded via comment outside of a legitimate repository in GitHub.

Given the prevalence of open-source software in enterprise environments, it may be hard for organizations to limit access to GitHub to trusted repositories. The SEGs likely have a hard time too. They would have to differentiate between legitimate GitHub file sharing and threat actors using GitHub to conceal infected content. Even if the SEGs successfully block all harmful repositories, new techniques like malicious GitHub comments can associate malware with trusted repositories without affecting the real codebase. Thus, human intelligence is necessary to determine if the file fits the context.

Unusual File Types

Figure 8: Malicious files within a zipped .vhd file deemed safe by SEGs in July 2024 shipping-themed emails.

Attackers continue to develop novel techniques. Virtual hard drive files can contain malware that becomes accessible when the .vhd(x) file is mounted. SEGs appear able to detect when there are virtual hard drive files in zip files, but they appear unable to detect the malicious content within those .vhd files. This may be because many SEGs or their antivirus solutions do not handle virtual hard drive scanning. AI SEGs likely need time to learn to block this file type when it is used to deliver malware.
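The first half of that observation, spotting a disk image inside a zip file, is straightforward and can be sketched with Python's standard `zipfile` module. The function name and extension list below are illustrative assumptions.

```python
import io
import zipfile
from typing import List

DISK_IMAGE_EXTENSIONS = (".vhd", ".vhdx")


def disk_images_in_zip(data: bytes) -> List[str]:
    """List archive members whose names end in a virtual hard drive extension."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [
            name
            for name in archive.namelist()
            if name.lower().endswith(DISK_IMAGE_EXTENSIONS)
        ]
```

This only reports that a disk image is present. Inspecting what is inside the image requires parsing the virtual disk and its file system, which is the harder step and matches the gap described above.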

Another example of an unusual and abused file type is SVG files. When a user opens an SVG email attachment to view an “invoice”, the browser executes the malicious commands embedded in the file. One such chain of commands would be to download a malicious archive containing JavaScript that eventually executes AgentTesla. Likely due to the unusual file type and the encrypted commands, .svg files in archives can reach user inboxes despite containing malicious instructions.
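SVG is an XML format, so the places where it can carry executable content are well defined: script elements, event-handler attributes, and `javascript:` links. A minimal static check, assuming the file is well-formed XML and using a hypothetical function name, might look like this:

```python
import xml.etree.ElementTree as ET
from typing import List


def svg_active_content(svg_text: str) -> List[str]:
    """Report script elements, on* handlers, and javascript: links in an SVG."""
    findings = []
    root = ET.fromstring(svg_text)
    for element in root.iter():
        # Tags and attributes may carry a "{namespace}" prefix; strip it.
        tag = element.tag.rsplit("}", 1)[-1].lower()
        if tag == "script":
            findings.append("script element")
        for attr, value in element.attrib.items():
            name = attr.rsplit("}", 1)[-1].lower()
            if name.startswith("on"):
                findings.append(f"event handler: {name}")
            elif name == "href" and value.strip().lower().startswith("javascript:"):
                findings.append("javascript: link")
    return findings
```

An invoice image has no legitimate need for any of these features, so a non-empty result is a reasonable trigger for escalation. The standard library parser is not hardened against maliciously constructed XML, so production use would call for a safer parser.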

Zip File SEG Exploit

SEGs are usually able to block archive files attached to emails if the archive contains a suspicious file, but threat actors have found a bypass. Attackers can evade SEG security and deliver malware by editing the zip file header so that the malicious HTML file is labeled as an “mpeg” instead. SEGs are not alone in being fooled: popular archive tools such as PowerISO and 7zip also mischaracterize the file, as noted in our analysis. The error in both the SEGs and the archive tools appears to stem from how the archive file is parsed. This means that organizations are susceptible to infections like ransomware because the initial malicious delivery via an attached archive is not properly blocked.
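A zip archive records each file name twice: once in the local header that precedes the file data and once in the central directory at the end of the archive. The sketch below assumes the bypass depends on those two names disagreeing, which is one plausible reading of the parsing error described above and is not confirmed by this article. It compares the two names for every member.

```python
import io
import struct
import zipfile
from typing import List, Tuple

# Fixed 30-byte portion of a zip local file header.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def header_name_mismatches(data: bytes) -> List[Tuple[str, str]]:
    """Return (central_directory_name, local_header_name) pairs that differ."""
    mismatches = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            fields = LOCAL_HEADER.unpack_from(data, info.header_offset)
            if fields[0] != b"PK\x03\x04":
                continue  # not a local header where one was expected
            flags, name_len = fields[2], fields[9]
            start = info.header_offset + LOCAL_HEADER.size
            raw_name = data[start:start + name_len]
            # Flag bit 11 marks UTF-8 names; otherwise zip defaults to CP437.
            encoding = "utf-8" if flags & 0x800 else "cp437"
            local_name = raw_name.decode(encoding, errors="replace")
            if local_name != info.filename:
                mismatches.append((info.filename, local_name))
    return mismatches
```

Any mismatch is worth treating as hostile, since legitimate archivers write the same name in both places. Checking the first bytes of each extracted file against its claimed extension would be a useful complementary test.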

Cofense reported that other archive file types are used maliciously as well. Windows update KB5031455 allows Windows computers to extract files from archives like .tar and .rar without needing additional software. Although zip files are still the most prevalent, .lz, .img, and .tgz archives are increasingly being used to deliver malware by email successfully. Those file types (among others) can bypass SEGs despite containing malicious files. Password-protected archives are effective too. It seems that AI-based SEGs also have a hard time extracting the password from the email in order to analyze the contents of an archive file. Examples can be seen in Cofense’s SEG-Miss Database. For more information on the history, analysis, and bypass of archive files, consider reading this report on archive files.
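Even when a scanner cannot open a password-protected zip file, it can at least tell that the contents are encrypted and route the email for closer review. In the zip format, bit 0 of each member's general purpose flag marks encryption, and Python's `zipfile` exposes it as `flag_bits`. The function name below is hypothetical, and the sketch covers zip only, not the other archive types mentioned above.

```python
import io
import zipfile
from typing import List


def encrypted_members(data: bytes) -> List[str]:
    """Names of zip members whose contents are marked as encrypted."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [
            info.filename
            for info in archive.infolist()
            if info.flag_bits & 0x1  # bit 0 set means the member is encrypted
        ]
```

An encrypted attachment paired with a password in the email body is a combination that a human reviewer can recognize quickly, even when automated extraction fails.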