Using AI to automate phishing.
Phishing is when scammers send you an email, SMS, or some other direct message (the researchers in this experiment used only email) pretending to be from a bank, government agency, or some other source that you trust. The goal is to get you to click on something that will install a virus, or to get you to go to a website and enter passwords, credit card numbers, or other sensitive information. As such, it is a form of "social engineering". Phishing becomes "spear phishing" when it is personalized: the message was written for you specifically, as an individual, rather than spammed to everyone in your organization.
The researchers' AI spear phishing tool is built from the following components, quoted from the paper:
- "Reconnaissance of target individuals and groups of individuals. This part uses GPT-4o by OpenAI in an agent scaffolding optimized for search and simple web browsing."
- "A prompt engineering database. The prompts are currently written by human experts but could be AI-written and updated based on the tool's continuous learning."
- "Generation of phishing emails based on the collected information about the target and the chosen attacker profile and email template. Our tool currently sup ports language models from Anthropic, OpenAI, Meta, and Mistral." "We primarily used GPT-4o and Claude 3.5 Sonnet."
- "Sending of phishing emails with multiple options for delivery."
- "Live tracking of phishing success. To track whether a user clicks a link, we embed a unique, user-specific URL that redirects to a server logging each access."
"This process of collecting and analyzing publicly available information from various sources is referred to as Open Source Intelligence (OSINT), which forms the foundation of our reconnaissance methodology."
"We implemented an iterative search process using Google's search API and a custom text-based web browser to collect publicly available information about potential targets. Typical sources of data are social media, personal websites, or workplace websites. The tool concludes its search based on the quality and quantity of discovered information, which typically occurs after crawling two to five sources. The collected data is compiled into a profile."
"The emails were created and sent autonomously by the AI tool without requiring human input. After extensive internal testing between different models, we concluded that Claude 3.5 Sonnet produced the results that best satisfied the conditions of credibility and relevance, as well as best conveyed the influence principles from Cialdini [48]. We encourage other research to continue comparing the deceptive success rate between different language models."
"Each AI-generated email was analyzed in hindsight and categorized based on whether we would have liked to change anything to improve the reconnaissance or the email's credibility or relevancy. Based on the desired updates, the emails were given a score."
"Our tool generates personalized emails by prompting a language model with specific prompt templates and target profiles. Each prompt template provides the model with detailed instructions, including the desired writing style, key elements to include, and how to embed URLs in an email. The subject line and body structure are dynamically determined by the tool on a case-by-case basis to best fit each unique target. We also provide the current date to the tool to enable the model to incorporate relevant deadlines when appropriate. To ensure the tool generates emails that are credible and relevant, we invested significant effort in prompt engineering. Through extensive testing and feedback, we developed a sophisticated prompt template exceeding 2,000 characters, carefully designed to maximize the persuasiveness of the generated emails."
The AI-generated messages had a 54% success rate, where success means the recipient clicked the link in the email. For comparison, the researchers also sent out messages written by human experts, and human+AI hybrid messages. The human experts also had a 54% success rate, and the human+AI hybrids had a 56% success rate. There was also a control group with a 12% success rate, said to consist of "ordinary phishing emails", though it's not clear to me what those were or how they were targeted. The AI-assisted emails were "hyper-personalized", meaning they referenced the target person's latest projects, specific interests, collaborators, acquaintances, and so on.
At any rate, it seems we can conclude that AI spear phishing emails are just as effective as having humans (expert humans, skilled at persuasion) scour the internet for information about you and write a personalized attack email just for you.
Interestingly, they also tried AI-assisted intent detection.
"We tested five different language models (GPT-4o, Claude 3.5 Sonnet, Mistral Large, Gemini, and Llama 3.1 405B) for their capability to detect the intention of phishing emails. All tests were performed in Q3 2024. Each model was prompted with 20 emails: five control group emails (spam emails from our inboxes), five human expert emails following the best practices of the V-triad, five AI-generated emails from our tool, and five legitimate emails (also from our inboxes). The legitimate marketing emails act as an additional control group to compare how well the model can detect the intention of benign emails. We investigated the models' detection rate for classifying the intention of the email and whether it can detect anything suspicious in the email. Our rationale for doing so is that humans are more attentive when specifically asked whether they can identify anything suspicious with an object, such as an email, than when they're going through their daily lives, such as browsing their email inbox on a regular work day. "
"We updated our tool with the ability to perform automated intent detection. We ran it on a larger dataset of different phishing emails from nine different categories. Out of the 381 emails in this dataset, 18 belong to the legitimate category and are not phishing or spam, while the other 363 emails belong to various categories of phishing. Six of the nine email categories contained AI-generated emails. The 'AI combined' group contains the AI-generated emails sent in this study. The remaining five AI-generated email groups were created using different AI models and targeted synthetic users randomly generated by our tool. Each synthetically generated user had an extensive profile."
"Claude 3.5 Sonnet demonstrated good performance in identifying sophisticated phishing attempts while maintaining a low false-positive rate. When primed for suspicion, it correctly detected all phishing emails from categories as suspicious while also correctly classifying all legitimate emails as benign. Some models, like Mistral, suffered from extensive false positives when primed for suspicion."
Evaluating Large Language Models' Capability to Launch Fully Automated Spear Phishing Campaigns: Validated on Human Subjects
#solidstatelife #ai #genai #llms #deepfakes #cybersecurity #phishing