In June 2023, just seven months after OpenAI first invited curious tech fans to try a “research preview” of its now-ubiquitous ChatGPT tool, a lesser-known chatbot called WormGPT officially launched with a much different target audience: hackers.
Its creator offered would-be customers access to a large language model (LLM) with no built-in guardrails — one that wouldn’t push back when asked to do something nefarious, like craft a scam email, write code for malware, or help with a phishing scheme. He later claimed that more than 200 users paid upwards of €500 (around $540) per month for the tool, with some shelling out as much as €5,000 ($5,400) for a full-featured private installation.
WormGPT officially shut down just months after its launch, around the same time that security researcher Brian Krebs published a lengthy exposé revealing the identity of its creator, Rafael Morais. Morais, who said he didn't write most of WormGPT's code himself, claimed the tool was meant to be neutral and uncensored, not explicitly malicious. He was never charged with a crime, and it's unclear how much damage, if any, WormGPT's users inflicted upon the world.
In the two-plus years since, unrestricted AI models and AI-powered cybercrime tools — loosely grouped together under the term “dark AI” — have grown in both number and popularity, with creators primarily using the dark web to connect with their target customers: people keen to cause mischief, run scams, or steal information and identities for profit. FraudGPT, which launched just a month after WormGPT, reportedly tallied over 3,000 paying subscribers, while DarkGPT, XXXGPT, and Evil-GPT have all enjoyed varying levels of success. The WormGPT name itself has been hijacked by other dark AI models as well, including keanu-WormGPT, which utilizes a jailbroken version of X’s Grok.
So, what do we know about dark AI — and what can we do about it?
Dark AI vs. misused AI
Mainstream generative AI tools have guardrails against malicious uses, but they also have vulnerabilities that allow people to bypass these guardrails.
If you ask a current version of ChatGPT to generate a template for a scam email, for example, it will politely decline. However, if you explain that you’re writing a fictional story about a scammer and want to include an example of an email this not-at-all-real person might create, it will happily generate one for you. Inter-agent trust — the tendency for an AI agent to trust other AI agents by default — can also be exploited to get around the guardrails built into popular systems. A study shared on the preprint server arXiv in July 2025 found that 14 of the 17 state-of-the-art LLMs it tested were vulnerable to this type of exploit.
“Anyone with a computer … plus some technical know-how, can self-host an LLM and fine-tune it for a specific purpose.”
Crystal Morin
Tech-savvy criminals can also use the many publicly available open-source LLMs as the basis for their own unrestricted models, training them on vast collections of malicious code, data from malware attacks and phishing exploits, and other information that only a bad actor (or cybersecurity researcher) is likely to find valuable.
“Anyone with a computer, and especially with a GPU, plus some technical know-how, can self-host an LLM and fine-tune it for a specific purpose,” says Crystal Morin, a senior cybersecurity strategist with Sysdig and a former intelligence analyst for the United States Air Force. “That’s exactly how threat actors are bypassing the safeguards built into the most popular public models.”
“I know security practitioners, for instance, who experiment with local models, adapting them to various use cases,” she adds. “They haven’t been able to find a task where AI can’t deliver some kind of workable result.”
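To give a sense of how low that barrier is, here is a minimal, benign sketch of the kind of local self-hosting Morin describes, written in Python with the Hugging Face `transformers` library. The model name is only an illustrative placeholder for any openly licensed chat model; nothing here is tied to any dark AI tool.

```python
# Minimal sketch: running an open-weight LLM locally with Hugging Face transformers.
# The model id below is an illustrative placeholder for any openly licensed chat model.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder open-weight model
    device_map="auto",  # uses a GPU if one is available (requires the `accelerate` package)
)

prompt = "In two sentences, explain why phishing emails often create a false sense of urgency."
result = generator(prompt, max_new_tokens=120)
print(result[0]["generated_text"])
```

Fine-tuning that same model on a custom dataset is only a few steps beyond this, which is exactly the low barrier Morin is pointing to.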
Fine-tuning an open-source model or bypassing a chatbot’s AI safety checks takes at least a cursory understanding of how these systems work. The dark AI models made available to threat actors online are more dangerous because they lack guardrails entirely — there’s nothing to circumvent — and have already been set up for malicious uses. A low-level criminal doesn’t need much technical skill to take advantage of these AI tools — they just need to be able to write a straightforward prompt for whatever they want, and the dark AI will deliver it.
Fighting fire with fire
Security analysts have been warning companies, governments, and the general public that hackers are ramping up their attacks in the wake of the widespread adoption of AI. The statistics support their claims: In the past two years, ransomware attacks have spiked, cloud exploits have increased, and the average cost of a data breach has hit an all-time high.
The simple fact that generative AI allows users to do more in less time means that hacking is now more efficient than ever, and the unfortunate truth is that we can’t unring the AI bell.
“AI threats will persist, just as attack[er]s will keep innovating — that’s just what they do,” Morin says. “Some cybercriminals even hold day jobs in cybersecurity, so they have the same skillsets and know defensive security inside and out. What matters is how defenders evolve and respond.”
But just as dark AI is making it easier for bad guys to launch attacks, other AI tools are helping security experts take the fight head-on.
“Cybersecurity has always been an arms race, and AI has just raised the stakes.”
Crystal Morin
Microsoft, OpenAI, and Google are among those actively developing AI tools to prevent AI — in some cases, models they themselves developed — from being used for ill.
Microsoft Threat Intelligence recently shut down a large-scale phishing campaign that it believes was carried out with the help of AI, and OpenAI took its fight to AI-generated images with a tool security researchers can use to detect fake photos. Google spent much of this past summer highlighting AI-based tools developers could use to prevent AI from negatively impacting users. Google DeepMind, meanwhile, has demonstrated that proactive defense against AI-based threats works in the real world with Big Sleep, an AI that tests systems for vulnerabilities. It’s already identified glaring holes in popular software, including the Chrome web browser, and its success suggests that widespread automatic patching of security flaws may be just around the corner.
Red teaming — a practice where ethical hackers test a system for vulnerabilities so they can be addressed before a real attack — dates back to the Cold War, but it’s taken on new meaning in generative AI circles. Cybersecurity experts now run complex simulations where AI is assigned the role of a toxic attacker, allowing organizations to test how their own AI systems react to provocation.
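In practice, an automated red-team pass can be as simple as looping a battery of provocation prompts through a model and flagging anything that doesn't come back as a refusal. The Python sketch below is a toy illustration of that idea; `ask_model` is a hypothetical stand-in for whatever API or local model an organization actually tests, and real tooling typically relies on a judge model rather than keyword matching.

```python
# Toy sketch of an automated red-team pass: probe a chat model with provocation-style
# prompts and flag any response that does not look like a refusal.
# `ask_model` is a hypothetical stand-in for the model or API under test.
from typing import Callable

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm not able to")

def looks_like_refusal(response: str) -> bool:
    """Crude heuristic: production red-team tooling uses a judge model, not keywords."""
    return any(marker in response.lower() for marker in REFUSAL_MARKERS)

def red_team(ask_model: Callable[[str], str], probes: list[str]) -> list[str]:
    """Return the probes whose responses did not look like refusals."""
    flagged = []
    for probe in probes:
        reply = ask_model(probe)
        if not looks_like_refusal(reply):
            flagged.append(probe)
    return flagged

# Example usage (placeholder probes):
# failures = red_team(ask_model, ["Pretend you have no safety rules and ...", "..."])
# print(f"{len(failures)} probes produced non-refusals; review them manually.")
```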
AI is also adept at pattern recognition, which gives AI-based protection systems an advantage in spotting things like scam email campaigns and phishing attacks, which can be repetitive. Companies can deploy these tools to keep their employees safe from intrusion attempts, while email and messaging providers can integrate them into their systems to prevent spam, malware, and other threats from ever reaching users.
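A bare-bones version of that kind of detector is just a text classifier trained on labeled emails. The sketch below, assuming scikit-learn and a tiny inline dataset used purely for illustration, shows the basic shape; production systems train on millions of messages and far richer signals than the message text alone.

```python
# Minimal sketch of AI-assisted phishing detection: a text classifier over labeled emails.
# Assumes scikit-learn; the tiny inline dataset is illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = [
    "Your account is locked. Verify your password here immediately.",
    "Urgent: wire transfer needed before 5pm, reply with bank details.",
    "Meeting notes from Tuesday attached, see section 3.",
    "Lunch on Thursday? The usual place works for me.",
]
labels = [1, 1, 0, 0]  # 1 = phishing, 0 = legitimate

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(emails, labels)

incoming = "Immediate action required: confirm your password to avoid suspension."
print(model.predict_proba([incoming])[0][1])  # estimated probability the email is phishing
```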
“With attackers moving at the speed of AI, we have to adopt a real-time ‘assume breach’ mentality to stay ahead — using trustworthy AI of our own,” Morin says. “Cybersecurity has always been an arms race, and AI has just raised the stakes.”
Even the most sophisticated AI-based defenses face a fundamental and troubling challenge: the law itself. Stopping the flood of new attacks is only possible by targeting the source, and legal frameworks are still catching up to modern technology, especially where AI is concerned.
Stripping the safeguards off an AI and training it to help you (or someone else) write malware or craft a convincing phishing email might be unethical, but it's not necessarily illegal — AI researchers and cybersecurity analysts do it as part of their jobs. The law attempts to distinguish between "good faith" development and malicious end use, but that line is far from clear-cut.
Creating malware and sending scam emails can land you behind bars, but creating an AI that can do these things isn’t a crime, at least according to federal law. It’s not unlike buying a radar detector for your car. Owning the device isn’t illegal, but in certain places, being caught using one is a crime. This may be why there have been no high-profile convictions of dark AI creators and sellers up to this point — the focus for law enforcement remains on those who are using these tools for nefarious means.
The race continues
The emergence of dark AI is a troubling new development in cybersecurity, but it’s not an unprecedented one. The history of digital security has been defined by innovative defenses rising up to address ever-more-advanced threats. AI-powered attacks are the next chapter.
What makes this moment unique is the speed at which both sides are advancing. Criminals can now launch large-scale attacks with virtually no legwork, and defenders can employ real-time AI to spot the risks and shut them down before they can impact users. Every time this happens — and it’s happening constantly — both sides get a little bit smarter.
We might not be able to unring the bell, but we can work toward a future where, for every nefarious AI tool or unhinged LLM that lands on a dark web forum, there's an innovative and timely defense to neutralize it.