
Security researchers and academics have documented a new attack method in which AI-assisted code generators recommend hallucinated programming packages that attackers then register and weaponize to target already vulnerable software supply chains.
AI-powered code assistants like Copilot, ChatGPT, and Cursor have emerged as productivity enhancers, now used to help write everything from web apps to automation scripts. However, those productivity gains are not without risk, and one such risk is Slopsquatting.
AI Hallucinations and Slopsquatting
Before diving deeper into Slopsquatting, a brief look at the phenomenon of AI hallucinations is in order. IBM defines AI hallucinations as follows:
“AI hallucination is a phenomenon wherein a large language model (LLM)—often a generative AI chatbot or computer vision tool—perceives patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate.”
In practice, when a user prompts a generative AI tool, they expect an output that correctly addresses the request. Sometimes, however, the model produces output that is not grounded in its training data, is incorrectly decoded by the transformer, or follows no identifiable pattern. In other words, it “hallucinates” the response. An unwary user may accept the hallucination as fact, with unforeseen consequences.
As alluded to above, Slopsquatting occurs when an LLM hallucinates the existence of a developer package that does not exist. A threat actor then registers a package under the hallucinated name and uses it as a malware delivery vehicle. Once the once-hallucinated, now malicious, package is downloaded and used by a developer, the threat actor has potentially secured highly privileged access to the code base.
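To make the failure mode concrete, here is a minimal sketch (not taken from the research or any specific tool) of the kind of pre-install check a developer could run: it queries the public PyPI JSON API to confirm that a recommended package name actually exists before anything is installed. The helper name and output format are illustrative assumptions.

```python
# Minimal sketch, assuming the public PyPI JSON API; helper name is illustrative.
import sys

import requests  # third-party HTTP client (pip install requests)


def check_package(name: str) -> None:
    """Verify that a package name exists on PyPI before installing it."""
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if resp.status_code == 404:
        print(f"'{name}' is not on PyPI -- possibly a hallucinated name")
        return
    resp.raise_for_status()
    info = resp.json()["info"]
    print(f"'{name}' exists on PyPI, latest version {info['version']}")


if __name__ == "__main__":
    # Example: python check_package.py flask some-recommended-pkg
    for pkg in sys.argv[1:]:
        check_package(pkg)
```

A check like this does not prove a package is safe, but it catches the simplest case: an AI-recommended name that resolves to nothing, or to something an attacker only recently registered.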
The term Slopsquatting was coined by PSF Developer-in-Residence Seth Larson and popularized by Andrew Nesbitt. In a recent academic paper titled “We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs”, the authors argue that LLMs produce package hallucinations at scale, adding a significant security challenge to the already difficult task of securing software supply chains.
LLMs Are Hallucinating Packages at Scale
In the above-mentioned research paper, 16 leading code-generation models, both commercial (LLMs like GPT-4 and GPT-3.5) and open source (LLMs like CodeLlama, DeepSeek, WizardCoder, and Mistral), were used to generate a total of 576,000 Python and JavaScript code samples. Here are just some of the paper’s key findings:
- 19.7% of all recommended packages didn’t exist.
- Open source models hallucinated far more frequently, 21.7% on average, compared to commercial models at 5.2%.
- The worst offenders (CodeLlama 7B and CodeLlama 34B) hallucinated in over a third of outputs.
- GPT-4 Turbo had the best performance, with a hallucination rate of just 3.59%.
- Across all models, the researchers observed over 205,000 unique hallucinated package names.
Additionally, the researchers found that re-running the same hallucination-triggering prompt ten times resulted in 43% of hallucinated packages being repeated in every run, while 39% never reappeared at all, suggesting that hallucinations tend to be either highly stable or entirely unpredictable. Moreover, 58% of hallucinated packages were repeated more than once across the ten runs, indicating that a majority of hallucinations are not just random noise but repeatable artifacts of how the models respond to certain prompts.
It is this repeatability that drastically increases their value to attackers, making it easier to identify viable Slopsquatting targets by observing a relatively small number of model outputs. In short, given how reliably certain hallucinated packages recur, attackers do not need to scrape massive prompt logs or brute-force potential names. They can simply observe LLM behavior, identify commonly hallucinated names, and register them.
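The sketch below illustrates, under stated assumptions, how this kind of repeatability can be surfaced defensively: prompt a model several times, tally the package names that appear in its output, and flag recurring names that do not resolve on PyPI. `generate_code()` is a hypothetical placeholder for whatever LLM client is in use, and the name extraction is deliberately rough.

```python
# Illustrative sketch only: generate_code() is a hypothetical stand-in for an
# LLM client, and the regex covers only simple "pip install" / "import" lines.
import re
from collections import Counter

import requests


def generate_code(prompt: str) -> str:
    # Placeholder: call your LLM of choice here and return its code output.
    raise NotImplementedError


def extract_package_names(code: str) -> set[str]:
    # Very rough extraction of candidate package names from generated code.
    pattern = r"pip install ([\w\-\.]+)|^import (\w+)|^from (\w+) import"
    return {m for groups in re.findall(pattern, code, re.MULTILINE) for m in groups if m}


def exists_on_pypi(name: str) -> bool:
    return requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10).status_code == 200


def audit(prompt: str, runs: int = 10) -> None:
    counts = Counter()
    for _ in range(runs):
        counts.update(extract_package_names(generate_code(prompt)))
    for name, seen in counts.most_common():
        if not exists_on_pypi(name):
            print(f"Recurring non-existent package: {name} ({seen}/{runs} runs)")
```

The same loop an attacker could run to find squattable names is, of course, the loop a defender or registry maintainer can run to pre-register or blocklist them.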
Supply Chain Risks
Socket has stated the following regarding the risks Slopsquatting poses to supply chains:
“Package confusion attacks, like typosquatting, dependency confusion, and now slopsquatting, continue to be one of the most effective ways to compromise open source ecosystems. LLMs add a new layer of exposure: hallucinated packages can be generated consistently and shared widely through auto-completions, tutorials, and AI-assisted code snippets…This threat scales. If a single hallucinated package becomes widely recommended by AI tools, and an attacker has registered that name, the potential for widespread compromise is real. And given that many developers trust the output of AI tools without rigorous validation, the window of opportunity is wide open.”
The advent of vibe coding, a programming approach that uses AI to build applications from natural-language descriptions rather than hand-written code, adds further risk with regard to Slopsquatting, as developers relying on vibe coding may never manually type or search for a package name. If the AI includes a hallucinated package that looks plausible, the path of least resistance is often to install it and move on.
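As a partial mitigation, here is a hedged sketch of a CI-style dependency audit: it reads a project's requirements.txt and flags entries that either do not exist on PyPI or were first published only recently, since newly registered names are a common squatting signal. The 90-day threshold and the simple file parsing are illustrative assumptions, and the check relies on the PyPI JSON API exposing per-release upload times.

```python
# Illustrative CI-style check; the age threshold and parsing are assumptions.
from datetime import datetime, timezone
from pathlib import Path

import requests


def first_release_date(data: dict) -> datetime | None:
    # Earliest upload time across all releases gives a rough package age.
    times = [
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for files in data.get("releases", {}).values()
        for f in files
    ]
    return min(times) if times else None


def audit_requirements(path: str = "requirements.txt", max_age_days: int = 90) -> None:
    for line in Path(path).read_text().splitlines():
        # Rough parse: strip version pins and skip comments/blank lines.
        name = line.split("==")[0].split(">=")[0].strip()
        if not name or name.startswith("#"):
            continue
        resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
        if resp.status_code == 404:
            print(f"FLAG: '{name}' does not exist on PyPI")
            continue
        resp.raise_for_status()
        published = first_release_date(resp.json())
        if published and (datetime.now(timezone.utc) - published).days < max_age_days:
            print(f"FLAG: '{name}' first published {published:%Y-%m-%d} (very new)")


if __name__ == "__main__":
    audit_requirements()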