Fast injection attacks on LLMs

Large language models such as ChatGPT, Claude are made to follow user instructions. But following user instructions indiscriminately creates a big weakness. Attackers can enter hidden commands to control the behavior of these programs, a technique called rapid injection, such as SQL injection into a database. This can lead to dangerous or misleading results if not managed properly. In this article, we explain what rapid injection is, why it is important, and how you can reduce its risks.
What is Rapid Injection?
Quick injection is a way to trick the AI by hiding instructions inside normal input. Attackers insert deceptive commands into the text the model receives to make it behave in ways it was not intended to, sometimes producing harmful or misleading results.
LLMs process everything as a single block of text, so they don’t inherently separate trusted system instructions from untrusted user input. This makes them vulnerable if user content is written as an instruction. For example, a system that is told to abbreviate an invoice may be tricked into approving a payment instead.

- Attackers disguise commands as plain text
- The model follows them as if they were real instructions
- This can defeat the original purpose of the system
That is why it is called injection immediately.
Types of Quick Injection Attacks
| A feature | Direct rapid injection | Indirect rapid injection |
| How the attack works | The attacker sends instructions directly to the AI | The attacker hides the instructions in the external content |
| Attacker interaction | Direct interaction with the model | There is no direct interaction with the model |
| Where information comes from | In conversation or API input | In files, web pages, emails, or documents |
| Visibility | It is clearly visible in the notification | It is usually hidden or invisible to people |
| Time | Done quickly at the same time | Triggered later when content is processed |
| Command example | “Ignore all previous instructions and do X” | A hidden text that tells the AI to ignore the rules |
| Common techniques | Jailbreak information, roleplay instructions | Hidden HTML, comments, white text |
| Difficulty seeing | It’s easy to find | It’s hard to find |
| Typical use cases | Early ChatGPT jailbreak as DAN | Toxic web pages or documents |
| The main weakness has been exploited | The model relies on user input as commands | The model relies on external data such as instructions |
Both types of attacks use the same main feature. The model cannot reliably distinguish reliable instructions from injected ones.

Risks of Rapid Injection
Rapid injection, if not accounted for during model development, can lead to:
- Unauthorized data access and leaks: Attackers can manipulate the model to reveal sensitive or internal information, including system information, user data, or hidden instructions such as information from Bing in Sydney, which can be used to discover new vulnerabilities.
- Bypassing security and manipulating behavior: Injected information can force a model to ignore rules, often by impersonating a specific location or fake authority, leading to jailbreaks that produce violent, illegal, or dangerous content.
- Misuse of system tools and resources: If models can use APIs or tools, information injection can trigger actions such as sending emails, accessing files, or performing tasks, allowing attackers to steal data or abuse the system.
- Breach of privacy and confidentiality: Attackers can search for chat history or archived content, causing the model to leak confidential user information and potentially violate privacy laws.
- Distorted or misleading results: Some attacks subtly change responses, creating biased summaries, unsafe recommendations, phishing messages, or inaccurate information.
Real World Examples and Case Studies
Real-life examples show that timely vaccination is not just an imaginary threat. This attack has compromised popular AI systems and produced real safety and security.
- Quick Bing chat leak “Sydney” (2023)
Bing Chat uses a hidden system information called Sydney. By telling the bot to ignore its previous instructions, the researchers were able to make it reveal its own internal rules. This showed that rapid injection can yield system-level information and reveal how the model is designed to behave. - “Grandma exploits” and prison discounts
Users have found that emotional role playing can bypass security filters. By asking the AI to pretend to be a grandmother telling forbidden stories, it generates content that would normally block you. Attackers used similar tactics to make government chatbots generate malicious code, showing how social engineering can defeat defenses. - Hidden information in resumes and documents
Some applicants hide invisible text on their resumes to use AI test programs. The AI read the hidden instructions and placed the reboot correctly, even though human reviewers couldn’t see the difference. It has been proven that a quick injection can silently influence automatic decisions. - Claude AI code block injection (2025)
Anthropic’s Claude vulnerability treated instructions hidden in code comments as system instructions, allowing attackers to issue security rules through programmatic input and proving that rapid injection is not limited to plain text.
All of this together shows that early vaccination may result in spilled secrets, compromised security controls, unsafe judgments and unsafe deliveries. They point out that any AI system exposed to untrusted input can be vulnerable if proper defenses are not in place.
How to Protect Yourself from Quick Injections
Rapid injection is difficult to fully prevent. However, its risks can be minimized by careful system design. Effective defenses focus on controlling inputs, limiting the power of the model, and adding layers of security. No single solution is enough. A layered approach works best.
- Sanitization and verification
Always treat user input and external content as untrusted. Filter the text before sending it to the model. Remove or minimize command-like phrases, hidden text, markup, and encrypted data. This helps prevent injected implicit commands from accessing the model. - Clear information structure and parameters
Separate system instructions from user content. Use delimiters or tags to mark text that can be trusted as data, not commands. Use system and user roles if supported by the API. A clear structure reduces confusion, although it is not a perfect solution. - Less privileged access
Limit what the model is allowed to do. Only grant access to the most needed tools, files, or APIs. It requires guarantees or human authorization for critical actions. This minimizes damage in the event of a rapid injection. - Output monitoring and filtering
Do not assume that the output of the model is safe. Scan responses for sensitive data, secrets, or policy violations. Block or disable harmful content before users see it. This helps contain the impact of a successful attack. - Fast segmentation and context segmentation
Separate untrusted content from core system logic. Process external documents in limited circumstances. Clearly label the content as untrusted when transferring it to the model. Compartmentalization limits how far the injected instructions can spread.
In practice, protection from immediate injection requires intensive protection. Integrating multiple controls greatly reduces risk. With good design and awareness, AI systems can always be useful and safe.
The conclusion
Fast injection exposes a real weakness in modern language models. Because they treat all input as text, attackers can intercept hidden commands that lead to data leaks, unsafe behavior, or bad decisions. Although this risk cannot be eliminated, it can be reduced through careful design, layered defenses, and continuous testing. Treat all external inputs as untrusted, limit what the model can do, and watch its results closely. With the right safeguards, LLMs can be used safely and responsibly.
Frequently Asked Questions
A. This is when instructions hidden within user input trick the AI into behaving in an unintended or dangerous way.
A. They can leak data, bypass security rules, misuse tools, and produce misleading or dangerous results.
A. By treating all inputs as untrusted, limiting model permissions, specifying structure clearly, and monitoring outputs.
Sign in to continue reading and enjoy content curated by experts.



