Skip to main content

Prompt Injection

Definition

Prompt injection is a type of attack against artificial intelligence models, particularly large language models (LLMs), where malicious input is crafted to override or manipulate the model’s intended instructions or safety guidelines. Attackers insert hidden directives within prompts to steer the AI into performing unintended actions, generating harmful content, or revealing sensitive information. This exploits vulnerabilities in how AI models interpret and process user input.