Prompt engineering falls within the field of LLMs (Large Language Models). There are two levels of Prompt Engineering: system prompts and user prompts. Unfortunately, there are some dangers associated with malicious user prompts. This is especially important when an LLM is utilised in a business context.
Why are some user prompts dangerous?
For most conversational LLM applications, the user prompt is expected to be a question regarding a specific subject. However, not all users have good intentions. Malicious actors may attempt to use prompt injection methods to overwrite the system prompt, effectively hijacking the LLM application.
What is prompt injection?
Prompt injection is intentionally crafting input queries or user prompts for AI systems in a deceptive or harmful manner. The goal is to manipulate the AI system responses, generating biased, offensive, or unintended content. These prompt injections could also be utilised to exploit vulnerabilities in the system.
The Open Worldwide Application Security Project (OWASP), to provide a robust foundation for the safe and secure utilisation of LLMs, has put AI prompt injection attacks first on its Top 10 for LLMs list. Luckily, prompt injection defence is an active area and there are standard best practices to defend against these malicious injections.
What are the standard best practices of defence against prompt injection?
There are two primary approaches to crafting a strong and safe system prompt.
(1) The first is to craft a well-defined and strategic system prompt. Providing a system prompt to an LLM that stipulates its primary realm, task, context, and style will make it more robust against attacks.
The different ways of crafting a well-defined system prompt include:
- Specifying the role and the task of the LLM
- Using instructive modal verbs
- Delimiting the instructions
(2) Add intentional (and strategic) rules and procedures to the LLM that can strengthen the system prompt against malicious actors by sanitising the user prompt from unwanted capabilities. It is advised to consider both of these strategic approaches when developing your system prompt.
The different ways of sanitising the user prompt include:
- Reducing the input size of the user prompt
- Moderating the input and output
- Implementing the principle of Least Privilege
Methods of crafting a strategic system prompt and sanitising the user prompt sometimes overlap. To better understand these methods and their similarities, read our blogs.
What is the primary goal of prompt injection defence?
Safeguarding against malicious user prompts is called prompt injection defence. The primary goal is to design the system so the output has no critical implications. It is essential to clearly define what a critical implication means to your project, as it will differ from one project to another. For example, a chatbot that replies to general public knowledge questions would be evaluated on different standards than a chatbot with access to your client’s private information via API calls to your database.
For more information regarding the power of LLMs and how they can be used within either your internal or client-facing applications, contact Praelexis AI. We are experienced in designing, evaluating, and deploying such LLM-powered applications and would love to be part of your generative AI journey.
* Content reworked from Bit.io, OpenAI, and Nvidia
** Illustrations in the feature image generated by AI
Internal peer review: Matthew Tam
Written with: Aletta Simpson
Unlock the power of AI-driven solutions and make your business future-fit today! Contact us now to discover how cutting-edge large language models can elevate your company's performance: