Large Language Models: How to Sanitise User Input

  • January 8, 2024

Prompt engineering is a subfield of working with LLMs (Large Language Models). There are two primary approaches to prompt injection defence: (1) crafting a strategic system prompt and (2) sanitising the user prompt. Both deserve consideration when implementing an LLM in a business setting.


However, because the field of LLM prompt injection is so new and fast-evolving, it is difficult to know whether any single defence method can stop new injection techniques. Consequently, it is recommended to implement the Swiss Cheese Model: by layering several imperfect defence methods, rather than relying on one alone, you can work towards a much more secure system overall.


How do you go about creating a strategic system prompt?

Our blog post on crafting a strategic system prompt explains how you can delineate the task and role of the LLM through the system prompt, and how you can use the system prompt to delimit the instructions sent to the LLM. Those techniques could also be seen as using the system prompt to sanitise user input. This post, however, focuses on three of the main ways the user input itself can be sanitised: reducing the input size of the user prompt, moderating the input and the output, and implementing the principle of Least Privilege.

1. Reducing the input size of the user prompt

When developing the backend of an LLM application, the developer can cap the number of characters the backend will accept as input. This limit significantly affects the defence against prompt injections: the more space a user is given to input into the model, the greater the opportunity for a malicious user to craft an effective prompt injection. Consequently, it is recommended to allow only as many input characters as the business case and user experience require. Although this will not stop prompt injections outright, it provides a thin extra layer of protection against malicious users.

For example, if a chatbot's purpose is to perform small, simple tax calculations from a few provided parameters, there is no need to allow inputs longer than a few hundred characters. If the chatbot's purpose is summarisation, however, the user must be able to input large amounts of text (e.g. several paragraphs), since the whole point of the LLM is to condense large amounts of information into a few lines. In that case, restricting the character intake too aggressively would degrade the chatbot's functionality.
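As a minimal sketch of this idea, the backend can enforce a character budget before the prompt ever reaches the model. The limit and function name below are hypothetical and would be tuned to the business case:

```python
MAX_INPUT_CHARS = 300  # hypothetical limit for a small tax-calculation chatbot


def accept_user_prompt(prompt: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Reject prompts that exceed the configured character budget."""
    if len(prompt) > max_chars:
        raise ValueError(
            f"Input of {len(prompt)} characters exceeds the {max_chars}-character limit."
        )
    return prompt
```

For a summarisation chatbot, the same check would simply be configured with a far larger `max_chars`.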

2. Moderating the input and the output 

Following the Swiss Cheese Model, an LLM must be defended at every level, including against the possibility that a malicious actor has successfully infiltrated the system prompt and altered it to perform some alternative action. Suppose an LLM is employed to perform a specific task and respond to the user in a specific, structured style. You can then use that expectation as a filter and moderate the LLM's output before feeding it back to the user: if the model produces an unexpected output (in style or format), the output can be flagged and withheld. Because this filtering happens outside the LLM itself, it is immune to any changes introduced by the prompt injection.

a. Filter the output 

To illustrate, consider the case where the LLM is tasked with summarising a given text into a single paragraph. A filter placed after the LLM can check whether its output really is a single paragraph. If a malicious actor prompts the LLM to generate anything else, the program flags the output and does not present it to the user.
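This single-paragraph check can be sketched as a small post-processing step, assuming paragraphs are separated by blank lines (the function names and the withheld-response message are illustrative):

```python
def is_single_paragraph(text: str) -> bool:
    """Treat blank lines as paragraph breaks and count the paragraphs."""
    paragraphs = [p for p in text.strip().split("\n\n") if p.strip()]
    return len(paragraphs) == 1


def moderate_output(llm_output: str) -> str:
    """Withhold any LLM output that does not match the expected format."""
    if not is_single_paragraph(llm_output):
        # Never return the raw, unexpected output to the user.
        return "The response was flagged and withheld."
    return llm_output
```

The same pattern generalises to any expected structure, such as valid JSON or a fixed response template.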

b. Filter the input 

Similarly, you can filter the user's input message. For example, if the LLM's purpose has nothing to do with Python code, you can filter out any Python-like code before it is fed to the LLM, protecting it from code injection. This moderation can also act as a gatekeeper, deciding which user prompts are allowed to pass through to the LLM and which are flagged as potentially unsafe. Another approach is to identify potentially harmful delimiters and either remove them from the user prompt before it reaches the LLM or reject the prompt altogether.
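A minimal sketch of such an input gatekeeper follows. The patterns and delimiter list are deliberately naive, illustrative assumptions; a production system would likely combine rules like these with a dedicated moderation model:

```python
import re

# Naive signals of Python-like code (illustrative, not exhaustive).
CODE_PATTERNS = [r"\bimport\s+\w+", r"\bdef\s+\w+\s*\(", r"\bexec\s*\(", r"\beval\s*\("]

# Delimiters that might be abused to break out of a prompt template.
HARMFUL_DELIMITERS = ["```", "###", '"""']


def sanitise_input(prompt: str) -> str:
    """Reject code-like prompts and strip potentially harmful delimiters."""
    if any(re.search(pattern, prompt) for pattern in CODE_PATTERNS):
        raise ValueError("Prompt rejected: contains code-like content.")
    for delimiter in HARMFUL_DELIMITERS:
        prompt = prompt.replace(delimiter, "")
    return prompt
```

Whether to strip delimiters silently or reject the whole prompt is a product decision; rejection is stricter but produces more false positives.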

3. Implementing the principle of Least Privilege

The principle of Least Privilege involves restricting the LLM's access and capabilities to the minimum necessary for its intended task. This security principle minimises potential risks by ensuring that the LLM only operates within specific, well-defined boundaries, reducing the chance of unintended behaviour or misuse. Additionally, access should only be granted to identifiable, authorised users; for example, requiring users to sign in with their Google account before they can interact with the chatbot.
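One way to picture both halves of this principle is a tool allowlist gated behind authentication. This is a hypothetical sketch, with made-up tool and function names, not a prescription for any particular framework:

```python
# The LLM may only invoke tools on an explicit allowlist: the minimum
# capabilities needed for its intended task (here, summarisation only).
ALLOWED_TOOLS = {"summarise_text"}


def call_tool(tool_name: str, user_is_authenticated: bool) -> str:
    """Gate every tool call behind authentication and the allowlist."""
    if not user_is_authenticated:
        raise PermissionError("Sign in before using the chatbot.")
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"Tool '{tool_name}' is not permitted.")
    return f"running {tool_name}"
```

Even if a prompt injection convinces the model to request a dangerous tool, the surrounding code refuses to execute it.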

In a nutshell…

The two major prompt injection defence strategies are crafting a strategic system prompt and sanitising the user input. Although some system prompt strategies can also be seen as sanitising the user input, this blog post covered additional measures developers should consider when sanitising user input: reducing the size of the user prompt, moderating the input and the output, and implementing the principle of Least Privilege.

For more information regarding the power of LLMs and how they can be used within either your internal or client-facing applications, contact Praelexis AI. We are experienced in designing, evaluating, and deploying such LLM-powered applications and would love to be part of your generative AI journey.

* Content reworked from Bit.io, OpenAI, and Nvidia

** Illustrations in the feature image generated by AI

Internal peer review: Matthew Tam
Written with:
Aletta Simpson

