What is a Large Language Model (LLM)?

With the rise in popularity of applications such as ChatGPT, it has become important to understand what a Large Language Model (LLM) is and, if you are in business, how to use one as a tool. This article looks at what an LLM is, when a question is “LLMable” and how an LLM works.

What is an LLM?

What is a Large Language Model (LLM)? This is the ideal question to ask an LLM. I asked ChatGPT (an example of an LLM chatbot) to answer this question for me and this is the response I received:

“A large language model is an AI system trained on vast text data to understand and generate human language. It's capable of various tasks like translation, summarisation, and question answering.”

Next, I turned to an academic article by Chang et al. (2023), who define an LLM as follows:

“Language models (LMs) are computational models that can understand and generate human language. LMs have the transformative ability to predict the likelihood of word sequences or generate new text based on a given input. N-gram models, the most common type of LM, estimate word probabilities based on the preceding context. However, LMs also face challenges, such as the issue of rare or unseen words, the problem of overfitting, and the difficulty in capturing complex linguistic phenomena. [...] Large Language Models (LLMs) are advanced language models with massive parameter sizes and exceptional learning capabilities.”

In summary, here are a few characteristics of an LLM:

It is an AI model.
It is trained on a vast amount of data.
It can “understand” a natural language prompt.
It can produce a natural language response.
It uses probability to be able to “understand” prompts and create responses.

It is important to note that LLMs are not very good at creating truly original text: they can only produce content that resembles the content they were trained on.

When is a question “LLMable”?

In an attempt to find a simpler definition, I turned to Urban Dictionary. It does not have an entry for LLMs, but it does have one for “LLMable”:

“The ability to utilise Large Language Models (LLMs). Reasoning questions are not LLMable, questions about how many cats can fit in a school bus are LLMable.”

Sometimes an LLM hallucinates, that is, it produces a response that sounds plausible but is incorrect. One way to reduce hallucinations is to ask the LLM only questions that it can answer. Not all questions are “LLMable” and, unfortunately, the model prioritises giving an answer over giving a correct one. That means that if it is not able to answer, it is likely to hallucinate.

What, then, are “LLMable” questions? LLMs are good with text-based questions. Andrey Kudryavets’s LinkedIn article explains this simply. You could use an LLM if you want to:

Shorten a piece of text or summarise it.
Have the spelling and grammar of a text checked.
Change the tone of a piece of writing.
Find synonyms, antonyms or the right words to explain something.
Extract information from a piece of text.
Manipulate a piece of text in any way.

How does an LLM work exactly?

How does the LLM know how to write? An LLM makes use of probability. Essentially, it predicts the most probable word to follow the text so far. To a user, it might intuitively feel as though the LLM “understands” what it is being told to do. In reality, however, it does not understand; it simply predicts successfully what you need.
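To make the idea of "predicting the most probable next word" concrete, here is a minimal sketch in Python of a bigram model, the simplest kind of the N-gram models mentioned in the definition by Chang et al. It is a toy illustration only: a real LLM uses a neural network with a massive number of parameters rather than simple word counts. The training text and the function names are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        followers[current_word][next_word] += 1
    return followers


def next_word_probabilities(model, word):
    """Turn the follower counts for `word` into probabilities."""
    counts = model.get(word.lower())
    if not counts:
        return {}  # the word was never seen with a follower during training
    total = sum(counts.values())
    return {candidate: count / total for candidate, count in counts.items()}


def predict_next_word(model, word):
    """Return the most probable next word, or None if the word is unseen."""
    probabilities = next_word_probabilities(model, word)
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


# Tiny made-up training text, purely for illustration.
training_text = "the cat sat on the mat the cat ate the fish"
model = train_bigram_model(training_text)

print(next_word_probabilities(model, "the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(predict_next_word(model, "the"))        # cat
print(predict_next_word(model, "dog"))        # None
```

In the training text, "the" is followed by "cat" twice, and by "mat" and "fish" once each, so "cat" is the most probable next word. The word "dog" never appears, so this toy model has nothing to predict from, which is the "rare or unseen words" problem mentioned in the quoted definition.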

In conclusion…

LLMs and LLM chatbots like ChatGPT have gained popularity at a rapid rate. This raises the questions: what is an LLM, how does it work, and what types of questions can it answer? Ultimately, an LLM is an AI model that is trained on vast amounts of data, can be prompted in natural language, generates responses in natural language and uses prediction to do so. It is good at generating and manipulating text, and it is an ideal tool for activities such as summarising, proofreading and helping you find the right words. More and more businesses are using LLM chatbots as internal or external tools. Praelexis is the ideal partner for your business's LLM journey. Contact us today for a consultation on how LLMs can be used in your business.

Bibliography:

Chang, Y., Wang, X., Wang, J., Wu, Y., Yang, L., Zhu, K., Chen, H., Yi, X., Wang, C., Wang, Y. and Ye, W., 2023. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology.

Kudryavets, A., 2023. Large Language Models: what are they good and not good for? LinkedIn. https://www.linkedin.com/pulse/large-language-models-what-good-andrey-kudryavets/
