Let’s start to dive into the various kinds of generative AI subfields by starting with Large Language Models (LLMs). An LLM is (from Wikipedia):
a computerized language model consisting of an artificial neural network with many parameters (tens of millions to billions), trained on large quantities of unlabeled text using self-supervised learning or semi-supervised learning.
Though the term large language model has no formal definition, it often refers to deep learning models with millions or even billions of parameters, that have been “pre-trained” on a large corpus.
So, LLMs are Deep Learning (DL) models (aka, Neural Networks) trained with millions of parameters on a huge amount of text (this is why we call them “large”) and are useful to solve some language problems like:
- Text classification
- Question & Answering
- Document summarization
- Text generation
So, another important difference between standard ML models is that, in this case, we can train a DL algorithm that can be used for different tasks.
Let me explain better.
If we need to develop a system that can recognize dogs in images as we’ve seen before, we need to train a DL algorithm to solve a classification task that is: tell us if new, unseen images are representing dogs or not. Nothing more.
Instead, training an LLM can help us in all the tasks we’ve described above. So, this also justifies the amount of computing power (and money!) needed to train an LLM (which requires petabytes of data!).
As we know, LLMs are queried by users thanks to prompts. Now, we have to spot the difference between prompt design and prompt engineering:
- Prompt design. This is the art of creating a prompt that is specifically suitable for the specific task that the system is performing. For example, if we want to ask our LLM to translate a text from English to Italian, we have to write a specific prompt in English asking the model to translate the text we’re pasting into Italian.
- Prompt engineering. This is the process of creating prompts to improve the performance of our LLM. This means using our domain knowledge to add details to the prompt like specific keywords, specific context and examples, and the desired output if necessary.
Of course, when we’re prompting, sometimes we use a mix of both. For example, we may want a translation from English to Italian that interests a particular domain of knowledge, like mechanics.
So, for example, a prompt may be:” Translate in Italian the following:
the beam is subject to normal stress.
Consider that we’re in the field of mechanics, so ‘normal stress’ must be related to it”.
Because, you know: “normal” and “stress” may be misunderstood by the model (but even by humans!).
The three types of LLMs
There are three types of LLMs:
- Generic Language Models. These are able to predict a word (or a phrase) based on the language in the training data. Think, for example, of your email auto-completion feature to understand this type.
- Instruction Tuned Models. These kinds of models are trained to predict a response to the instructions given in the input. Summarizing a given text is a typical example.
- Dialog Tuned Models. These are trained to have a dialogue with the user, using the subsequent responses. An AI-powered chatbot is a typical example.
Anyway, consider that the models that are actually distributed have mixed features. Or, at least, they can perform actions that are typical of more than one of these types.
For example, if we think of ChatGPT we can clearly say that it:
- Can predict a response to the instructions, given an input. In fact, for example, it can summarize texts, give insights on a certain argument we provide via prompts, etc… So, it has features like an Instruction Tuned Model.
- Is trained to have a dialog with the users. And this is very clear, as it works with consequent prompts until we’re happy with its answer. So, it has also features like a Dialog Tuned Model.