Speak to me: How many words a model is reading | by Salvatore Raieli | Jul, 2023


| ARTIFICIAL INTELLIGENCE | LLM | NLP |

Why and how to overcome the inner limit of a Large Language Model


LLM context window
Photo by C D-X on Unsplash

LLMs have shown their skills in recent months, demonstrating proficiency in a wide variety of tasks, all through a single mode of interaction: prompting.

In recent months there has been a rush to broaden the context of language models. But how does this affect a language model?

This article is divided into sections, each of which answers one of the following groups of questions:

  • What is a prompt, and how do we build a good one?
  • What is the context window? How long can it be? What limits the length of a model's input sequence? Why is this important?
  • How can we overcome these limitations?
  • Do the models actually use the long context window?
Photo by Jamie Templeton on Unsplash

What is a prompt and what is a good prompt?

Simply put, a prompt is how one interacts with a large language model (LLM): we give the model instructions in text form. This textual prompt contains the information the model needs to produce a response. It can contain a question, a task description, content, and plenty of other information. Essentially, through the prompt we tell the model what our intent is and what we expect it to respond to.

The prompt can drastically change the behavior of the model. For example, asking the model “describe the history of France” is different from asking it “describe the history of France in three sentences” or “describe the history of France in rap form.”
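The three France prompts above share the same base request and differ only in the constraint added at the end. As a minimal sketch of that idea, the Python snippet below builds the variants from a single base task. The helper name `build_prompt` and its parameters are hypothetical, chosen only for illustration, and no model is called here.

```python
def build_prompt(task: str, constraint: str = "") -> str:
    """Join a base task with an optional constraint into one prompt string.

    `build_prompt` is a hypothetical helper, not part of any library.
    """
    task = task.strip()
    constraint = constraint.strip()
    # Append the constraint only when one was provided.
    return f"{task} {constraint}" if constraint else task


base_task = "Describe the history of France"

# Same request, three different constraints (the first one is empty).
prompts = [
    build_prompt(base_task),
    build_prompt(base_task, "in three sentences"),
    build_prompt(base_task, "in rap form"),
]

for prompt in prompts:
    print(prompt)
```

Each string would be sent to the model as a separate prompt; although the topic is identical, the constraint changes the length and style of the answer we should expect.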

To get adequate answers from the model, it is advisable to write a good prompt. In general, a good prompt should contain either a question or a set of instructions. It can also include a context (question + context). For example, we could ask the model to…


