lauralikespi
- Aug 14, 2023
- 11 min read

ChatGPT Take 7 - How the Experts Do It

Last week I was extremely lucky (or terminally online) to get a ticket to an interview in London on ChatGPT. The big draw to the event at UCL was Mike Heaton, a Member of Technical Staff at OpenAI (previously Machine Learning at Stripe). The interview was hosted by Peter Nixey who does many things - builds businesses, runs courses on LLMs, organises talks - but who I would definitely call a ChatGPT expert.

Unfortunately, but not unexpectedly, we were asked for the interview to be off the record so I won’t be writing about it. However, before the interview Peter sent attendees an email with some homework - two prompts to send to ChatGPT. Like the delightful nerd I am, I did the homework. I wrote notes. And as per my usual style, I am now super excited to be telling people about it (this is a joke, in case you don’t know me - every time I learn or hear about anything I get excited like a golden retriever and just want everyone to learn about it as well. Thank god for blogging).

Therefore this week’s topic will be around how an expert (or a power user) uses ChatGPT (and how that is very different from how you or I may think about using these tools).

In case you’ve missed my previous ramblings:

My First Foray with ChatGPT - am I doing this right? - 15th May 2023

ChatGPT Take 2 - my new website assistant - 24th May 2023

ChatGPT Take 3 - Write More Like Me - 1st Jun 2023

ChatGPT Take 4 - Code for Me - 22nd Jun 2023

ChatGPT Take 5 - Let's Get Down to Business - 20th Jul 2023

ChatGPT Take 6 - Will You Be My Friend? - 1st Aug 2023

Homework - Ask the SuperBrain

You know the drill by now - we always ask the SuperBrain how best to use it. But how about asking the SuperBrain how it works?

In an email titled How to get the most out of today's OpenAI interview, the organiser of the event sent through two prompts to be “dropped into” ChatGPT. The aim of this homework, or pre-work really, was to make sure people don’t spend time asking “basic” questions and that the topic of conversation (which was high level but technical) could be understood.

Let’s take a small step back here, up to now in this series I’ve been avoiding using words like prompt or context window but they are simple to understand and very useful if you want to get the most out of these tools.

L: What is a prompt?

ChatGPT: A prompt is a specific instruction or request given to a person or a computer program to elicit a particular response or action. In the context of human-computer interaction or natural language processing, a prompt is a piece of text that serves as an input to a language model, like the GPT-3.5 model that powers me, to generate a coherent and contextually relevant output.

When interacting with a language model, you provide a prompt to instruct the model on what kind of information or response you are seeking. The model then uses its training data and patterns it has learned to generate a coherent and contextually relevant response based on the input prompt.

For example, if you were to ask me, "Can you explain the concept of supply and demand?", that question would serve as a prompt for me to provide an explanation of the supply and demand concept.

In essence, a prompt is the starting point of a conversation or an instruction that guides the generation of text or responses by a language model.

So a prompt in terms of language models is very similar to how we use the word in English. “Jack gave Jill a prompt to start walking down the hill” It can be thought of as an instruction. Even though it feels like we are having a conversation with ChatGPT, we are actually always instructing it to continue responding to us.

As we call talking to ChatGPT and other language models prompting, people who are doing this well or as a job are now being called (or calling themselves) prompt engineers.

Prompt 1 - The context window

The prompt provided for me to give ChatGPT:

L: I'd like to understand the concept of the context window in an LLM and in particular how it relates to what I should think about during an extended chat conversation. I'd like you to teach me what I need to know but to do so through the medium of short form Q&A with a bit of hinting along the way. Don't just lecture me but ask a series of questions that help me understand what I need to know. Assume I don’t know anything about tokens either - for the sake of simplicity, just pretend that the window is measured in words and that one word equals one token. If I ask at the end then tell me a bit more about what tokens are. Also gently introduce me to the idea of how the context window represents both the input and the output. But don’t make any assumptions about what I know already. Make sure to explain any terminology.

Before we jump into what a context window is, or how ChatGPT responded, let’s take a second to analyse this prompt. The first thing which jumped out to me was just the length of the prompt. Mine are using one or two sentences. A bit longer when I follow instructions given to me by the SuperBrain (I probably should have realised this was a key aspect).

The prompt begins with a clear goal - “I'd like to understand the concept of the context window in an LLM”. With an added specific detail, “in particular how it relates to what I should think about during an extended chat conversation”.

Next, the expert prompter included the how for ChatGPT to respond, “I'd like you to teach me what I need to know but to do so through the medium of short form Q&A with a bit of hinting along the way”. I found this an interesting choice of wording - “teach”, “Q&A”, “hinting”. Again, a bit more specificity was added, “Don't just lecture me but ask a series of questions that help me understand what I need to know.” This request for how ChatGPT would respond to the prompt feels also the opposite of how ChatGPT tends to respond (a random lecture with lots of lists, and very few questions).

The next sentence was to explain the knowledge of the audience of the reply, “Assume I don’t know anything about tokens either”. Again adding a specific command to this part of the prompt, “for the sake of simplicity, just pretend that the window is measured in words and that one word equals one token.” [don’t worry we’re going to explain tokens in a second]

The final sentences of the prompt are specific behaviour for ChatGPT to follow:

“If I ask at the end then tell me a bit more about what tokens are.”
“Also gently introduce me to the idea of how the context window represents both the input and the output.”
But don’t make any assumptions about what I know already.
“Make sure to explain any terminology.”

The first of these sentences tells ChatGPT what to do if the prompter asks something specific. The next tells ChatGPT how to get a specific topic across (particularly interesting is the use of the word “gently”). The final two sentences are reinforcements of the instruction to assume the prompter has no knowledge.

Wow.

And the SuperBrain responded:

ChatGPT: Sure, I'll guide you through a series of questions to help you understand the concept of the context window in an LLM (Language Model) and its relation to extended chat conversations.

1. Have you heard about language models before, and do you know what their main purpose is?

2. In the context of language models, what do you think the term "context" refers to?

3. Imagine you're having a conversation with someone and they only consider the last few words you said. How might that affect the flow and coherence of the conversation?

4. So, what do you think the context window does in an LLM, and how might it impact the model's understanding of a conversation?

5. When the model processes a sequence of words in the context window, does it only consider the input text, or does it also take the output into account? Why would that be important during an extended chat conversation?

Hint: The context window allows the model to see a limited number of words, and it can affect both input understanding and output generation.

6. In an extended chat conversation, if the context window is too small, what could happen?

7. Conversely, what if the context window is too large?

8. How could the size of the context window be optimized to strike a balance between understanding the conversation and generating relevant responses?

Hint: Consider the trade-off between context size and computational resources.

9. Now, let's talk about tokens. In the context of language models, what are tokens?

Hint: Tokens are the basic units that the language model uses to process and understand text.

10. Why is it essential to understand tokens when working with language models like LLMs?

11. How might the concept of tokens relate to the size of the context window in terms of computational efficiency?

12. Finally, based on what you've learned about the context window and tokens, how might an LLM handle an extended chat conversation efficiently and effectively?

Feel free to answer the questions, and I'll provide feedback and explanations along the way.

Again, wow. This is like a supercharged SuperBrain response. The response really feels like ChatGPT is trying to teach me, by guiding me (giving hints) and also is asking me questions.

Tokens and Context Windows

L: Are tokens words?

ChatGPT: Yes, tokens are often associated with words, but they can represent smaller or larger units of text, depending on the language model's configuration. In a basic setting, a token can indeed correspond to a word in the text. However, it's essential to note that tokens can also represent subwords or even characters, especially in models that use subword tokenization like BPE (Byte Pair Encoding) or SentencePiece.

So, in summary, while tokens can be words, they can also be subwords or characters, depending on how the language model processes and encodes the text.

Notice, I have gone back to my natural short worded prompts. And while ChatGPT is still behaving differently than my usual interactions, but the tone feels different (I can’t tell if I’m making this up because I know what the original prompt was).

I then tried to ask some questions about the context window (as was suggested by ChatGPT’s response). I’m still using ChatGPT 3.5 (I know, I know) and I believe that might have impacted this response - but, similar to when I asked ChatGPT to help with coding in a previous blog, ChatGPT gave me some incorrect answers. The SuperBrain kept insisting it only remembered a few replies. But I know ChatGPT remembers the entire chat window. Eventually we got to:

ChatGPT: Apologies for the confusion in my previous responses. You are correct. As an AI language model, I do have memory of the conversation's immediate context, including information from previous replies within the same interaction. This allows me to maintain coherence and continuity in our conversation and provide relevant responses based on the current discussion.

However, once the conversation ends, I don't retain any information about the specific interaction, and subsequent interactions start fresh without any knowledge of past conversations. My memory is limited to the current context window within the ongoing conversation, and I don't have access to any external data or information beyond the scope of this specific chat session.

Let’s just check in that everyone is understanding the conversation:

Tokens are the basic unit of language in an LLM (usually the smallest). This can be words, phrases, sentences or even parts of words.
Context window is the number of tokens an LLM uses to generate responses (for ChatGPT 3.5 this is 2,000 and for ChatGPT4 it is 32k!).

More Tokens Please

Going back to ChatGPT’s initial response, I tried to dig into understanding context windows a little bit more:

L: How might the concept of tokens relate to the size of the context window in terms of computational efficiency?

ChatGPT gave a long answer (which I found easy to understand), which ended with:

ChatGPT: Optimizing the size of the context window is crucial for balancing computational efficiency and the model's ability to generate meaningful responses. A suitable context window size should be chosen to strike the right balance between understanding the conversation and keeping computational resources in check.

The relationship between context window and tokens is the larger the context window the more tokens. Think of this like a conversation - the longer the conversation you remember, the more words and sentences you have to remember.

Having a larger context window means more meaningful responses can be given, but there are more tokens so it will take more computation power. On the other hand, a smaller context window will require less computation power, but the conversation will not have access to as much information, potentially leading to worse conversations.

To me it logically made sense that smaller tokens meant the algorithm would be slower, but I was wrong:

L: So do smaller tokens mean less computational efficiency?

ChatGPT: In general, smaller tokens can lead to better computational efficiency in language models. When tokens are smaller, the model deals with more granular units of text, which can reduce the memory footprint and processing time compared to larger tokens.

For example, consider a sentence with smaller tokens like "I love AI." The model may tokenize it as ['I', 'love', 'AI', '.'], where each token represents a single word or even a subword. In contrast, larger tokens might represent entire phrases or even sentences, leading to a higher computational burden.

Smaller tokens are also advantageous during training and inference because they allow the model to process and learn from more fine-grained linguistic information. This can lead to better generalization and understanding of the language.

However, it's important to note that the tokenization process can be a trade-off between granularity and vocabulary size. Smaller tokens might result in a larger vocabulary, which can increase memory consumption. Nonetheless, when it comes to computational efficiency, smaller tokens are generally preferred as they enable faster processing and better utilization of resources.

L: Does an LLM use smaller tokens and bigger context window (eg one interaction) to handle an extended chat conversation efficiently and effectively?

ChatGPT: Yes, using smaller tokens and a bigger context window (e.g., one interaction) is a common approach for LLMs to handle extended chat conversations efficiently and effectively.

Smaller tokens provide more fine-grained linguistic information, allowing the model to understand the text at a more detailed level. This can lead to better comprehension and more accurate responses during an extended chat conversation.

On the other hand, a bigger context window enables the model to consider a longer history of the conversation, providing more context for generating coherent responses. With a larger context window, the model can take into account not just the most recent input but also previous exchanges, maintaining better continuity in the conversation.

By combining smaller tokens and a bigger context window, the LLM can strike a balance between understanding the conversation at a granular level and utilizing the context effectively. This approach allows the model to generate meaningful responses based on a rich history of the conversation without compromising computational efficiency. It helps in creating a more engaging and interactive chat experience for users.

To summarise, to allow ChatGPT to perform quickly and to get the best conversational experience, they combine smaller tokens (eg words or parts of words) and big context windows (ie the entire conversation)

Summary

Prompting Like a Pro

We’re been prompting totally wrong, and it’s a hard habit to break. We need to write longer prompts with more details (goal, audience, tone / style, tips).

ChatGPT 4 is much better than ChatGPT3.5

We now more deeply understand the context window (2,000 vs 32k tokens makes for a much better conversation). There was also a discussion in the interview about the differences between the versions - 4 has a bigger vocabulary and is better at logical reasoning, the core components of language models. I’ve been convinced to make the leap. From next blog post, I will have access to both versions!

Next Time

For those who actually paid attention, I mentioned two prompts in the homework. However, the context window (and the how to prompt) was very interesting and deserved an entire blog. Next time, I will take a look at the other prompt (about using your own database), and have a go writing my own prompt using this new method.