
AI & UXR

Everything You Need to Know About Tokens, Data Volumes and Processing in ChatGPT


4 MIN

Nov 26, 2024

Introduction to tokens and processable data sets 

When working with ChatGPT, you will quickly come across a central concept: tokens. But what are tokens, and why are they important? Tokens are the smallest units of information that the model can process - they can be whole words, parts of words or even punctuation marks. The length of a token varies with language and context, but on average a token corresponds to about 4 characters or 0.75 words.
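
If you want to check this for yourself, OpenAI publishes the open-source tokenizer library tiktoken. Here is a minimal sketch, assuming the "cl100k_base" encoding (check which encoding your model actually uses):

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")  # encoding name is an assumption
text = "Tokens are the smallest units of information that the model can process."
tokens = encoding.encode(text)

print(len(tokens))               # number of tokens in the sentence
print(len(text) / len(tokens))   # characters per token, typically around 4 for English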


Why is this relevant? Because the maximum number of tokens that ChatGPT can process in a conversation or analysis determines how much information the model can handle at once. Currently, the token limit is 8192 tokens. This means that the entire content - both your questions or data and the answers that ChatGPT generates - must not exceed this limit. The token limit therefore also determines how long a single conversation can be before older parts of it are "forgotten".


For comparison: at roughly 0.75 words per token, 8192 tokens correspond to about 6,000 words, or around 20 to 25 pages of an average book with 250 to 300 words per page. This shows that the model can process quite a lot of information in one go - but with long texts or complex data, the limit can also be reached quickly.
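
The back-of-the-envelope arithmetic, using the figures from the text (0.75 words per token, 250 to 300 words per page):

tokens = 8192
words = tokens * 0.75    # roughly 6,100 words
print(words / 300)       # about 20 pages at 300 words per page
print(words / 250)       # about 25 pages at 250 words per page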


Dealing with large amounts of data in ChatGPT 

Suppose you want to analyse an entire chapter of a book - no problem! But what happens if the chapter is longer than 8192 tokens? In such cases, ChatGPT cannot process the data completely in one go. A common assumption is that the model then simply splits the data into digestible sections on its own - but this does not happen automatically. You have to manually divide the data into smaller sections and control the data flow.


Here are a few tips on how best to do this (a small code sketch follows the list):

  • Segment your text into thematically meaningful sections: Instead of sending everything at once, divide the text into smaller chunks that are cohesive and easier to digest.

  • Link the sections together: To ensure that the context is not lost, briefly summarise what has been discussed so far at the beginning of a new section. This helps to maintain context.

  • Identify key information: If you know that certain parts of the text are more important than others, focus on them first. This allows you to use the token limit more efficiently.
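
As a starting point for the first tip, here is a minimal sketch of splitting a long text into chunks that stay under a token budget. It assumes the tiktoken library and the "cl100k_base" encoding; the 2,000-token budget is an arbitrary example value that leaves room for your question and the answer:

import tiktoken

def split_into_chunks(text, max_tokens=2000, encoding_name="cl100k_base"):
    encoding = tiktoken.get_encoding(encoding_name)
    paragraphs = text.split("\n\n")              # prefer paragraph boundaries
    chunks, current = [], ""
    for paragraph in paragraphs:
        candidate = (current + "\n\n" + paragraph).strip()
        if current and len(encoding.encode(candidate)) > max_tokens:
            chunks.append(current)               # budget reached: close the chunk
            current = paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    # note: a single paragraph longer than the budget still ends up as one oversized chunk
    return chunks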


Strategies for optimising the use of the token limit 

  • Focus on important data: To use the tokens efficiently, identify the most important points before you send the text. This saves space and gets you answers on the genuinely relevant topics more quickly.

  • Summarise where possible: If you have a huge amount of data, condense the text as far as you can. The aim is to fit as much as possible into the token limit without losing context.

  • Iterative processing: If all the context is important but the amount of data is too large, process the information iteratively. This means: submit the data in parts and briefly summarise the most important information after each section so that the overall context is retained (a sketch of this approach follows below).
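
Here is a minimal sketch of such an iterative loop with a rolling summary, using the official openai Python client. The model name and the prompts are placeholder assumptions, not a recommendation:

from openai import OpenAI

client = OpenAI()   # expects OPENAI_API_KEY in the environment
summary = ""
chunks = ["first section ...", "second section ..."]   # e.g. from split_into_chunks() above

for chunk in chunks:
    response = client.chat.completions.create(
        model="gpt-4",   # placeholder model name
        messages=[
            {"role": "system", "content": "You analyse text that is passed in sections."},
            {"role": "user", "content": f"Summary so far:\n{summary}\n\n"
                                        f"Next section:\n{chunk}\n\n"
                                        "Analyse this section and update the summary."},
        ],
    )
    summary = response.choices[0].message.content   # carries the context into the next round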


Time dependency of processing 

You may be asking yourself: ‘What happens if I take a long break in a chat? Does ChatGPT then forget everything?’ The good news is that processing is not time-dependent. Regardless of whether you reply within minutes or after hours or even days - as long as the chat remains open and the token limit is not reached, the context is retained.


This means that long pauses do not affect the chat. Nevertheless, with very long chats it can happen that earlier information is ‘forgotten’. Why? Because the token limit also applies to the entire chat history. Once the limit of 8192 tokens is reached, a kind of ‘memory loss’ sets in: older parts of the conversation are removed to make room for new content. It therefore makes sense to summarise the chat regularly or repeat important points.
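
The following is only an illustration of the principle, not ChatGPT's actual implementation: when a history exceeds the token budget, the oldest messages are the first to go.

import tiktoken

def trim_history(messages, max_tokens=8192, encoding_name="cl100k_base"):
    encoding = tiktoken.get_encoding(encoding_name)
    def count(message):
        return len(encoding.encode(message["content"]))
    while len(messages) > 1 and sum(count(m) for m in messages) > max_tokens:
        messages.pop(0)   # "memory loss": drop the oldest message first
    return messages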


Another detail: If you are processing large amounts of data in smaller sections, it is helpful to always clearly label how the sections relate to each other. This helps ChatGPT to understand the context and process the data correctly.


Feedback at token limit 

One important point: as soon as the token limit is reached, ChatGPT will tell you. This happens so that you are informed in good time and the context is not lost unexpectedly. You then have the option of summarising parts of the conversation, removing irrelevant information, or taking other measures so that the conversation can continue efficiently.


Practical tips and best practices 

To get the most out of ChatGPT, it is helpful to focus on the context and relevance of the information. The accuracy and precision of the data you send to ChatGPT will have a direct impact on the quality of the analysis. It is therefore worth preparing the data well before sharing it in the chat.


If you are working with particularly large amounts of data, it may be useful to use external tools to analyse, shorten or summarise data before sending it to ChatGPT. In this way, you can make optimum use of the space in the token limit.


For long chats, it's always a good idea to regularly repeat key points or create summaries. This keeps the context clear and ensures that ChatGPT maintains an overview.


In case you're wondering, there is no hard and fast rule for token consumption per message. Sometimes a simple question may only consume a few tokens, while a complex question or long answer may require several hundred tokens. The important thing is simply to keep an overview so that the token limit is not reached too early.
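
If you want to keep that overview more systematically, here is a small sketch with a running tally, again assuming tiktoken and the "cl100k_base" encoding:

import tiktoken

encoding = tiktoken.get_encoding("cl100k_base")
chat = [
    "What are tokens?",
    "Tokens are the smallest units of text a language model processes ...",
]

total = 0
for i, message in enumerate(chat, start=1):
    n = len(encoding.encode(message))
    total += n
    print(f"Message {i}: {n} tokens (running total: {total})")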


Outlook for future developments 

Of course, it would be nice if we never reached the token limit. In fact, there are already plans to increase the amount of data that can be processed in future versions of ChatGPT. The current limit is 8192 tokens, but there are already variants such as ‘GPT-4-32k’ that can process up to 32,768 tokens. That is exactly four times the current limit and makes it possible to process even longer texts or more complex chats in one go.


Technical statistics and details of this chat 

By the way: this text is about 1,600 tokens long, and the chat I used to develop this post used about 1,000 tokens. One more tip: a good tool for counting tokens is https://platform.openai.com/tokenizer - ChatGPT itself is not always reliable when asked to count them.


AUTHOR

Tara Bosenick

Tara has been active as a UX specialist since 1999 and has helped to establish and shape the industry in Germany on the agency side. She specialises in the development of new UX methods, the quantification of UX and the introduction of UX in companies.


At the same time, she has always been interested in developing a corporate culture in her companies that is as ‘cool’ as possible, in which fun, performance, team spirit and customer success are interlinked. She has therefore been supporting managers and companies on the path to more New Work / agility and a better employee experience for several years.


She is one of the leading voices in the UX, CX and Employee Experience industry.
