
AI & UXR, TOKEN, LLM

Why AI Sometimes Can’t Count to 3 – And What That Has to Do With Tokens

3 min read · Sep 25, 2025

When AI miscounts

‘How many rs are there in strawberry?’ – a simple question, right?

Not for language models. For a long time, the common AI answer was ‘2’. Anyone who counts along quickly realises that this is wrong. Strawberry contains three rs – clearly.

 

It's a mistake that seems so absurdly simple that it makes you wonder: how can a highly developed language model like ChatGPT fail at this?

The answer takes us to the heart of language model architecture – more specifically, to the world of tokenisation. And that's more fascinating than it seems at first glance.


What is actually happening here?

Language models such as ChatGPT do not really count. Nor do they analyse letters, at least not in the way we would.

Instead, they break down text into so-called tokens – smaller units with which the model was trained. Tokens can be an entire word, part of a word or even just a syllable.

 

And here's the crux of the matter: in ‘strawberry’, the letter ‘r’ never appears as a token of its own.

The model does not ‘see’ the letter individually, but embedded in larger text segments. It only recognises the ‘r’ if explicitly prompted to do so – and even then, it may be wrong depending on the prompt and context.

 

This is because language models do not work deterministically, but probabilistically: they guess what is probably meant – and not necessarily what would be mathematically correct.


Tokenisation using the example of ‘strawberry’

The word strawberry, for example, is broken down into exactly two tokens by GPT-4o:

["straw", "berry"]

This means that the model recognises strawberry as two typical word components. And the ‘r’? It is contained in both tokens – but never in isolation. The language model does not count letters, but probability-based clusters of meaning.
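A minimal Python sketch makes the difference between the two views concrete. The two-token split ["straw", "berry"] is taken from the GPT-4o example above; the point is simply that neither token *is* the letter ‘r’:

```python
# Character-level view: what humans do when they count letters.
word = "strawberry"
letter_count = word.count("r")  # counts every individual 'r'
print(letter_count)  # 3

# Token-level view: roughly what the language model "sees".
# The split follows the GPT-4o example above.
tokens = ["straw", "berry"]

# Asking "how many 'r' tokens are there?" gives zero --
# the letter is buried inside larger units.
standalone_r_tokens = tokens.count("r")
print(standalone_r_tokens)  # 0
```

The character-level answer is 3; the token-level answer is 0. The model operates much closer to the second view than the first.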


So anyone who asks how many rs there are in strawberry is asking a counting question of a semantic probability model. No wonder it often got the answer wrong in the past.


Even more exciting: German words

German language, difficult tokens: our beloved compound words are a real stress test for tokenisers. But surprisingly, GPT-4o doesn't do too badly here:


Word | Tokens | Count
Herausforderung (Challenge) | ["Hera", "us", "ford", "er", "ung"] | 5
Krankenhausaufenthalt (Hospital stay) | ["Kranken", "haus", "auf", "ent", "halt"] | 5
Datenschutzgrundverordnung (General Data Protection Regulation) | ["Datenschutz", "grund", "ver", "ord", "nung"] | 5
Arbeitszeiterfassungspflicht (Obligation to record working hours) | ["Arbeits", "zeit", "er", "fass", "ungs", "pflicht"] | 6
Selbstverständlichkeit (Matter of course) | ["Selbst", "ver", "ständ", "lich", "keit"] | 5


This shows that the tokeniser recognises many meaningful units – such as ‘-ung’, ‘-keit’ or ‘Selbst’ – and splits compounds in a semantically sensible way.


But here, too, the same applies: no model counts letters. It recognises, processes and combines tokens. Only when it is explicitly guided – for example with a spelled-out word or a worked example in the prompt – can it count reliably.
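One way to prepare the model is to make the letters visible to it, for example by spelling the word out character by character inside the prompt. A hedged sketch – the prompt wording is purely illustrative, not an official recipe:

```python
# Build a counting prompt that spells the word out letter by letter,
# so each character becomes its own visible unit for the model.
def spelled_out_prompt(word: str, letter: str) -> str:
    spelled = "-".join(word)  # e.g. "s-t-r-a-w-b-e-r-r-y"
    return (
        f"The word '{word}' spelled out is: {spelled}. "
        f"How many times does the letter '{letter}' appear? "
        "Answer with the number only."
    )

print(spelled_out_prompt("strawberry", "r"))
```

With the hyphens in place, each letter tends to become its own token, which turns the fuzzy question into one the model can actually see.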


Why this is important (also for UX & prompting)

These small counting errors tell a big story – about the nature of language models.

 

Language models do not calculate; they probabilistically rank language. They can be brilliant at leaps in meaning and nuances, but they can also be completely off the mark when it comes to simple structural questions – such as counting letters, alphabetical sorting or mathematical sequences.

 

When it comes to prompting, this means:

  • If precision is important (e.g. for counting, formatting, extraction) → formulate tasks very clearly

  • If hallucinations are to be avoided → provide examples

  • If UX researchers work with AI → keep the behaviour of the tokeniser in mind
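When the task really is structural – counting, sorting, extracting – it is often safer to let ordinary code do the deterministic part and reserve the model for the linguistic part. A minimal sketch of that division of labour:

```python
# Deterministic operations that a probabilistic model can get wrong,
# but plain code never will.
def count_letter(word: str, letter: str) -> int:
    return word.lower().count(letter.lower())

def sort_alphabetically(words: list[str]) -> list[str]:
    return sorted(words, key=str.lower)

print(count_letter("Strawberry", "r"))                      # 3
print(sort_alphabetically(["berry", "Apple", "cherry"]))    # ['Apple', 'berry', 'cherry']
```

Let the model summarise, rephrase and interpret; let code count.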


Because many supposed ‘errors’ are actually consequences of the architecture. And once you understand that, you can write much better prompts – and get better results.


Conclusion: Tokens are the new semantics – or the new tripwire

The error with the “r” in strawberry is not a trivial bug – it is an invitation to better understand language models.

 

Anyone who works with AI should be aware that:

  • AI does not understand letters – it understands tokens.

  • AI does not count – it estimates probabilities.

  • AI is not stupid – it is just trained differently.

 

Those who know this are less likely to stumble over simple tasks – and get more out of complex prompts.


Bonus: Try it yourself

🔧 Tool tip

 If you want to try for yourself how words are broken down into tokens, you can use this OpenAI tool, for example:

  

👉 https://platform.openai.com/tokenizer 


🧠 Prompt tip for counting

‘Please count exactly how many times the letter “r” appears in the following word: strawberry. Just give me the number.’


📣 Participatory question

 How many s are there in ‘Mississippi’?
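(And if you want to check your own answer deterministically before asking a model, one line of Python settles it:)

```python
# Plain string counting is deterministic -- no tokens involved.
print("Mississippi".count("s"))  # 4
```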

 

💌 Not enough? Then read on – in our newsletter. It comes four times a year. Sticks in your mind longer. To subscribe: https://www.uintent.com/newsletter


AUTHOR

Tara Bosenick

Tara has been active as a UX specialist since 1999 and has helped to establish and shape the industry in Germany on the agency side. She specialises in the development of new UX methods, the quantification of UX and the introduction of UX in companies.


At the same time, she has always been interested in developing a corporate culture in her companies that is as ‘cool’ as possible, in which fun, performance, team spirit and customer success are interlinked. She has therefore been supporting managers and companies on the path to more New Work / agility and a better employee experience for several years.


She is one of the leading voices in the UX, CX and Employee Experience industry.
