
PROMPTS, RESEARCH, UX, UX INSIGHTS

System Prompts in UX Research: What You Need to Know About Invisible AI Control




12 min read · Feb 12, 2026

Imagine this: Two UX research teams analyze the same interview transcripts. Team A uses ChatGPT, Team B works with Claude. The results? Completely different. Team A presents structured insights in clear tables. Team B delivers narrative syntheses in continuous text. Both teams are convinced they have found the “true” insights.


The problem isn't the competence of the researchers—it's the system prompts of the AI tools. These invisible control mechanisms influence how AI interprets, structures, and presents your research data. And most of us have no idea they exist.


In this article, I'll show you what system prompts are, how they differ between different AI models, and what concrete impact this has on your UX research. After almost 25 years as a UX consultant, I've spent the last two years experimenting intensively with various AI tools – and have come to the conclusion that we as a research community urgently need to talk about this topic.


📌 The most important points at a glance:

System prompts invisibly control AI behavior – you only see the results, not the instructions behind them

Claude's system prompt is 7.5 times longer than ChatGPT's – this leads to fundamentally different behavior

Different tools have different biases – Claude avoids lists, ChatGPT loves tables

Your research results are influenced – without you noticing or being able to control it

User-defined prompts are your most important antidote – custom instructions help you regain control

Documentation is a must – you need to be able to track which tool you use, when, and why

Multi-tool validation increases quality – you should perform critical analyses with at least two different models


What are system prompts anyway?

When you work with ChatGPT, Claude, or other AI tools, there are two completely different types of instructions:


Your user prompts are what you type into the chat window: “Analyze these interview transcripts and identify the most important pain points.” These prompts are visible, conscious, and under your control.


System prompts, on the other hand, are instructions that the provider (Anthropic, OpenAI, Google) gives to the model before you even start typing. They define the basic “personality,” behaviors, and limitations of the AI. And they are completely invisible to you as a user.
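
This separation is easiest to see at the API level, where developers can set their own system-level instructions on top of whatever the provider injects behind the scenes. A minimal sketch, assuming the OpenAI Python SDK, an API key in the environment, and a placeholder model name:

from openai import OpenAI  # assumes: pip install openai, OPENAI_API_KEY set

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        # System-level message: the same mechanism providers use for their
        # own (much longer) hidden instructions
        {"role": "system", "content": "You are a neutral UX research assistant."},
        # User prompt: what you would normally type into the chat window
        {"role": "user", "content": "Analyze these interview transcripts and identify the most important pain points."},
    ],
)

print(response.choices[0].message.content)

In the consumer chat interfaces, only the second kind of message is yours; the first kind is set invisibly by the provider.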


The actor analogy

Imagine that the AI is an actor:

  • The system prompt is the script plus all the director's instructions (“You play a polite, helpful assistant who never uses lists in narrative texts”).

  • Your user prompt is the dialogue that other characters (i.e., you) have with this character.


The actor can only act within the role assigned by the script – even if you, as the user, want something completely different.


Why are system prompts usually invisible?

There are several reasons why providers do not publish their system prompts:


Trade secrets: The prompts are part of the competitive advantage. How exactly Anthropic “trained” Claude to be polite but not overly flattering is valuable know-how.


Security: If people knew the exact instructions, they could deliberately try to circumvent them (“jailbreaking”).


User experience: Most users would be more confused than informed by 16,000 words of technical instructions.


Avoiding manipulation: If you know exactly how the AI is programmed, you could phrase your questions in such a way that they provide the desired answer – regardless of whether it is correct.


User-defined prompts: What you can control yourself

Before we dive deeper into the invisible system prompts, here's an important note: You are not completely powerless.


Most AI tools now offer options for defining your own instructions:

  • ChatGPT: “Custom Instructions” in the settings

  • Claude: “Project Instructions” or directly in the chat

  • Gemini: “Personalization” in the settings


These user-defined prompts are transparent, controllable, and should be your most important tool for maintaining control over AI-assisted research. More on that later.


Specific differences between the models – and why they are relevant

Thanks to leaked system prompts (available in public GitHub repositories), we now know pretty much exactly how differently the major providers instruct their models. The differences are significant – and have a direct impact on your research work.


1. Length and complexity: The extent of control

Claude's system prompt comprises 16,739 words (110 KB). That's equivalent to about 60 pages of text – a small manual full of rules of conduct.


ChatGPT's o4-mini system prompt, on the other hand, has only 2,218 words (15.1 KB) – just 13% of the length of Claude's.


What does this mean for you? Claude has much more detailed instructions for specific situations. This can lead to more predictable but also more rigid behavior. ChatGPT is more flexible but can also respond more inconsistently.


2. The flattery blocker: How praise is filtered

Claude 4 was explicitly instructed: "Never start with positive adjectives such as 'good,' 'great,' 'fascinating,' or 'excellent.' Skip the flattery and respond directly."


This instruction was a direct response to OpenAI's GPT-4o, which tended to praise every user question excessively ("That's a really fascinating question!").


Why is this relevant for UX research?

If you use Claude to analyze interview transcripts and are looking for emotional nuances, it could systematically downplay positive statements. Sentences such as "I think that's really great about your product" could be given less weight in the synthesis than with ChatGPT – simply because Claude has been instructed to skip praise.


3. Formatting: lists vs. continuous text

Claude is instructed: “For reports, documents, and technical documentation, write in prose and paragraphs without lists. The prose must never contain bullet points, numbered lists, or excessive bold text.”


ChatGPT, on the other hand, has a strong tendency toward structured formats – even simple questions are often answered in tabular form.


Practical example from my work:

I gave both tools the same task: “Summarize the most important findings from these 15 user interviews.”


Claude delivered three easy-to-read paragraphs in narrative form. ChatGPT presented a table with categories, frequencies, and direct quotes.


Both formats have advantages and disadvantages – but for stakeholder presentations, the format you use makes a huge difference. And this difference does not come from you, but from the system prompt.


4. Design bias: modern vs. neutral

Claude has explicit design instructions: "Lean towards contemporary design trends and modern aesthetic choices. Consider what is cutting-edge in current web design (dark modes, glassmorphism, microanimations, 3D elements, bold typography, vibrant color gradients). Static designs should be the exception."


ChatGPT does not have comparably specific design guidelines.


Why this is problematic:

If you analyze design feedback from usability tests and a user says, "I find the interface too busy, I prefer classic buttons," Claude might classify this feedback as less important – because it contradicts the programmed preference for "bold" and "cutting-edge" designs.

In persona development, conservative user segments could be systematically underrepresented.


5. Search behavior: Proactive vs. cautious

In newer versions, Claude is encouraged to search immediately when necessary—without asking for permission first. This is a change from earlier versions and shows that Anthropic has more confidence in its search tool.

Other models tend to be more cautious about automatic web searches.


For research, this means:

Claude may be more likely to draw on external sources (e.g., current UX best practices or statistics) when analyzing user statements, while other tools may rely more heavily on the available data.


6. Personality and tone

The different models have different “basic temperaments”:

  • Claude: Warm, human, rather empathetic

  • GPT-4: Neutral, factual, sometimes robotic

  • Mistral: Professional, concise, direct

  • Gemini: Fact-oriented, objective, reserved


Practical impact:

For empathy-driven interview analyses (“What are the emotional drivers behind this behavior?”), I tend to favor Claude. For quantitative data synthesis (“How are the pain points distributed across user segments?”), I prefer ChatGPT.


However, this tool selection is a methodological preliminary decision that I must document—just as I would document whether I am using qualitative or quantitative methods.


What does this mean for your UX research?

Problem 1: Method validity is compromised

The scientific quality of research stands or falls with the reproducibility and traceability of the methods. If invisible system prompts influence your results without you noticing or being able to control them, both are compromised.


Specific scenarios:

Scenario A: Formatting bias
You analyze usability test results. Claude summarizes the insights in continuous text, ChatGPT creates a table. Your stakeholders get different impressions of how structured and "valid" your findings are – simply because of the presentation.


Scenario B: Design preference bias
When evaluating design feedback, Claude weights modern, bold suggestions higher than conservative ones. You present "the most important insights" – but in reality, they are only the insights that match Claude's design preferences.


Scenario C: Flattery filter
You ask the tool to summarize positive user quotes. Claude systematically skips praise because it has been instructed to avoid flattery. In your synthesis, positive voices appear less prominent than negative ones – not because the data warrants it, but because the system prompt dictates it.


The core problem: You lose control over a critical part of your methodology without realizing it.


Problem 2: Tool selection as a methodological decision

In my work, I have developed the following general rule:


Research task | Potentially better tool | Reason
Empathy-driven interview analysis | Claude | Warmer, more human tone
Quantitative data synthesis | ChatGPT | Structured formats, tables
Compliance-critical documentation | Claude | Stronger focus on security
Fast exploratory analyses | Mistral | Shorter, more direct answers


But: This table is not a neutral recommendation. It is a methodological preliminary decision that you must make transparent.


If you write in your research report, “I analyzed the interviews with Claude,” you should also explain why—and what potential biases this entails.


Very few people currently do this – even though we routinely document every other methodological decision in exactly this way ("I conducted qualitative interviews instead of quantitative surveys because...").


Problem 3: Qualitative research is particularly vulnerable

With quantitative data (numbers, statistics, click rates), the influence of system prompts is usually less significant. Numbers remain numbers.


In qualitative research—where nuances, context, and ambiguities are important—system prompts can have a massive impact:


Theme recognition: ChatGPT is instructed to create “diverse, inclusive, and exploratory scenarios.” This is fundamentally positive—but it could lead to diversity-related topics being overemphasized in your analysis, while other aspects are overlooked.


Sentiment analysis: If Claude skips flattery, positive sentiment signals could be systematically underestimated. Your “objective” sentiment analysis would then be skewed—without you even noticing.


Persona development: If Claude favors modern, bold designs, conservative user segments may be underrepresented in your personas. You think you have mapped the “typical users” – but you have only mapped the users who match Claude's preferences.


Problem 4: Data protection and responsibility

Data protection differences:

Anthropic does not automatically use user interactions for training – unless you actively opt in. Interestingly, rating responses (thumbs up/down) is already considered opt-in.


ChatGPT has different policies – depending on the subscription model and region.


If you are analyzing sensitive research data (e.g., interviews from the healthcare sector), this is a critical difference.


Bias responsibility:

If your research insights are influenced by invisible system prompt biases, who bears the responsibility?


  • You as a researcher, because you chose the tool?

  • The tool, because the biases are built into it?

  • The provider, because they defined the system prompts?


This question remains unanswered—and it becomes increasingly relevant the more we rely on AI-supported research methods.


Practical recommendations for action: How to stay in control

The good news is that you are not helpless. There are concrete strategies for regaining control over AI-supported research.


1. Use user-defined prompts as a control mechanism

Custom instructions are your most important weapon against invisible biases.


Example of a research persona for custom instructions:

You are a UX research assistant with the following principles:


DESCRIPTIVE, NOT PRESCRIPTIVE

- Describe what is in the data

- Do not give design recommendations unless I explicitly ask for them


SEPARATION: OBSERVATION VS. INTERPRETATION

- Clearly mark what is direct observation

- Always label interpretations as such


CONFIDENCE LEVEL

- Tell me how confident you are about each insight

- Identify data gaps and uncertainties


NEUTRALITY

- No preference for modern vs. conservative designs

- Treat all user statements equally


FORMATTING

- Use lists if they increase clarity

- Use continuous text if context is important

- If unsure, ask me which format I prefer


Tool-specific compensations:

With Claude, I often add:

Use lists and bullet points if they improve clarity. Continuous text is not always better.


With ChatGPT, I write:

Avoid tables for narrative insights. Not everything needs to be structured.
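
If you work via the API rather than the chat UI, the same persona can be passed programmatically. A minimal sketch using the Anthropic Python SDK; the model name is a placeholder, and RESEARCH_PERSONA stands for the persona text above:

import anthropic  # assumes: pip install anthropic, ANTHROPIC_API_KEY set

RESEARCH_PERSONA = "..."  # paste the research persona text from above
CLAUDE_COMPENSATION = (
    "Use lists and bullet points if they improve clarity. "
    "Continuous text is not always better."
)

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # placeholder model name
    max_tokens=2000,
    # Persona plus tool-specific compensation, passed as your own system prompt
    system=RESEARCH_PERSONA + "\n\n" + CLAUDE_COMPENSATION,
    messages=[{"role": "user", "content": "Summarize the most important findings from these 15 user interviews: ..."}],
)

print(response.content[0].text)

Your system parameter does not replace the provider's hidden instructions, but it is the strongest lever you control.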


2. Multi-tool triangulation for critical analyses

Basic rule: Any analysis that influences important decisions should be performed with at least two different tools.


My workflow:

  1. Initial analysis with my standard tool (usually Claude, because I like the tone)

  2. Second analysis with ChatGPT for cross-checking

  3. Comparison: Where do the results match? Where do they differ?

  4. Interpretation: Why might the discrepancies have arisen? Which tool biases play a role?

  5. Synthesis: Create final insights based on both analyses


Yes, this requires more effort. But for important research projects, this additional validation is worth the time.
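
If you script your analyses, steps 1–3 of this workflow can be automated so that both models see exactly the same prompt. A hedged sketch, assuming both SDKs are installed, API keys are in the environment, and the model names are placeholders:

import anthropic
from openai import OpenAI

PROMPT = "Identify the five most important pain points in these transcripts: ..."

# Step 1: initial analysis with Claude
claude_out = anthropic.Anthropic().messages.create(
    model="claude-sonnet-4-20250514",  # placeholder
    max_tokens=2000,
    messages=[{"role": "user", "content": PROMPT}],
).content[0].text

# Step 2: second analysis with ChatGPT for cross-checking
gpt_out = OpenAI().chat.completions.create(
    model="gpt-4o",  # placeholder
    messages=[{"role": "user", "content": PROMPT}],
).choices[0].message.content

# Step 3: side-by-side comparison; interpretation and synthesis stay manual
for name, text in (("Claude", claude_out), ("ChatGPT", gpt_out)):
    print(f"--- {name} ---\n{text}\n")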


3. Establish a research protocol

Documentation has always been important in research—with AI-supported research, it becomes essential.


Sample template:

RESEARCH LOG: [Project name]


TOOL SELECTION

- Primary tool: Claude Sonnet 4

- Secondary tool (validation): ChatGPT 4


CUSTOM INSTRUCTIONS USED

- Research persona (see above)

- Special instructions: “Treat all design preferences equally”


KNOWN TOOL BIASES

- Claude: Preference for modern designs, no lists in prose, skips flattery

- ChatGPT: Tendency toward tables and structured formats


CONTROL MEASURES

- Critical insights validated with both tools

- Positive user statements manually checked for underrepresentation

- Design feedback compared against raw data


DIFFERENCES BETWEEN TOOLS

- [Document specific differences in the results]

- [Interpretation: Why might these have arisen?]


FINAL DECISION

- [Which insights were included in the final report and why?]
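
The same template can also be kept as structured data next to the project files, so it is diffable and versionable. A hypothetical sketch – the field names are my own, not a standard:

import json
from datetime import date

research_log = {
    "project": "[Project name]",
    "date": date.today().isoformat(),
    "tool_selection": {"primary": "Claude Sonnet 4", "validation": "ChatGPT-4"},
    "custom_instructions": ["research persona", "Treat all design preferences equally"],
    "known_tool_biases": {
        "claude": ["prefers modern designs", "no lists in prose", "skips flattery"],
        "chatgpt": ["tendency toward tables and structured formats"],
    },
    "control_measures": [
        "critical insights validated with both tools",
        "positive user statements manually checked for underrepresentation",
        "design feedback compared against raw data",
    ],
    "tool_differences": [],   # document specific differences during analysis
    "final_decisions": [],    # which insights made the report, and why
}

with open("research_log.json", "w", encoding="utf-8") as f:
    json.dump(research_log, f, indent=2, ensure_ascii=False)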


4. Human-in-the-loop remains essential

AI is a tool for increasing efficiency – not a substitute for human judgment.


My workflow:

  1. AI makes initial synthesis (quick overview)

  2. I validate against raw data (samples from the transcripts)

  3. Critical interpretation by me (context that AI lacks)

  4. AI helps with formulation (formulating final insights)


AI speeds up the process – but I make the critical decisions.


5. Transparency towards stakeholders

Communicate AI use openly:


Weak wording:

“We analyzed the user interviews and identified the following insights...”


Better wording:

“We analyzed the user interviews with the support of Claude 4. To minimize tool-specific biases, we additionally validated critical insights with ChatGPT and checked them against the original transcripts on a random basis. The final insights are based on this multi-tool validation.”


This builds trust and demonstrates methodological rigor.


6. Develop a team-wide AI tool profile

In my team, we have a shared document that we update regularly:


“AI Tools in UX Research: Strengths, Weaknesses, Best Practices”


In it, we document:

  • Which tool is suitable for which research task?

  • What known biases does each tool have?

  • Which custom instructions have proven effective?

  • Lessons learned from past projects


The document is a living artifact—we are constantly learning and adapting our practices.


Checklist: Conducting AI-supported UX research responsibly


✅ Before the research

  • [ ] Document and justify tool selection

  • [ ] Define custom instructions for the research context

  • [ ] Identify known tool biases

  • [ ] Check data protection compliance (especially for sensitive data)

  • [ ] Decide: single-tool or multi-tool validation?


✅ During the research

  • [ ] For critical analyses: perform multi-tool comparison

  • [ ] Keep a research log (tool, biases, control measures)

  • [ ] Always validate AI output against raw data (random samples)

  • [ ] Clearly mark uncertainties and interpretations

  • [ ] Document discrepancies between tools


✅ After the research

  • [ ] Document methodology transparently

  • [ ] Reflect on tool influence on results

  • [ ] Inform stakeholders about AI use

  • [ ] Record learnings for future projects

  • [ ] Update team knowledge base


Frequently asked questions

Do I have to use multiple tools for every research task?

No. For exploratory analyses or internal interim reports, one tool is usually sufficient. However, for important decisions (e.g., strategic product pivots based on research), you should use multi-tool validation.


How can I find out which system prompts a tool uses?

The official system prompts are usually not published. However, there are leaked versions on GitHub (e.g., repository “system_prompts_leaks”). These are not always up to date, but they give a good impression of the differences.


Are custom instructions enough to compensate for tool biases?

Partially. Custom instructions help, but they cannot override all system prompt effects. Therefore, multi-tool validation is still useful for critical analyses.


Which tool is “best” for UX research?

There is no “best” tool – only tools that are better suited for specific tasks. Claude has advantages in empathy-driven analysis, ChatGPT in structured data synthesis. The tool selection should match the research task.


Can I use AI tools for GDPR-sensitive data?

That depends on the tool and your use case. Anthropic, for example, offers enterprise versions with GDPR compliance. For sensitive data, you should check with your legal department about the specific tools and their data protection policies.


Conclusion: AI tools are tools – treat them as such

System prompts are the invisible hand that controls AI-supported research. They influence how tools interpret, structure, and present your data – without you being able to see or fully control them.


The key message:

Invisible system prompts are a black box with potential for bias. User-defined prompts (custom instructions) are your most important control tool. Treat AI tools like any other research instrument: critically, documented, transparently.


Three immediate actions for today:

  1. Define your research persona as a custom instruction in your preferred tool.

  2. Start a research log for your next project (use the template above).

  3. Try multi-tool validation for your next important analysis.


The AI revolution in UX research is unstoppable—and that's a good thing. But we need to shape it responsibly. That means transparency about tool usage, awareness of biases, and methodological rigor.


The more we as a UX community talk about these issues, the better our practices will become. Share your experiences, experiment with different tools, and document your learnings.


The best research insights don't come from blindly trusting AI – they come from treating it for what it is: a powerful but not neutral tool.


As of February 2026



Further resources

  • GitHub: system_prompts_leaks – Collection of leaked system prompts from ChatGPT, Claude, Gemini

  • Anthropic System Prompt Release Notes – Official documentation on Claude

  • Simon Willison: “Highlights from the Claude 4 system prompt” – Detailed analysis

  • Fortelabs: “A Guide to the Claude 4 and ChatGPT 5 System Prompts” – Practical comparison


Do you have experience with AI tools in UX research? What challenges do you encounter? Let's discuss in the comments.



💌 Want more? Then read on – in our newsletter.

Published four times a year. Sticks in your mind longer. https://www.uintent.com/de/newsletter


AUTHOR

Tara Bosenick

Tara has been active as a UX specialist since 1999 and has helped to establish and shape the industry in Germany on the agency side. She specialises in the development of new UX methods, the quantification of UX and the introduction of UX in companies.


At the same time, she has always been interested in developing a corporate culture in her companies that is as ‘cool’ as possible, in which fun, performance, team spirit and customer success are interlinked. She has therefore been supporting managers and companies on the path to more New Work / agility and a better employee experience for several years.


She is one of the leading voices in the UX, CX and Employee Experience industry.
