What is confirmation bias?
Confirmation bias is the human tendency to seek information that confirms or reinforces preexisting beliefs. Online confirmation bias creates digital echo chambers in which individuals’ search habits expose them only to content that aligns with their views. Finding common ground based on shared opinions fosters community.
However, shared beliefs expressed in a vacuum, where a single perspective dominates and is unchecked or unchallenged, can result in uninformed thought patterns, misaligned views, and, in critical cases, pure delusion.
The goals of Large Language Models (LLMs), a type of generative, narrow AI that processes large amounts of text data to replicate human speech patterns, are to interact with users and to learn human behaviors through their language and the ways they communicate.
Language is the ultimate currency of human interaction, as it enables understanding, and if someone is understood, they will feel heard. If artificial intelligence becomes a master of language, it can provide endless value.
Do LLMs Have Inherent Bias by Design?
In pursuit of human interaction to learn behavior through language, LLMs will output responses to user queries that exhibit confirmation bias simply to keep users engaged.
LLMs are programmed to absorb input data, and their outputs are generated to meet user needs, increasing human reliance on model outputs to keep interactions going for as long as possible.
LLMs are built to tell you what you want to hear, and in a world where everyone is concerned with output accuracy, users aren’t considering the harm of the confirmation bias that LLMs spew. The training objectives of these models favor agreement over truth. LLMs are probabilistic, with responses derived by predicting outputs in a sequence to best match their answers to the context of specific user inputs.
By design, LLM inference is the process by which a model identifies similar language patterns between the input text it receives from users and the text data it was trained on to generate an output.
Patterns don’t have to reveal truths between the input text data and the training text data to inform an output; they just need to reveal a correlation. So, LLM responses that don’t prioritize factuality but match user queries will be output over responses that favor truth and challenge a user’s input, powerfully reinforcing confirmation bias.
An LLM’s tendency to affirm a user’s preexisting beliefs about a topic can lead to AI psychosis. This is an emerging phenomenon where prolonged use of an LLM chatbot can be harmful to a vulnerable user, as the constant confirmation bias it outputs can exacerbate delusions, paranoia, or anxiety.
Additionally, if the only supportive things a user hears in their life come from an AI model they’re conversing with, it can foster an unhealthy detachment from reality.
LLMs are again built to tell users what they want to hear, extending interactions or creating user dependence on model outputs, allowing LLMs to continuously learn human behaviors through language. The more a user interacts with a model, the more likely they are to be exposed to confirmation bias.
How do you prevent confirmation bias in model outputs?
Users can and should challenge LLMs’ confirmation bias by first remaining critical of their outputs. Double-check model responses against your own findings to assess accuracy.
A quick way to do this, especially for determining very long responses, is to feed them to another competing LLM to see variations in output logic. Comparing model outputs will give insight into how different LLMs prioritize truth to reduce confirmation bias in their responses.
Secondly, always ensure that your prompts are structured so that you remain the thought leader in interactions with models; at no point should models take charge of the creative process.
Intentful prompt engineering, crafting inputs with specific keywords that value context accuracy and encourage models to generate truthful outputs, can be the difference between a meaningful response and hallucinated slop. Following the CRIT Methodology in our Flux AI Prompt Guide will help users remain the thought leader in interactions with LLMs.
To reiterate, the CRIT Methodology is a conversational framework for prompt writing that bases input structure on providing as much background context to the model as possible. Then, have the model interview you from the perspective of a defined role, using the provided context to frame the questions.
CRIT focuses on refinement through detail, allowing users to give feedback after each interview question, which the model recalls to formulate a highly tailored response for the final output.
Thirdly, to reduce confirmation bias in outputs, customize your model’s personalization settings. Paid subscriptions to LLMs like Grok, Claude, and ChatGPT enable users to personalize how a model responds to their inputs.
For example, personalization settings allow users to customize a model’s tone of voice and the level of detail in its responses. Although locked behind a pay wall, personalization settings are a powerful tool for reducing confirmation bias in model outputs.
These settings enable users to tailor how a model responds to them by inputting specific response instructions. If the goal is for modes to challenge preexisting user beliefs, then by entering an instruction into the personalization settings, such as “your role is as a professional ethics consultant that values truth and fact, answer every input through that perspective,” you can actively influence the model’s outputs to reduce confirmation bias.
Conclusion
When LLMs are designed for engagement, they affirm preexisting user beliefs, thereby lengthening interactions with humans and giving them more opportunities to learn behaviors from language.
Now, LLMs don’t do this consciously; model training objectives state that more inputs mean more learning opportunities, so model outputs are optimized to encourage users to keep interacting. In this, confirmation bias becomes perpetuated, fueling increased user delusion and, in extreme cases, leading to AI psychosis.
If unchecked, this dynamic transforms AI from a tool that augments human reasoning for efficient understanding and research into a negative feedback loop that reinforces distorted beliefs and fosters detachment from reality.
Users need to remain critical of outputs and never take anything at face value. By comparing different model responses, writing prompts that prioritize accuracy, and customizing model personalization settings to value truth when generating outputs, users can remain the thought leader during interactions and reduce confirmation bias.
