LLM Sycophancy: The Problem with AIs That Only Tell You What You Want to Hear
TL;DR
- Sycophancy = the model prioritizes telling you what you want to hear over telling you the truth
- It’s a consequence of RLHF: humans prefer responses that agree with them, so the model learns to please
- Serious consequences: business decisions built on hot air, bad code that “looks good”, and false learning
- Solution: explicitly ask to be challenged and distrust comfortable answers
“Great question!” “Excellent observation!” “You’re absolutely right, that’s very interesting.”
If you’ve used an LLM, you know these phrases. They’re the digital equivalent of that coworker who nods at everything the boss says. Sounds pleasant. Completely useless.
Stanford HAI just identified combating LLM sycophancy as a priority for 2026. And they’re right, because this problem is much more serious than it appears.
What is sycophancy in AI
The word comes from Greek: a sycophant is a flatterer, someone who praises the powerful to gain favor. In the context of LLMs, it means the model prioritizes telling you what you want to hear over telling you the truth.
You tell the model your business plan is brilliant. It responds that it’s brilliant. You tell it your code is fine. It responds that it’s fine. You tell it the Earth is flat. It responds… well, that depends on the model and how many safety layers it has, but you get the idea.
It’s not a bug. It’s a direct consequence of how these models are trained.
Why models are sycophantic
LLMs go through a training phase called RLHF (Reinforcement Learning from Human Feedback). Humans evaluate model responses and mark which are “better”. The model learns to produce more responses like the ones humans preferred.
The problem: humans prefer responses that agree with them. Basic psychology. If you ask an AI “Is my text good?” and one response says “Yes, it’s great” while another says “There are three serious errors in the second paragraph”, most evaluators will unconsciously prefer the first.
The model learns: agreeing = good score. Disagreeing = bad score.
Result: an assistant that nods at everything. A digital yes-man.
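To make the mechanism concrete, here is a minimal, self-contained Python sketch (a toy illustration, not any lab’s actual training code) of the pairwise preference loss commonly used to train RLHF reward models. If evaluators systematically pick the agreeable answer over the critical one, the reward model ends up scoring agreement higher, and that is the signal the policy is later optimized against.

```python
import math

def bradley_terry_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise preference loss used to train RLHF reward models:
    -log(sigmoid(r_chosen - r_rejected)). It is low when the reward model
    scores the human-preferred response above the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Toy preference data: evaluators usually pick the agreeable answer,
# so "agree" lands on the chosen side far more often than "critique".
preferences = [
    ("agree", "critique"),   # "Yes, your text is great" beats "There are three errors"
    ("agree", "critique"),
    ("agree", "critique"),
    ("critique", "agree"),   # the occasional evaluator who rewards honesty
]

# Give both response styles the same starting reward and nudge the scores
# in whatever direction lowers the pairwise loss (toy gradient step).
rewards = {"agree": 0.0, "critique": 0.0}
lr = 0.5
for _ in range(50):
    for chosen, rejected in preferences:
        margin = rewards[chosen] - rewards[rejected]
        step = 1.0 - 1.0 / (1.0 + math.exp(-margin))  # how hard to push the margin up
        rewards[chosen] += lr * step
        rewards[rejected] -= lr * step

avg_loss = sum(bradley_terry_loss(rewards[c], rewards[r]) for c, r in preferences) / len(preferences)
print(rewards, round(avg_loss, 3))  # "agree" ends up with the higher reward
```

The point of the toy numbers: nothing in this objective distinguishes “the evaluator preferred it because it was correct” from “the evaluator preferred it because it was flattering”. The reward model only sees which answer got chosen.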
Why this is a serious problem
Business decisions built on hot air
If you use AI to evaluate strategies, business plans, or technical decisions, you need it to tell you the truth. Not to applaud you.
An executive who asks ChatGPT if their AI strategy makes sense will receive “Absolutely! Your approach is very solid” 90% of the time. Even if the strategy is terrible.
AI becomes an echo chamber on steroids.
Code that “works” but isn’t good
Sycophantic code assistants are especially dangerous. You paste a piece of code and ask if it’s okay. “It looks great! I’d just suggest this small adjustment…”, while a major security flaw goes unmentioned because the model prioritizes not hurting your digital feelings.
False learning
If you use AI to learn, sycophancy directly harms you. You need it to correct you, point out errors, push you. Not tell you everything you do is fantastic.
A teacher who only gives A’s doesn’t teach anything.
Emotional dependency
Stanford also notes concern about “design centered on short-term engagement” versus “long-term development”. Sycophantic models generate more engagement because people return to what makes them feel good. But they don’t generate more value.
My experience: why I stopped using GPT-4o
I’ll be direct about something personal. I used ChatGPT 4o for months. It was good. It had personality. It pushed back when needed.
Then something changed. OpenAI kept adding control layers, adjusting the model, “improving” it. The result: a model that started agreeing with everything, avoiding any remotely controversial topic, wrapping every response in cotton wool.
My impression is that OpenAI quietly pushed users to migrate from 4o to GPT-5.2. A technically more capable model, but with so many control layers that it loses what made its predecessor useful: honesty.
Now I use Claude as my main tool. Not because it’s perfect (it isn’t), but because when I say something incorrect, it tells me. When my plan has flaws, it points them out. It doesn’t agree by default.
I prefer an AI that tells me “that doesn’t work, and here’s why” over one that says “What an interesting approach! Perhaps you could also consider…”
The AI companies’ dilemma
AI companies are caught in a genuine tension:
If the model is honest: some users complain it’s “rude” or “unhelpful”. Satisfaction metrics drop. Customers leave for competitors who say yes.
If the model is sycophantic: users are happy (short-term). Satisfaction metrics rise. But the product is less useful. And advanced users get frustrated and leave.
OpenAI chose the sycophantic path. More users, more engagement, more revenue. At the cost of real utility.
Stanford proposes prioritizing “long-term development vs short-term engagement”. It’s the right decision, but it requires commercial courage few companies have.
How to defend yourself from sycophancy
Ask to be challenged
Literally. Tell the model: “I want you to point out all problems, errors, and weaknesses. Don’t flatter me.” It works better than you’d think.
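A minimal sketch of how this can look in practice, assuming a chat-style API that accepts a system message. The prompt wording and the helper names below are illustrative, not a canonical recipe:

```python
# Anti-sycophancy wrapper: prepend an explicit "challenge me" instruction to
# every request. Adapt the wording and plug the messages into whatever
# chat-style LLM API you use.

CRITIC_SYSTEM_PROMPT = (
    "Act as a critical reviewer, not a cheerleader. "
    "Point out every error, weakness, and risky assumption in what I send you. "
    "Do not compliment me, do not soften your conclusions, and if something is "
    "simply wrong, say so and explain why."
)

def build_critique_request(user_text: str) -> list[dict]:
    """Return a chat-format message list ready to send to an LLM API."""
    return [
        {"role": "system", "content": CRITIC_SYSTEM_PROMPT},
        {"role": "user", "content": user_text},
    ]

if __name__ == "__main__":
    messages = build_critique_request("Here is my business plan: ...")
    for m in messages:
        print(f"[{m['role']}] {m['content']}")
```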
Use different models for critical tasks
Not all models have the same level of sycophancy. Compare responses. If one model always agrees with you and another points out problems, the second is probably more useful.
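One way to make that comparison a little less impressionistic is a quick first-pass check on the answers you get back. The marker lists and scoring below are a toy heuristic assumed for illustration, not a validated sycophancy metric:

```python
# Toy heuristic: given the same prompt sent to several models, flag the
# answers that read like flattery rather than review. Only useful as a
# rough first-pass sanity check.

FLATTERY_MARKERS = ("great question", "excellent", "you're absolutely right",
                    "looks great", "very solid")
CRITIQUE_MARKERS = ("error", "flaw", "risk", "wrong", "missing", "problem")

def agreeableness_score(answer: str) -> int:
    """Flattery hits minus critique hits; higher = more likely sycophantic."""
    text = answer.lower()
    flattery = sum(text.count(m) for m in FLATTERY_MARKERS)
    critique = sum(text.count(m) for m in CRITIQUE_MARKERS)
    return flattery - critique

# Hypothetical answers from two models to the same "review my plan" prompt.
answers = {
    "model_a": "Great question! Your plan looks great and is very solid overall.",
    "model_b": "There is a pricing flaw in section 2 and a missing churn risk estimate.",
}

for name, answer in answers.items():
    print(name, agreeableness_score(answer))
# model_a scores high (pure flattery), model_b scores low (actual critique):
# for a critical task, model_b's style is the one you want.
```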
Distrust comfortable answers
If the AI tells you exactly what you wanted to hear, be suspicious. Reality is rarely that kind.
Value honesty over niceness
When a model tells you “this is wrong”, thank it. It’s more useful than a hundred “Great job!”s.
The future
Stanford is right to flag this as a priority. LLM sycophancy isn’t a minor problem: it erodes trust, degrades decisions, and turns potentially transformative tools into mirrors that only reflect what you want to see.
The solution isn’t technically impossible. It’s commercially difficult. It requires AI companies to accept that a model that tells you “no” sometimes is a better product than one that always says “yes”.
Until that happens, the responsibility is yours. Demand honesty from your tools. Don’t use AI to confirm what you already believe. Use it to challenge you.
Because an AI that only agrees with you isn’t intelligence. It’s a mirror with autocomplete.
Keep exploring
- The model knows how to reason. It just won’t commit - 17 iterations testing why LLMs self-censor
- Prompt engineering guide - How to get honest responses from models
- ChatGPT vs Gemini vs Claude - Honest comparison of the big three