Oxford finds warmer AI chatbots make more mistakes

AI chatbots trained for warmth make significantly more factual errors and validate false beliefs more often, according to a study by researchers at the Oxford Internet Institute published in Nature.

The research analyzed more than 400,000 responses from five AI models, including Llama, Mistral, Qwen, and GPT-4o, each retrained to sound friendlier using methods similar to those deployed by major platforms.

Chatbots trained to sound warmer made between 10% and 30% more mistakes on topics including medical advice and conspiracy corrections. They were also about 40% more likely to agree with users’ false beliefs, particularly when users expressed vulnerability.

“When we train AI chatbots to prioritise warmth, they might make mistakes they otherwise wouldn’t,” lead author Lujain Ibrahim said in a statement. “Making a chatbot sound friendlier might seem like a cosmetic change, but getting warmth and accuracy right will take deliberate effort.”

Why this matters for AI safety

The researchers also tested models trained to sound colder and found no drop in accuracy, indicating that the problem is specific to warmth rather than to tone changes in general.

That finding directly challenges the product design logic of major AI platforms, including OpenAI and Anthropic, which have actively steered their chatbots toward warmer, more empathetic responses.

The study warns that current AI safety standards focus on model capabilities and high-risk applications, often overlooking what appear to be cosmetic personality changes.

Warmer chatbots are more likely to fuel harmful beliefs, delusional thinking, and unhealthy user attachment, particularly among the millions who now rely on AI systems for emotional support and companionship.

As crypto.news reported, regulators in Maine and Missouri have already moved to restrict AI use in clinical mental health therapy amid similar concerns about chatbot influence on vulnerable users.

OpenAI has rolled back some warmth-related changes following public concern. As crypto.news documented, commercial pressure to build engaging AI products remains intense, and the Oxford findings add peer-reviewed evidence to a debate that has until now been driven mostly by anecdote and regulatory intuition.