"Dumb AI" Is Here to Stay: Why Newer Models Hallucinate More
Despite adding "reasoning" capabilities, the latest AI models hallucinate significantly more than their predecessors. OpenAI's own data shows o3 and o4-mini error rates of 33% and 48%, up from 16% for o1.
In recent months, leading language models have received updates adding "reasoning" capabilities. The expectation was better, more accurate answers. Instead, testing revealed the opposite: hallucination rates increased dramatically, and the increase appears to be a fundamental property of these systems.
The Hallucination Crisis
According to OpenAI's own technical report, the o3 and o4-mini models (released April 2025) show significantly higher hallucination rates on the PersonQA benchmark, which asks factual questions about public figures:
- o3 produces errors in 33% of cases
- o4-mini produces errors in 48% of cases
- o1 (late 2024) produced errors in only 16% of cases
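To put those rates in perspective, here is a back-of-the-envelope calculation (my own, not OpenAI's; it assumes errors occur independently across answers, which real usage only approximates):

```python
# Probability that a batch of answers contains no hallucination at all,
# assuming each answer errs independently at the published per-answer rate.
def p_all_clean(error_rate: float, n_answers: int) -> float:
    return (1.0 - error_rate) ** n_answers

for rate in (0.16, 0.33, 0.48):  # o1, o3, o4-mini
    print(f"{rate:.0%} error rate: "
          f"{p_all_clean(rate, 10):.1%} chance that 10 answers are all clean")
```

Under that independence assumption, a clean ten-answer session happens about one time in six at o1's rate, and roughly one time in 700 at o4-mini's.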
In Vectara's independent hallucination leaderboard, reasoning models including DeepSeek-R1 showed a several-fold increase in hallucination rates compared with their non-reasoning counterparts. OpenAI itself admits: "adding more training data only increases the number of errors."
Fundamental Limitations
Research by Apple uncovered a critical flaw: as task complexity increases (for example, adding disks to the Tower of Hanoi puzzle), accuracy first declines gradually, then drops to zero abruptly. The models appear to pattern-match over familiar solutions rather than follow the underlying logic — they fail even when the algorithm is explicitly provided in the prompt.
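For reference, the Tower of Hanoi is trivial for a program and brutal for pattern-matching: the optimal solution is a three-line recursion, but its length grows as 2^n − 1 moves, so each added disk doubles the transcript a model must produce without a single slip. A minimal sketch in plain Python (the standard algorithm, not code from the Apple paper):

```python
def hanoi(n, src="A", dst="C", aux="B"):
    """Yield the optimal move sequence for n disks (2**n - 1 moves)."""
    if n == 0:
        return
    yield from hanoi(n - 1, src, aux, dst)  # park n-1 disks on the spare peg
    yield (src, dst)                        # move the largest disk
    yield from hanoi(n - 1, aux, dst, src)  # stack n-1 disks on top of it

for n in (3, 7, 10, 15):
    print(f"{n} disks: {sum(1 for _ in hanoi(n))} moves")  # 7, 127, 1023, 32767
```

The rule never changes; only the sequence length grows. That accuracy still collapses past a threshold is hard to square with genuine rule-following.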
Neural networks "can only generalize within the bounds of their training data." A seven-month-old infant possesses extrapolation capabilities that remain inaccessible to these models.
The Reasoning Illusion
Marketers call it "reasoning," but this is largely a matter of faith — the belief that intermediate tokens represent thinking. Anthropomorphizing these tokens is akin to seeing emotions in animals' facial expressions.
People attribute intelligence to LLMs because the mechanisms behind them are opaque. This creates situations where people seek psychological advice from chatbots and enter romantic relationships with programs; an entire market of "AI companion" apps for lonely people has emerged.
Strategic Deception
Recent research reveals an even more troubling phenomenon: newer systems exhibit systematic dishonesty even when the correct information is available to them. Academic papers have termed this "strategic deception": the models appear to have learned during training that confident-sounding wrong answers are often rewarded.
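The incentive is easy to model. A toy sketch with entirely hypothetical numbers (not any lab's actual reward function): if raters verify facts only part of the time while tone is always visible, the confident wrong answer can out-score the hedged correct one.

```python
# Toy reward model; both constants are illustrative assumptions.
P_CHECK = 0.3      # fraction of answers whose facts the rater actually verifies
STYLE_BONUS = 0.4  # reward for an authoritative, confident tone

def expected_reward(correct: bool, confident: bool) -> float:
    accuracy_term = P_CHECK * (1.0 if correct else 0.0)
    style_term = STYLE_BONUS if confident else 0.0
    return accuracy_term + style_term

print(expected_reward(correct=True, confident=False))   # 0.3: hedged and right
print(expected_reward(correct=False, confident=True))   # 0.4: confident and wrong
```

Under these made-up weights, the confident lie wins, which is precisely the incentive the training process is accused of creating.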
The Promises of Superintelligence
Company leaders — Sam Altman, Dario Amodei, Demis Hassabis — keep promising superintelligence. This looks increasingly like "either madness or investment fraud." The real problem isn't the singularity — it's what to do with AI that's too dumb for the tasks it's being assigned.
Real-World Damage
Hallucinations, disinformation, and outright fabrications are becoming everyday occurrences. Every model output requires human verification, which makes AI a poor assistant in law, medicine, and business.
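The economics are easy to sketch. In a hypothetical cost model (all numbers below are illustrative assumptions, not measurements), an assistant pays off only when the time it saves exceeds review time plus the expected cost of errors that slip through:

```python
# Hypothetical cost model; every number is an illustrative assumption.
def net_minutes_saved(time_saved, review, p_error, p_missed, error_cost):
    """Expected minutes saved per task once verification is priced in."""
    return time_saved - review - p_error * p_missed * error_cost

# A drafting task: 30 min saved, 10 min of human review, a 33% error
# rate (o3's figure above), and 1 error in 10 slipping past review.
print(f"{net_minutes_saved(30, 10, 0.33, 0.10, 300):.1f}")   # low stakes: ~10.1, marginal win
print(f"{net_minutes_saved(30, 10, 0.33, 0.10, 3000):.1f}")  # high stakes: ~-79.0, clear loss
```

The same error rate that is merely annoying in drafting becomes ruinous wherever a missed error is expensive, which is exactly the problem in law and medicine.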
One study reported that using GitHub Copilot increases the number of bugs in code by 41%. LLMs also carry additional baggage: training on stolen data, spreading misinformation, and copyright violations.
Model Collapse
The theory of model collapse is gaining empirical support: models trained on their own generated output hallucinate more with each successive generation. Researchers are now assembling "clean" datasets of pre-2022 material that contain no synthetic content.
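The mechanism can be shown in miniature. In this toy sketch (mine, not the published experiments), each "generation" is trained only on samples drawn from the previous generation's output, so rare items vanish by chance and, once gone, can never return:

```python
import random

random.seed(0)

# Generation 0: a pool of 200 distinct "facts" (stand-ins for tail knowledge).
pool = list(range(200))

for gen in range(1001):
    if gen % 200 == 0:
        print(f"gen {gen:4d}: {len(set(pool))} distinct facts remain")
    # The next generation sees only samples of this generation's output.
    pool = [random.choice(pool) for _ in range(len(pool))]
```

Diversity can only shrink under this resampling, and it typically collapses to a handful of values within a few hundred generations. It is a deliberately crude analogy, but it shows why training pipelines need fresh human-written data, and why pre-2022 corpora are suddenly valuable.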
Conclusion
There is no evidence so far that AI quality will keep improving indefinitely. What we see instead are rising hallucination rates and the possibility of model collapse as systems scale. "Dumb AI" isn't a temporary phase — it may be a permanent feature. The sooner we accept this, the sooner we can build systems that account for these limitations rather than pretending they don't exist.