OpenAI has released new state-of-the-art reasoning models, named o3 and o4-mini, which, counterintuitively, hallucinate (generate false information) more frequently than their older counterparts. This breaks the historical trend in which each new model generation hallucinated less than the last: on OpenAI's internal tests, the new models roughly double the hallucination rate of earlier versions. Although o3 and o4-mini perform better in areas such as coding and math, they also make more claims overall, both accurate and inaccurate. Neither OpenAI nor independent researchers yet understand why scaling up reasoning models correlates with increased hallucination, and research into the question is ongoing. Features such as web search can improve accuracy, but the worsening hallucination problem raises concerns for business applications that demand high accuracy and reliability.





























