The analysis finds that improvements from AI reasoning models may slow, suggesting a potential plateau in model development after years of rapid progress
An analysis conducted by Epoch AI, a nonprofit AI research institute, indicates that the AI industry may not be able to keep extracting significant performance gains from reasoning AI models for much longer.
The report’s findings indicate that progress from reasoning models may slow down within a year.
In recent months, performance on AI benchmarks, particularly those that evaluate programming and mathematical abilities, has improved substantially thanks to reasoning models like OpenAI’s o3.
These models can boost their performance by applying additional computing power to problems, but they take longer to complete tasks than conventional models.
Developing a reasoning model involves first training a conventional model on a vast quantity of data, then applying a technique known as reinforcement learning.
This technique gives the model “feedback” on its solutions to challenging problems.
According to Epoch, frontier AI labs such as OpenAI have not yet allocated significant computing power to the reinforcement learning phase of reasoning model training.
That is about to change. OpenAI has stated that it used approximately 10 times as much computing power to train o3 as its predecessor, o1. Epoch believes that most of this computing power was devoted to reinforcement learning.
Dan Roberts, a researcher at OpenAI, recently disclosed that the organization’s future plans call for prioritizing reinforcement learning, which will consume significantly more computing power than the initial model training.
However, according to Epoch, there is still a limit to how much computing power can be applied to reinforcement learning.

Josh You, an analyst at Epoch and the author of the analysis, explains that performance gains from conventional AI model training are currently quadrupling every year, while performance gains from reinforcement learning are growing tenfold every 3-5 months.

He continues, “The overall frontier will likely converge with the progress of reasoning training by 2026.”
Epoch’s analysis is informed by a variety of assumptions and is based in part on public statements from AI company executives. But it also argues that scaling reasoning models may prove challenging for reasons beyond computing power, including the high overhead costs of research.
“If a persistent overhead cost is necessary for research, reasoning models may not scale as far as anticipated,” You writes. “Rapid compute scaling has the potential to be a critical component of the progression of reasoning models; therefore, it is worthwhile to monitor it closely.”
The AI industry, which has poured tremendous resources into developing these models, is likely to be worried by any sign that reasoning models could hit a ceiling in the near future.
Research has already shown that reasoning models, which can be extremely expensive to run, have significant flaws, including a greater tendency to hallucinate than certain conventional models.