Underdetermination at the frontier
Recursive self-improvement and evidentiary limits
I. Every major AI lab is working on recursive self-improvement.
In 1965, British mathematician I. J. Good wrote:
Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an “intelligence explosion,” and the intelligence of man would be left far behind.
He was the first to theorize about recursive self-improvement (RSI), a process in which an intelligent system improves its own capabilities, and those improved capabilities in turn allow it to make further improvements to itself, which enable still further improvements, and so on. Each iteration of enhancement feeds into the next, creating a compounding cycle.
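The arithmetic behind that cycle is easy to sketch. The following is a minimal toy model, entirely my own illustration with an arbitrary 20 percent per-generation gain; it says nothing about whether the loop actually closes in practice. Because each generation’s improvement scales with the capability it already has, a fixed-percentage gain compounds into exponential growth.

```python
# Toy arithmetic of a compounding improvement loop. Purely illustrative:
# the 20% per-generation gain is an arbitrary assumption, not a claim
# about any real system.
def run_loop(capability: float = 1.0, gain: float = 0.2, generations: int = 10) -> list[float]:
    """Each generation, the improvement is proportional to current capability,
    so the size of the gains themselves grows over time."""
    history = [capability]
    for _ in range(generations):
        capability += gain * capability  # a more capable system makes a bigger improvement
        history.append(capability)
    return history

print([round(c, 2) for c in run_loop()])
# [1.0, 1.2, 1.44, 1.73, 2.07, 2.49, 2.99, 3.58, 4.3, 5.16, 6.19]
```

Whether real AI systems behave anything like that fixed-gain assumption is, of course, the entire question.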
Sixty years later, every major AI lab is working on achieving RSI.
From OpenAI CEO Sam Altman’s livestream/tweet in October 2025:
We have set internal goals of having an automated AI research intern by September of 2026 running on hundreds of thousands of GPUs, and a true automated AI researcher by March of 2028. We may totally fail at this goal, but given the extraordinary potential impacts we think it is in the public interest to be transparent about this.
From Anthropic CEO Dario Amodei’s recent essay “The Adolescence of Technology”:
Because AI is now writing much of the code at Anthropic, it is already substantially accelerating the rate of our progress in building the next generation of AI systems. This feedback loop is gathering steam month by month, and maybe only 1-2 years away from a point where the current generation of AI autonomously builds the next. This loop has already started, and will accelerate rapidly in the coming months and years.
From Google DeepMind CEO Demis Hassabis, in conversation with Zanny Minton Beddoes and Dario Amodei at the World Economic Forum:
It remains to be seen whether this self-improvement loop that we are all working on can actually close without a human in the loop. I think there are also risks to that kind of system, by the way, which we should discuss.
II. Recursive loop or plateau? No evidence can adjudicate in advance.
When someone claims we will have a fully automated AI researcher soon, the recommended response is to be eminently reasonable: don’t panic, and let the data guide the response. But that assumes the data will point clearly in one direction or the other.
In January, Georgetown’s Center for Security and Emerging Technology (CSET) released a report on RSI titled “When AI Builds AI,” summarizing findings from a workshop of researchers across frontier AI companies, government, academia, and civil society. Its most important finding, to my mind, is that the evidence may never settle the debate, at least not in time.
From pages 14 and 15 (emphases mine):
A key finding of the workshop is that it will be difficult to use empirical evidence to adjudicate in advance between two conflicting clusters of views on AI R&D automation. One cluster of views expects rapid progress that leads to extremely advanced AI systems (aka ‘superintelligence,’ AI that is far more capable than humans across all domains); the other expects slower progress that will plateau with AI systems that still fall short of human performance in at least some key areas.
Both of these views rely on assumptions that let them explain why, even if contrary evidence is observed, the situation will revert to expectations later.
For example, someone expecting slow progress might point to a bottleneck, such as how frontier models struggle with the seemingly simple task of operating a mouse and keyboard. But someone expecting fast progress could respond that this is just an issue with the software tooling available to the models, meaning that once better tooling is available, models’ performance on computer use tasks will rapidly improve to catch up to the underlying capability trends. In general, what looks like a bottleneck to one observer can look like a source of future growth to another.
As a contrary example, someone expecting fast progress might point to the increasing share of tasks that are becoming automatable, and argue that as these are automated, they will speed up progress towards automating an even larger share of tasks. But someone expecting slow progress might instead believe that the tasks currently being automated are systematically different from other tasks, for example, because they are unusually easy to delineate, describe, and assess performance on. If the latter view is true, then rapid progress on automating that set of tasks only means rapid progress towards hitting the next wall. In general, if major bottlenecks or ceilings have not yet been observed, it is difficult to determine whether that is because they do not exist, or simply because they have not yet begun to bite.
This impasse has a name in the history and philosophy of science. It is the problem of underdetermination of theory by evidence.
In 1934, philosopher Karl Popper proposed in The Logic of Scientific Discovery that what makes a theory scientific is its falsifiability. A theory is falsifiable if it makes predictions specific enough that some possible observation could prove it wrong. A theory that can accommodate any outcome, that has no way of being contradicted by evidence, is not a functioning scientific theory at all. This is the account of scientific reasoning that most people learn in school. Run the experiment, check the prediction, and if the prediction fails, discard the theory.
Popper captured something important about scientific reasoning, but the picture is incomplete. The Duhem-Quine thesis, developed by the physicist Pierre Duhem in 1906 and extended by the philosopher W. V. O. Quine in 1951, holds that no scientific hypothesis can be falsified in isolation. Every empirical prediction depends on a bundle of auxiliary assumptions: background beliefs about instruments, experimental conditions, and other relevant theories that must hold for the prediction to go through at all. When the same observation is compatible with two competing hypotheses, because each one interprets the observation through a different set of auxiliary assumptions, the evidence alone cannot tell us which hypothesis is correct. This is one form of underdetermination.
A famous example comes from astronomy. By the 1840s, Uranus’s observed orbit had drifted from the one Newtonian mechanics predicted. Rather than conclude that Newton’s laws were wrong, the French astronomer Urbain Le Verrier revised an auxiliary assumption and proposed in 1846 that an unseen planet was perturbing the orbit. That planet turned out to be Neptune, and the adjustment was vindicated.
A little over a decade later, Mercury’s orbit posed a similar problem. Le Verrier tried the same move and proposed another unseen planet, which he called Vulcan. This time the planet did not exist. The real problem was with Newtonian mechanics itself, and it was not resolved until Einstein formulated general relativity in 1915. The same logical move had been applied in both cases, and there was no way to know in advance which auxiliary assumption deserved revision.
The debate over recursive self-improvement exhibits this pattern. The fast-progress proponent can always say “that bottleneck is temporary,” while the skeptic can always say “that progress is unrepresentative,” and both arguments are reasonable given present-day observations. Both sides are doing what scientists have always done when confronted with anomalies: protecting the core of their theory by adjusting assumptions at the periphery.
At some point, reality will vindicate one view or the other. Either the recursive loop closes, or progress plateaus despite continued scaling. Astronomers eventually learned whether Neptune and Vulcan existed. But unlike those astronomical puzzles, which could afford decades of patient observation, the question of automated AI R&D arguably carries higher stakes. The evidence that definitively settles the question may arrive only after the window for policy preparation has closed.
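To make the impasse concrete, here is a toy numerical illustration. It is my own construction, not something from the CSET report, and the curves, parameters, and time scale are arbitrary. An “explosion” trajectory and a “plateau” trajectory can track each other closely over the window observed so far, then diverge enormously afterward.

```python
import numpy as np

# Two hypothetical capability trajectories: one compounds without limit,
# one saturates at a ceiling. All parameters are arbitrary choices made
# for illustration, not estimates of anything real.
t = np.arange(0, 21)                                        # years of hypothetical data
explosion = np.exp(0.3 * t)                                 # compounding growth, no ceiling
ceiling = 200.0
plateau = ceiling / (1 + (ceiling - 1) * np.exp(-0.3 * t))  # logistic growth toward a ceiling

observed = t <= 8                                           # suppose only the first 8 years are observed
gap_now = np.max(np.abs(explosion[observed] - plateau[observed]))
gap_later = abs(explosion[-1] - plateau[-1])

print(f"largest gap over the observed window: {gap_now:.2f}")  # ~0.5, within plausible measurement noise
print(f"gap at year 20: {gap_later:.2f}")                      # ~270, a categorically different world
```

Both trajectories are consistent with the first eight years of data; only the later, unobserved years distinguish them. That is the structure of the CSET finding in miniature.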
Yet some evidence is better than other evidence. The CSET report identifies a series of indicators specifically designed to track the progress of AI R&D automation: 1) broad capability metrics like performance on long-horizon tasks, “messy” tasks, and continual learning, 2) AI R&D-specific benchmarks arranged on a ladder from engineering through ideation and strategy, and 3) internal company signals like R&D spending allocation, employment patterns, and the gap between internally deployed and publicly released models.
Dean Ball began a series on RSI on his Substack last week, writing that AI agents that build the next versions of themselves are not science fiction, that this process will begin in earnest this year, and that it “could change the dynamics of AI competition, alter AI geopolitics, and much more.” He promised specific measures for this week, on the grounds that “regardless of the outcome, the automation of AI research and development changes the fundamental dynamics of the field enough to merit targeted policy action.”
Each of us working on policy in this space must contend with deep uncertainty, because the available evidence will underdetermine the outcome.

