
AI Solved a Decades-Old Math Problem, Leaving Mathematicians to Question the Future of Their Discipline


AI has solved an 80-year-old math problem, sparking debate on its role in mathematics. OpenAI’s model disproved the Erdős unit distance conjecture, while AI collaborations yielded original research. Mathematicians grapple with AI’s rise, questioning its impact on discovery and the future of the field.


Something strange is happening in the world of mathematics. For centuries, progress moved at the pace of human insight — a single breakthrough could take years, decades, or generations. That rhythm is breaking.

AI systems are now solving problems that have stumped mathematicians for 80 years. The people who understand these proofs best are the ones most unsettled by it.

In just 24 months, AI has jumped from performing at the level of an Olympiad silver medalist to generating original research mathematics. The speed of change has left many mathematicians questioning what their profession will look like in five years [6].

From Olympiad Silver to Original Discovery

The timeline tells the story. In 2024, Google DeepMind’s AlphaProof, working alongside AlphaGeometry 2, solved four of the six International Mathematical Olympiad problems, a score at silver-medal level [1]. By 2025, both Google and OpenAI had achieved gold-level IMO performance.

These were competition problems — hard, but contained.

The real shock came in 2026. In January, mathematician Ravi Vakil at Stanford published a proof involving sphere-like shapes and flag varieties that was developed “in conjunction with Google Gemini.” The AI had identified a missing mathematical structure that had gone unnoticed [2].

Later that month, Tony Feng at UC Berkeley published a paper in which the “core mathematical content” was generated entirely by Google’s Aletheia AI. The work bridged algebraic geometry and number theory in connection with the Langlands programme [2].

Then came May. OpenAI announced that a general-purpose reasoning model — the same kind that powers ChatGPT — had disproved the Erdős unit distance conjecture, an 80-year-old problem in combinatorial geometry [3].


The AI connected the geometry problem to algebraic number theory using the Golod-Shafarevich criterion, an approach from a completely different domain. Nine mathematicians, including Fields Medalist Tim Gowers, verified and refined the proof [3].

“This is the first result produced autonomously by an AI that I find interesting in itself,” said University of Toronto mathematician Daniel Litt [3].
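For readers new to the problem: the unit distance question asks how many pairs among n points in the plane can lie exactly distance 1 apart. The sketch below only illustrates that count on small square grids. It is not the construction or the argument in the OpenAI result, the function names are hypothetical, and restricting points to integer coordinates is a simplifying assumption that keeps the distance check exact.

```python
from itertools import combinations


def count_unit_distances(points):
    """Count unordered pairs of points at distance exactly 1.

    Points are (x, y) tuples with integer coordinates (an assumption of
    this sketch), so comparing squared distances avoids rounding error.
    """
    return sum(
        1
        for (x1, y1), (x2, y2) in combinations(points, 2)
        if (x1 - x2) ** 2 + (y1 - y2) ** 2 == 1
    )


def square_grid(k):
    """Return the k*k integer points (x, y) with 0 <= x, y < k."""
    return [(x, y) for x in range(k) for y in range(k)]


if __name__ == "__main__":
    for k in (2, 3, 4):
        pairs = count_unit_distances(square_grid(k))
        print(f"{k * k} points: {pairs} unit distances")
```

On a k-by-k grid the count is 2k(k-1), since each row and each column contributes k-1 adjacent pairs. The research question is how large this count can be made by arranging the points more cleverly.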

The Democratisation of Discovery

Something else is changing too. These breakthroughs are no longer confined to elite research labs with million-dollar compute budgets. Amateurs are getting in on the action.

In May 2026, Liam Price, who has no formal mathematics degree, used ChatGPT 5.5 Pro to solve Erdős problem 1196, an open question about primitive sets and the von Mangoldt function [1]. The solution was verified by Thomas Bloom at the University of Manchester, who maintains the official archive of Erdős problems.

More importantly, the AI-generated technique cascaded. Jared Lichtman at Stanford, who had spent much of his PhD on a related problem, extended Price’s approach with co-authors including Terence Tao to solve a related 60-year-old conjecture [1]. “This is perhaps one of the first examples of an AI-generated proof having downstream impacts,” Lichtman said [1].
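The two objects named in problem 1196 have short standard definitions: a primitive set is a set of integers in which no element divides another, and the von Mangoldt function equals log p when n is a power of a prime p and 0 otherwise. The sketch below simply encodes those definitions. It says nothing about the statement of problem 1196 or about Price's solution, and the function names are chosen here for illustration.

```python
import math


def is_primitive(nums):
    """Return True if no element of nums divides a different element."""
    items = sorted(set(nums))
    return all(
        b % a != 0
        for i, a in enumerate(items)
        for b in items[i + 1:]
    )


def von_mangoldt(n):
    """Return log(p) if n = p**k for a prime p and k >= 1, else 0.0."""
    if n < 2:
        return 0.0
    # Find the smallest prime factor p of n by trial division.
    p = 2
    while p * p <= n:
        if n % p == 0:
            break
        p += 1
    else:
        p = n  # no factor found up to sqrt(n), so n itself is prime
    # Strip every factor of p; n was a prime power iff nothing is left.
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0


if __name__ == "__main__":
    print(is_primitive([2, 3, 5, 7]))   # the primes form a primitive set
    print(is_primitive([2, 3, 4]))      # 2 divides 4, so not primitive
    print(von_mangoldt(8), von_mangoldt(12))
```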

Google DeepMind matched the pace. Days after the OpenAI announcement, its AlphaProof Nexus system solved nine separate open Erdős problems — two of which had been open for 56 years — and proved 44 additional OEIS conjectures [3].

The computing cost was “a few hundred dollars” per solution [3]. The system pairs Gemini with the Lean proof assistant, meaning every logical step is automatically verified by a compiler.
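To give a sense of what verification by a compiler means, here is a toy Lean 4 statement and proof, unrelated to the Erdős problems and far simpler than anything the system would produce. The theorem name is made up for illustration; `Nat.add_comm` is a lemma from Lean's standard library.

```lean
-- Claim: addition of natural numbers is commutative.
-- Lean accepts the file only if the term after `:=` really proves the claim.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, checked by direct computation.
example : 2 + 2 = 4 := rfl
```

If either proof were wrong, for instance if the second line claimed `2 + 2 = 5`, Lean would report an error instead of accepting it. That is the sense in which a formal proof needs no human referee for its logical steps.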

A Conference, a Bet, and a Room Half Empty

In April 2026, a group of mathematicians gathered in San Francisco. Organised by Jacob Tsimerman at the University of Toronto and Daniel Litt, the meeting brought together mathematicians and employees of OpenAI and Google. The goal was to create a better benchmark for measuring AI’s mathematical ability [1].


The mood was tense. Tsimerman laid out a vision of the future as a “slot machine” — press a button, get results.

He asked the room who would continue as mathematicians if that future arrived. Only half raised their hands [1].

Daniel Litt had bet in March 2025, at 3:1 odds, that AI would not autonomously produce top-tier research papers by 2030. In February 2026, he updated his assessment publicly: “I now expect to lose this bet” [5]. The pace of improvement had outstripped his own predictions.

The Two Camps: Hybrid or Autonomous

Opinions among mathematicians divide into two camps. The hybrid optimists see AI as a powerful collaborator.

Terence Tao at UCLA, one of the most influential mathematicians alive, described the partnership as tunneling through a mountain: “This guy’s got a shovel. This guy’s got a pickax. Together we can bore a tunnel” [2].

Ravi Vakil put it simply: “The future will be some combination of human and machine” [1]. Alex Kontorovich at Rutgers said he could now imagine projects that “would have taken me five years” being doable in a single summer [1].

The other camp is more cautious. Akshay Venkatesh at the Institute for Advanced Study warned that AI risks causing mathematicians to lose direct experience with mathematical understanding [2]. Joel David Hamkins, a logician, described being “overwhelmed by this ocean of slop” flooding journal systems [2].

Jeremy Avigad, director of the Institute for Computer-Aided Reasoning in Mathematics at Carnegie Mellon, was blunt: “We have to face up to the fact that AI will soon be able to prove theorems better than we can” [1].

What AI Still Cannot Do


For all the progress, the limitations are real. AI models produce both correct and incorrect mathematics, and telling the difference requires expert human judgment [5]. OpenAI itself could not consistently tell which of its own solutions in the First Proof benchmark were correct [4].

AI cannot yet build new theories from scratch. It has not invented anything like schemes or perfectoid spaces.

It struggles with concepts that emerged after its training data cutoff. And it has no sense of what makes a problem interesting — only humans can decide what is worth proving [5].

Mathematician Melanie Wood at Harvard warned against assuming that skills which correlate in humans will also correlate in AI [1]. Daniel Litt, even though he now expects to lose his bet, argued that mathematical understanding itself is something a model cannot provide: “What I actually care about is understanding things. A model can’t understand something for you” [1].

The First Proof project, a rigorous benchmark created by leading mathematicians, found that even the best AI systems could solve only 2 out of 10 research-level problems in their first round [4]. The problems were from the authors’ own unpublished work — the kind of daily tasks working mathematicians face, not exotic open questions.

A 2026 survey paper published on arXiv confirmed that while AI can now prove research-level theorems both formally and informally, the technology remains a tool for assisting mathematicians rather than replacing them [6].

What Comes Next

Several mathematicians working at AI companies believe a Millennium Prize Problem, such as the Riemann hypothesis or the Birch and Swinnerton-Dyer conjecture, could fall within “several years” [1]. Others caution that these are in a “wildly different class of difficulty” from anything AI has tackled so far.

What is clear is that mathematics is entering a period of transformation unlike anything in its history. The scarce skill, as Terence Tao observed, “is no longer finding the proof. It is choosing which problem is worth proving, and deciding what the result actually means” [3].

The golden age is here. Mathematicians are freaking out. And that might be exactly the right response.


SMI Science Desk
SMI Science Desk is the scientific and research editorial team at SoMuchInfo, focused on breakthroughs in physics, space exploration, artificial intelligence, and emerging scientific discoveries. The team analyzes findings from academic research, simulations, and institutional reports, transforming complex topics into clear, accessible insights. Content is curated from verified sources and enhanced using AI-assisted workflows, with human editorial review to ensure accuracy and clarity.
