Inside DeepMind's AlphaEvolve and the AI co-scientist
AlphaEvolve discovers entirely new algorithms and proves mathematical results that have been open for decades. What's next for AI science?
This week we talk to Pushmeet Kohli, leader of AI for Science at DeepMind. His team has created AlphaEvolve, an AI system that discovers entirely new algorithms and proves mathematical results that have eluded researchers for decades. From improving 50-year-old matrix multiplication algorithms to generating interpretable code for complex problems like data center scheduling, AlphaEvolve represents a new paradigm where LLMs coupled with evolutionary search can outperform human experts. Pushmeet explains the technical architecture behind these breakthroughs and shares insights from collaborations with mathematicians like Terence Tao, while discussing how AI is accelerating scientific discovery across domains from chip design to materials science.
To dig deeper into what’s really happening in AI for science along with the backstory of AlphaEvolve and where this all might go next, here’s a bonus essay to enjoy.
AlphaEvolve: The Next Leap in AI-Powered Scientific Discovery
Redefining scientific discovery with AI that codes, experiments, and innovates beyond human imagination.
Jul 17, 2025
Post methodology: @GPT-4 via Dust: write an essay explaining what AlphaEvolve is, how it evolved from earlier research efforts, the scientific problems it can help solve, and the significance of combining LLMs with program search for the future of AI. Use these links as primary reference [transcript of and papers mentioned in Pushmeet Kohli’s Training Data episode]. Light editing and reformatting for the Substack editor.
The quest to automate scientific discovery has long been central to AI. From AlphaGo’s superhuman play in Go to AlphaFold’s revolution in structural biology, each milestone has inched AI closer to the role of a "co-scientist." The latest leap is AlphaEvolve, a Gemini-powered coding agent from Google DeepMind, designed to discover entirely new algorithms at a level previously unattainable by either humans or machines. AlphaEvolve’s core novelty is its combination of large language models (LLMs) with program search—a shift that may define the next era of computational science and engineering.
What is AlphaEvolve?
AlphaEvolve is an AI agent that pairs large language models—specifically, Google’s Gemini family—with evolutionary program search to automatically generate, evaluate, and evolve novel algorithms. Its core innovation lies in combining the creative generative ability of LLMs with rigorous automated evaluation, forming a loop that mirrors the scientific method: propose, test, refine, and repeat. Unlike previous models that merely completed code snippets or followed templates, AlphaEvolve can operate over entire programs, in multiple languages, and across broad scientific domains.
The system uses Gemini Flash and Pro models to rapidly generate candidate programs, while an automated evaluator tests their correctness and efficacy. The best-performing programs are then mutated and recombined, in a process inspired by biological evolution, to discover increasingly effective solutions (see AlphaEvolve Paper).
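To make the generate-evaluate-evolve loop concrete, here is a minimal Python sketch of LLM-guided evolutionary program search. The `llm_propose` and `evaluate` callables are placeholders for an LLM call and a domain-specific automated evaluator, and the simple tournament-plus-truncation selection scheme is chosen for illustration; this is a sketch of the general pattern, not AlphaEvolve’s actual implementation.

```python
import random

def evolve(seed_program, llm_propose, evaluate, generations=100, population_size=20):
    """Toy evolutionary program-search loop in the spirit of AlphaEvolve.

    llm_propose(parent_code) -> new candidate program (string), e.g. an LLM
        asked to mutate or recombine an existing solution.
    evaluate(code) -> numeric score from a trusted automated evaluator,
        or None if the program is incorrect or fails to run.
    """
    seed_score = evaluate(seed_program)
    assert seed_score is not None, "the seed program must pass the evaluator"
    population = [(seed_score, seed_program)]

    for _ in range(generations):
        # Tournament selection: pick the best of a few random candidates as the parent.
        candidates = random.sample(population, k=min(3, len(population)))
        parent_score, parent_code = max(candidates, key=lambda item: item[0])

        # The LLM acts as the mutation/recombination operator over full programs.
        child_code = llm_propose(parent_code)

        # Automated evaluation: only correct, runnable programs earn a score.
        child_score = evaluate(child_code)
        if child_score is not None:
            population.append((child_score, child_code))

        # Truncation selection: keep only the highest-scoring programs.
        population.sort(key=lambda item: item[0], reverse=True)
        population = population[:population_size]

    return population[0]  # (best_score, best_program)
```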
Evolution from Earlier Research Efforts
AlphaEvolve represents the culmination of several research threads at DeepMind:
AlphaGo and AlphaZero: Early agents applied reinforcement learning to surpass human expertise in games, showing that AI could master complex decision spaces.
AlphaFold: Brought AI into biology, predicting protein structures with unprecedented accuracy, democratizing and accelerating an entire field.
AlphaTensor: Extended the approach to algorithm discovery, notably finding new matrix multiplication algorithms after 50 years of stagnation.
FunSearch: The direct precursor to AlphaEvolve, FunSearch used an LLM plus an evaluator to discover new mathematical algorithms, but was limited to filling in small function templates and required large numbers of evaluations.
Pushmeet Kohli, leader of AI for Science at DeepMind, explains that FunSearch was a “first instantiation” of LLM-guided algorithm discovery, but AlphaEvolve removes many limitations: it searches entire programs, operates over multiple languages (C++, Python, Verilog), requires far fewer evaluations, and works in broader domains.
The architecture also draws on the co-scientist system, where multiple LLM agents play different scientific roles (hypothesis generator, critic, reviewer) and collaborate to refine ideas. This multi-agent setup further amplifies the creative and evaluative capabilities of AI (see AI co-scientist Paper).
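As a rough illustration of that multi-agent pattern, the sketch below chains generator, critic, and reviewer roles by prompting the same LLM with different instructions. The `ask_llm(role_prompt, content)` helper and the role prompts are hypothetical placeholders; the actual co-scientist system is considerably more elaborate.

```python
def co_scientist_round(ask_llm, hypothesis, n_rounds=3):
    """Toy multi-agent refinement loop: one LLM, three roles."""
    for _ in range(n_rounds):
        # Critic: attack the current hypothesis.
        critique = ask_llm(
            "You are a critical scientist. List flaws and missing evidence.",
            hypothesis,
        )
        # Generator: revise the hypothesis to address the critique.
        revision = ask_llm(
            "You are a hypothesis generator. Revise the hypothesis to address the critique.",
            f"Hypothesis:\n{hypothesis}\n\nCritique:\n{critique}",
        )
        # Reviewer: decide whether the revision is actually better.
        verdict = ask_llm(
            "You are a reviewer. Reply 'OLD' or 'NEW' for the stronger hypothesis.",
            f"OLD:\n{hypothesis}\n\nNEW:\n{revision}",
        )
        if "NEW" in verdict.upper():
            hypothesis = revision  # keep the improved version
    return hypothesis
```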
Scientific Problems AlphaEvolve Can Help Solve
AlphaEvolve’s breadth and depth enable it to tackle a wide range of scientific and engineering challenges:
Algorithm Discovery: It has already found faster and more efficient algorithms for classic computer science problems, including some that had seen no improvement in decades.
Mathematical Conjectures: By searching program space, AlphaEvolve has led to new mathematical insights and proofs, sometimes revealing patterns and symmetries overlooked by human experts.
Hardware Design: Capable of generating Verilog code, AlphaEvolve can optimize digital circuits and chip architectures, potentially transforming hardware design and manufacturing.
Data Center Scheduling: By generating interpretable code for scheduling jobs, it improves efficiency while maintaining transparency—unlike pure neural network controllers.
Material Science & Beyond: Whenever an automated evaluator is available (e.g., for simulating material properties), AlphaEvolve can optimize solutions, suggesting applications in chemistry, physics, and engineering.
Pushmeet Kohli emphasizes that the only strict requirement is a trustworthy evaluation function: “Wherever you can find an evaluator where you can say, ‘I really trust this evaluation scheme…’ you can use AlphaEvolve.” Human evaluators could be incorporated for subjective criteria like elegance or simplicity, though most current use cases rely on programmatic evaluation.
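To make the "trustworthy evaluator" requirement concrete, here is a sketch of a purely programmatic evaluator for a candidate online bin-packing heuristic (the kind of heuristic FunSearch evolved): it simulates the heuristic on fixed benchmark instances, rejects invalid or crashing candidates, and scores the rest by the number of bins used. The `choose_bin` interface and the scoring convention are assumptions made for illustration, not AlphaEvolve’s API.

```python
def evaluate_packing_heuristic(choose_bin, instances, capacity=1.0):
    """Score a candidate heuristic; higher is better, None means rejected.

    choose_bin(item, free_space) -> index of an open bin to place `item` in,
        or None to open a new bin. `instances` is a list of item-size lists
        used as a fixed benchmark.
    """
    total_bins = 0
    for items in instances:
        free_space = []                  # remaining capacity of each open bin
        for item in items:
            try:
                idx = choose_bin(item, list(free_space))
            except Exception:
                return None              # crashing candidates are rejected outright
            if idx is None:
                free_space.append(capacity - item)   # open a new bin
            elif isinstance(idx, int) and 0 <= idx < len(free_space) and free_space[idx] >= item:
                free_space[idx] -= item
            else:
                return None              # invalid placement: candidate rejected
        total_bins += len(free_space)
    # Fewer bins is better, so negate the count to get a maximization score.
    return -total_bins
```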
The Significance of LLMs + Program Search for the Future of AI
AlphaEvolve’s approach—pairing LLMs with program search and automated evaluation—marks a paradigm shift in AI for science:
Interpretability: Unlike neural policies, which are black boxes, AlphaEvolve produces explicit, human-readable code, making insights accessible and verifiable.
Generality & Scalability: Its multi-language, multi-domain capabilities mean it can be deployed across computing, engineering, and scientific disciplines, limited only by the availability of robust evaluators.
Mimicking the Scientific Method: The loop of generating, testing, and evolving ideas closely parallels human scientific discovery, but with tireless speed and creativity.
Democratization: As with AlphaFold, such tools can make advanced problem-solving accessible to scientists worldwide, regardless of resources.
Acceleration of Discovery: By automating the search for novel solutions, AlphaEvolve can help overcome bottlenecks in research, as seen in protein structure prediction or chip design, triggering new waves of innovation.
As Pushmeet notes, “We are already in that era of AI-accelerated scientific discovery.” The architecture of generator (LLM) and verifier (evaluation function), possibly expanded to multi-agent setups, is becoming the consensus for building powerful scientific AI agents. This holds promise for breakthroughs not just in mathematics and computing, but across the entire spectrum of science and engineering.
As these agents become more capable and accessible, they promise to accelerate, democratize, and even transform the process of discovery itself. As Pushmeet explains, “Once you have these agents which can go beyond human abilities in solving these problems, then the question becomes: Which problems do we solve next?” The future of science may well be defined by how we answer that question—together, with our AI co-scientists.