Post methodology: Claude 4.0 via custom Dust assistant @TDep-SubstackPost with the system prompt: Please read the text of the podcast transcript in the prompt and write a short post that summarizes the main points and incorporates any recent news articles that provide helpful context for the interview. Please make the post as concise as possible and avoid academic language or footnotes. put any linked articles inline in the text. refer to Podcast guests by their first names after the initial mention. Light editing and reformatting for the Substack editor.
Pushmeet Kohli, VP of Research at Google DeepMind, believes we're already living through the era of AI-accelerated scientific discovery—we just don't realize it yet. In this episode, Pushmeet details how DeepMind's latest breakthrough, AlphaEvolve, represents a potential watershed moment where AI systems can autonomously discover new algorithms and prove mathematical results that have stumped researchers for decades.
The key innovation isn't just throwing more compute at problems. It's what Pushmeet calls "the harness"—coupling large language models with rigorous evaluation systems that can distinguish between brilliant insights and meaningless hallucinations. It's an AI version of the scientific method: generate hypotheses, test them rigorously, keep what survives.
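To make the idea concrete, here's a minimal sketch of a generate-and-evaluate loop in that spirit. This is an illustrative toy, not DeepMind's actual system: `propose` stands in for an LLM suggesting variants, and the evaluator is a simple scoring function on a made-up objective.

```python
import random

def evaluate(candidate):
    # Automated evaluator: scores a candidate against a known objective.
    # Toy objective: maximize f(x) = -(x - 7)**2, optimum at x = 7.
    return -(candidate - 7) ** 2

def propose(parent):
    # Stand-in for an LLM proposing a variant of the best candidate so far.
    return parent + random.choice([-1, 1])

def harness(generations=100, seed=0):
    random.seed(seed)
    best = 0
    for _ in range(generations):
        candidate = propose(best)
        # Keep only proposals the evaluator scores higher; hallucinated
        # "improvements" that fail evaluation are silently discarded.
        if evaluate(candidate) > evaluate(best):
            best = candidate
    return best

print(harness())  # converges to the optimum, 7
```

The point of the pattern is that the generator can be wildly unreliable: as long as the evaluator is trustworthy, only verified improvements accumulate.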
From Templates to True Discovery
AlphaEvolve builds on DeepMind's earlier FunSearch system, but removes a crucial limitation. While FunSearch required researchers to provide templates specifying where in a program new algorithms should be discovered, AlphaEvolve can search entire algorithmic spaces across multiple programming languages—Python, C++, even Verilog for chip design. It's not just completing small functions; it's discovering whole new approaches to hard problems.
The results speak for themselves. Working with mathematician Terence Tao and others, the system has uncovered new mathematical insights, including previously unrecognized symmetries in complex problems like the cap set problem. When experts examined the AI-generated solutions, they could extract genuine mathematical insights that had been hiding in plain sight.
Multi-Agent Science
Perhaps more intriguingly, Google's co-scientist system shows how multiple AI agents can collaborate like a research team. Different instances of Gemini play distinct roles—hypothesis generator, critic, editor, reviewer—working together in shared memory. Pushmeet notes that this multi-agent approach produces results far beyond what any single model achieves, with the quality improving dramatically over days of computational refinement.
The intuition? AI models are often better at evaluating solutions than generating them from scratch. It's the same asymmetry we see throughout computer science: it's easier to verify that a solution is correct than to find it in the first place.
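That asymmetry is easy to see in a classic example like subset sum, sketched below: finding a subset that hits a target takes exponential search, while checking a proposed answer is a single pass. (The problem choice is mine for illustration; it isn't discussed in the episode.)

```python
from itertools import combinations

def verify(nums, target, subset):
    # Verification: cheap, a single membership and sum check.
    return set(subset) <= set(nums) and sum(subset) == target

def find(nums, target):
    # Generation: exhaustive search over all 2**len(nums) subsets.
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            if sum(combo) == target:
                return list(combo)
    return None

nums = [3, 9, 8, 4, 5, 7]
solution = find(nums, 15)          # expensive search
print(verify(nums, 15, solution))  # cheap check: True
```

Multi-agent setups exploit exactly this gap: a critic or reviewer agent playing the cheap `verify` role can reliably filter the output of a generator whose job is much harder.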
The Interpretability Advantage
Unlike black-box neural networks, AlphaEvolve produces human-readable code. This matters enormously for real-world deployment. As Pushmeet explains, Google's engineers would much rather debug interpretable algorithms than troubleshoot a neural network making data center scheduling decisions. When systems break, you need to understand why.
This echoes the AlphaFold story. Before 2021, determining a single protein structure could take years and cost $1 million. Now it's done in seconds. But AlphaFold succeeded not just because of accuracy—it also clearly communicates its confidence levels, telling researchers when to trust its predictions and when to be cautious.
What's Next
The bottlenecks going forward aren't computational—they're about validation and accessibility. How do you bridge digital discoveries to real-world validation? How do you make these tools usable by domain experts who aren't AI researchers?
Pushmeet believes AI will accelerate everything from materials science to energy research to healthcare. If we can discover room-temperature superconductors or unlock fusion, the geopolitical and economic implications are staggering. The question isn't whether AI will transform science—it's which domains will be transformed first.
For founders, the lesson is clear: the future belongs to systems that combine powerful generation with robust evaluation, embrace multi-agent collaboration, and prioritize interpretable outputs over black-box performance. The age of AI scientists isn't coming—it's already here.
Hosted by Sonya Huang and Pat Grady
Mentioned in this episode:
AlphaEvolve: DeepMind coding agent that designs scientific algorithms, powered by Gemini models
AlphaFold 2: Breakthrough protein structure model that won the Nobel Prize
FunSearch: More structured predecessor to AlphaEvolve
AI co-scientist: Google multi-agent AI system to be a virtual scientific collaborator
AlphaTensor: DeepMind model that found a better matrix multiplication algorithm
Cap set problem: Math challenge that Terence Tao describes as "perhaps, my favorite open problem."
Strassen algorithm: Long-standing matrix multiplication method that AlphaTensor improved on
Wake-sleep algorithm: Reference to DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning by Ellis and Tenenbaum at MIT