AlphaFold vs. Evo: Two AI Paradigms Tackle Biology
Isomorphic Labs and Arc Institute have contrasting yet complementary visions of how to use AI for biology and drug development. How do they compare?
Post methodology: Gemini 2.5 Pro with the prompts: 1) compare deepmind's alphafold and the arc institute's evo based on these two interview transcripts plus available public sources 2) can you do a write through of this content into an essay appropriate for an ai founder substack audience? Plus light human edits.
The intersection of AI and biology is transforming from an academic endeavor into a dynamic frontier, ripe with innovation and discovery. We now stand at the threshold of a new era where AI systems don't merely recognize patterns in biological data but actively decode, predict, and design the fundamental components of life itself. Two groundbreaking platforms are at the center of this revolution: AlphaFold 3 from Google DeepMind and Isomorphic Labs (the DeepMind spinout), and Evo 2 from Arc Institute.
Following the 2024 Nobel Prize in Chemistry that recognized the work behind AlphaFold, the launch of its successor AlphaFold 3, and the release of Arc Institute's ambitious Evo 2, we are witnessing two distinct technological approaches to biological complexity. Though both leverage cutting-edge AI, they represent fundamentally different philosophies: complementary approaches that together may unlock unprecedented capabilities in drug discovery, protein engineering and our fundamental understanding of living systems. For scientists and entrepreneurs building at this frontier, understanding how these approaches converge reveals not just technological strategy, but a roadmap to the future of medicine and biotechnology.
Max Jaderberg, Chief AI Officer of Isomorphic Labs, and Patrick Hsu, co-founder of Arc Institute, were both guests on Training Data, where they articulated their respective approaches. This post looks at how their work contrasts and at the potential of combining the two approaches.
AlphaFold 3: Mastering the Geometry of Life
DeepMind and its sibling company Isomorphic Labs are betting heavily on structure. AlphaFold's core mission, dramatically realized with AlphaFold 2 and expanded with AlphaFold 3, is to predict the precise 3D structures of life's molecules (proteins, DNA, RNA, and crucial small molecules, or ligands) and how they fit together.
Think of it as solving a dynamic, atomic-level 3D puzzle. As Max Jaderberg (leading Isomorphic's AI efforts) described, the goal is a "general drug design engine." AlphaFold 3 tackles this using a sophisticated diffusion-based architecture, directly generating the 3D coordinates of atoms in complex interactions. This is crucial for understanding how a potential drug physically docks with its target protein, a cornerstone of modern drug discovery.
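To make "directly generating the 3D coordinates" a little more concrete, here is a toy sketch of a diffusion-style sampling loop: start from random atom positions and repeatedly step toward a denoiser's estimate of the clean structure while the noise level shrinks. This is not AlphaFold 3's actual implementation. The function names, the noise schedule, and the `oracle_denoiser` (which simply returns a known reference structure in place of a trained network conditioned on the input molecules) are all assumptions made for illustration.

```python
import random


def oracle_denoiser(noisy_coords, reference):
    """Hypothetical stand-in for a trained network.

    A real model would predict clean coordinates from the noisy ones plus
    the input sequences; this toy version just returns a known answer.
    """
    return reference


def sample_coordinates(reference, n_steps=50, sigma_max=10.0, sigma_min=0.01, seed=0):
    """Deterministic diffusion-style sampler over per-atom [x, y, z] coordinates."""
    rng = random.Random(seed)
    # Geometric noise schedule running from sigma_max down to sigma_min.
    ratio = (sigma_min / sigma_max) ** (1.0 / (n_steps - 1))
    sigmas = [sigma_max * ratio**i for i in range(n_steps)]
    # Begin with pure noise: every coordinate drawn from N(0, sigma_max^2).
    coords = [[rng.gauss(0.0, sigmas[0]) for _ in atom] for atom in reference]
    for sigma, sigma_next in zip(sigmas, sigmas[1:]):
        estimate = oracle_denoiser(coords, reference)
        shrink = sigma_next / sigma
        # Keep only the fraction of the remaining error that matches the
        # next, lower noise level.
        coords = [
            [e + shrink * (c - e) for c, e in zip(atom, est_atom)]
            for atom, est_atom in zip(coords, estimate)
        ]
    return coords
```

Calling `sample_coordinates` on a small list of `[x, y, z]` positions returns coordinates that have converged close to the reference. In a real system, all of the difficulty lives in the learned denoiser that this toy replaces.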
Their training data? Decades of hard-won experimental structures from the Protein Data Bank (PDB). It's a strategy built on high-quality, curated, but relatively scarce (compared to sequence data) structural information. Their release strategy reflects this focus: the AlphaFold Server is open for academic use, fostering basic research, while the core technology underpins Isomorphic Labs' commercial partnerships, aiming to translate structural predictions into therapeutics. Jaderberg sees AlphaFold as just one vital piece – needing perhaps "half a dozen" more breakthroughs of similar magnitude to fully realize their drug design vision.
Related: Max Jaderberg of Isomorphic Labs on Sequoia’s Training Data podcast
Evo: Reading the Genome's Evolutionary Playbook
The Arc Institute, alongside collaborators at Stanford and NVIDIA, takes a different tack with Evo. Inspired by evolution as biology's unifying force, Evo treats the genome itself as the fundamental information layer. As Arc's Patrick Hsu puts it, Evo aims to connect biological sequence directly to biological function.
Think of it like an LLM for DNA. Evo is trained on vast swathes of public genomic data – trillions of bases from hundreds of thousands of organisms in the Sequence Read Archive (SRA). Using architectures like StripedHyena, optimized for extremely long contexts (hundreds of thousands of bases, and up to a million in Evo 2), Evo learns the patterns and "grammar" written into DNA by eons of evolution.
Its capabilities reflect this sequence-first approach: predicting the functional impact of genetic mutations (even subtle ones in non-coding regions), interpreting "variants of unknown significance," and even generating novel DNA sequences with desired properties – think designing entirely new CRISPR systems or gene regulatory networks. Evo isn't primarily focused on the static 3D shape; it's about the functional code embedded within the linear sequence. Arc's philosophy is reflected in its open-source release of Evo, aiming to provide a foundational tool to accelerate discovery across the entire scientific community.
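The "LLM for DNA" idea and the variant-scoring capability can be illustrated with a deliberately tiny stand-in. The sketch below fits a next-base model on example sequences, then scores a mutation by how much it changes the sequence's log-likelihood. Evo uses a large neural network with very long context, not the k-mer counts used here. The function names and the smoothing choice are assumptions, and only the scoring recipe (mutant likelihood minus reference likelihood) is meant to carry over.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACGT"


def train_kmer_model(sequences, k=3):
    """Count which base follows each length-k context (a toy next-base model)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(k, len(seq)):
            counts[seq[i - k:i]][seq[i]] += 1
    return counts


def log_likelihood(model, seq, k=3):
    """Sum of log P(base | previous k bases), with add-one smoothing."""
    total = 0.0
    for i in range(k, len(seq)):
        context_counts = model.get(seq[i - k:i], Counter())
        numerator = context_counts[seq[i]] + 1
        denominator = sum(context_counts.values()) + len(ALPHABET)
        total += math.log(numerator / denominator)
    return total


def variant_score(model, reference, position, alt_base, k=3):
    """Log-likelihood of the mutant minus that of the reference.

    A negative score means the model finds the mutant less plausible than
    the reference sequence.
    """
    mutant = reference[:position] + alt_base + reference[position + 1:]
    return log_likelihood(model, mutant, k) - log_likelihood(model, reference, k)
```

For example, after training on many repeats of a motif, `variant_score(model, reference, position, alt_base)` comes out negative for a substitution that breaks the motif and exactly zero when `alt_base` equals the reference base.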
Related: Patrick Hsu of Arc Institute on Sequoia’s Training Data podcast
Divergent Architectures, Complementary Insights
The contrast is stark:
Goal: AlphaFold seeks structural understanding for targeted intervention (drugs); Evo seeks functional understanding from the genome's "language."
Architecture: AlphaFold's diffusion models excel at generating 3D coordinates; Evo's long-context sequence models excel at genome-scale patterns.
Data: AlphaFold uses curated structural data; Evo uses massive, less structured sequence data.
Strategy: Isomorphic targets a specific vertical (drug design) with controlled access; Arc builds an open, foundational capability for broad use.
Yet, they are incredibly complementary. Imagine Evo generating the sequence for a novel enzyme, and AlphaFold predicting its 3D structure to guide experimental validation. The future likely involves tight integration between these sequence-level and structure-level views.
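One small piece of that handoff can be sketched concretely. A sequence model like Evo emits DNA, while protein structure prediction starts from an amino-acid sequence, so a generated coding region would first need to be located and translated. The sketch below uses the standard genetic code. The helper names are hypothetical, it only looks at the first ATG on the forward strand, and the structure-prediction call is left out because no specific API is assumed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG codon order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
STOP = "*"


def first_open_reading_frame(dna):
    """Return the DNA from the first ATG through its first in-frame stop codon.

    Returns None if there is no start codon or no in-frame stop.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if CODON_TABLE.get(dna[i:i + 3]) == STOP:
            return dna[start:i + 3]
    return None


def translate(orf):
    """Translate a coding sequence into one-letter amino acids, stopping at a stop codon.

    Codons containing characters other than A, C, G, T become "X".
    """
    protein = []
    for i in range(0, len(orf) - 2, 3):
        amino_acid = CODON_TABLE.get(orf[i:i + 3].upper(), "X")
        if amino_acid == STOP:
            break
        protein.append(amino_acid)
    return "".join(protein)
```

The resulting protein string is the kind of input a structure predictor would take next. In practice a pipeline would also need to handle both strands, multiple candidate reading frames, and filtering of implausible designs before any wet-lab validation.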
Lessons for AI Builders
For founders navigating the AI landscape, especially in complex scientific domains, the AlphaFold vs. Evo story holds key takeaways:
No One-Size-Fits-All Architecture: The problem dictates the architecture. Modeling physical 3D space (AlphaFold) requires different tools than modeling long-range dependencies in 1D sequences (Evo). Tailor your tech stack relentlessly.
Data Strategy is Paramount: Whether leveraging curated, high-value datasets (PDB) or vast, raw repositories (SRA), your approach to acquiring, cleaning, and utilizing data defines your model's potential. Jaderberg's insight that much biological data wasn't created for ML highlights the opportunity in generating ML-native datasets.
Foundation Models are Expanding: Biology is proving fertile ground for foundation models beyond text and images. Identify the core "language" or representation of your domain and consider whether a self-supervised, large-scale model can unlock new capabilities.
Open Source vs. Commercial Moat: The strategic decision of how to release powerful models has profound implications for adoption, community building, and value capture. Arc bets on ubiquity; Isomorphic bets on focused application.
Talent is Interdisciplinary: Both Jaderberg and Hsu emphasize the critical need for (and scarcity of) talent fluent in both AI/ML and the specific scientific domain. Building truly integrated teams is non-negotiable.
The Road Ahead
AlphaFold and Evo are landmark achievements, but they represent the early innings of AI's potential impact on biology. We're moving from understanding static snapshots (AlphaFold) and linear codes (Evo) towards modeling dynamic processes, cellular systems, and maybe even entire virtual organisms.
Whether the path lies primarily through structure, sequence, or a fusion of both, these two pioneering efforts demonstrate the power of ambitious AI applied to science's deepest questions. For AI founders, they offer not just inspiration, but concrete examples of how strategic choices in architecture, data, and philosophy can shape the future. The race to decode, and perhaps one day program, life itself is well and truly on.

