AlphaFold 2: Highly Accurate Protein Structure Prediction with AlphaFold

Jumper, Evans, Pritzel et al. — DeepMind — Nature, July 2021

TL;DR: AlphaFold 2 predicts 3D protein structures from amino acid sequences at near-experimental accuracy, solving a 50-year grand challenge in biology. It dominated CASP14 with a median GDT of 92.4 (next best: ~75), predicted structures for 200+ million proteins, and earned the 2024 Nobel Prize in Chemistry for Demis Hassabis and John Jumper. The key insight: treat protein structure prediction as attention over evolutionary data, not as physics simulation.

Level 1 — Beginner


What is protein folding?

Proteins are molecular machines that do almost everything in your body: digest food, fight infections, carry oxygen, read DNA. Every protein starts as a chain of amino acids (think of a string of beads that come in 20 colors, one color per amino acid type). A typical protein has 100–1,000 of them strung together.

This chain doesn’t stay flat. Typically within microseconds to seconds, it folds into a specific 3D shape. That shape determines what the protein does. Get the shape wrong, and the protein malfunctions; misfolded proteins are implicated in diseases like Alzheimer’s, cancer, and cystic fibrosis.

The folding problem

Given only the sequence of amino acids (the “string of beads”), predict the final 3D shape. This has been one of biology’s hardest open problems for about 50 years. Experimental methods (X-ray crystallography, cryo-EM) cost $50K–$100K per protein and take months to years. There are ~200 million known protein sequences but only ~170,000 experimentally solved structures.

The CASP competition

CASP (Critical Assessment of Structure Prediction) is the Olympics of protein structure prediction, held every two years since 1994. Labs around the world try to predict structures for proteins whose real structures have been solved but not yet published. The metric is GDT (Global Distance Test): 0–100, where >90 is considered “experimental quality.”
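To make the metric concrete, here is a minimal sketch of the usual GDT_TS recipe: the percentage of residues whose Cα atom lies within 1, 2, 4, and 8 Å of the experimental position, averaged over the four cutoffs. It assumes the two structures are already superimposed (the real metric searches over many superpositions), and the deviations below are made-up numbers for illustration.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Approximate GDT_TS from per-residue C-alpha deviations (in angstroms).

    Assumes predicted and experimental structures are already superimposed;
    the real metric optimises the superposition for each cutoff.
    """
    if not distances:
        raise ValueError("need at least one residue")
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


# Hypothetical deviations for a 5-residue toy protein.
deviations = [0.5, 0.9, 1.5, 3.0, 9.0]
print(round(gdt_ts(deviations), 1))
```

A single badly placed residue (the 9.0 Å one) lowers the score at every cutoff, which is why scores above 90 require nearly every residue to be right.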

Entry	Median GDT
AlphaFold 2	92.4
Next best CASP14 entry	~75
Best before AlphaFold era	~60

AlphaFold 2 didn’t just win — it leapt past the experimental-quality threshold that the field thought was years away.

How does AlphaFold 2 work? (The intuition)

AlphaFold 2 has three key insights:

Insight 1: Evolution is the teacher

Your protein sequence has cousins across millions of species. By aligning these related sequences (a Multiple Sequence Alignment or MSA), you can spot patterns: “When position 10 changes, position 50 always changes too.” This co-evolution signal implies those positions are physically close in 3D — they need to change together to keep the protein functional.

Sequence 1:  A L G V D K ...   (human)
Sequence 2:  A L D V D K ...   (mouse)
Sequence 3:  S L D I E K ...   (fish)
Sequence 4:  A M G V D K ...   (bird)

Positions 1 and 4 co-vary:
  A↔V, A↔V, S↔I, A↔V
  → These positions are likely close in 3D space
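One classic way to quantify co-variation between two alignment columns is mutual information. The sketch below scores every column pair in the toy alignment above; it is an illustration of the signal, not what AlphaFold 2 computes internally (the network learns its own features), and with only four sequences the scores are far too noisy to trust.

```python
from collections import Counter
from itertools import combinations
from math import log2

# Toy alignment copied from the example above (first six columns only).
msa = ["ALGVDK", "ALDVDK", "SLDIEK", "AMGVDK"]


def entropy(symbols):
    """Shannon entropy (bits) of a list of hashable symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * log2(c / total) for c in counts.values())


def mutual_information(alignment, i, j):
    """Mutual information between alignment columns i and j (0-based)."""
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    return entropy(col_i) + entropy(col_j) - entropy(list(zip(col_i, col_j)))


scores = {
    (i, j): mutual_information(msa, i, j)
    for i, j in combinations(range(len(msa[0])), 2)
}
# Print the three highest-scoring column pairs using 1-based positions.
for (i, j), mi in sorted(scores.items(), key=lambda kv: -kv[1])[:3]:
    print(f"columns {i + 1} and {j + 1}: MI = {mi:.3f} bits")
```

A fully conserved column (such as the final K) always scores zero: a position that never changes carries no co-variation signal, however important it is structurally.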

MSA is purely linear sequence data — no 3D information. But the co-evolutionary patterns hidden within it encode structural information. AlphaFold 2’s job is to decode that signal.

Insight 2: Think in pairs, not just positions

Many earlier methods reasoned mostly about individual positions along the chain. AlphaFold 2 also maintains an explicit pair representation: a matrix tracking what every residue “knows” about every other residue. This is like having a giant spreadsheet where row 10, column 50 says “these two residues are probably 5Å apart and co-evolved strongly.”

Insight 3: Start with a “residue gas” and let it condense

Instead of trying to fold a chain step by step, AlphaFold 2 treats the residues as independent rigid frames (position + orientation) with no chain connectivity imposed: a “gas” in which every frame starts at the same point with the same orientation. The network then iteratively refines all frames simultaneously, moving them into their correct positions.
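A rigid frame is just a rotation matrix plus a translation vector, and refinement amounts to composing each frame with a small update. The sketch below shows that bookkeeping in plain Python; the update values are hypothetical, and in the real model they are predicted by the network for every residue at once.

```python
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
ORIGIN = (0.0, 0.0, 0.0)


def matmul(a, b):
    """3x3 matrix product."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3))
        for r in range(3)
    )


def rotate(rotation, vector):
    """Apply a 3x3 rotation matrix to a 3-vector."""
    return tuple(sum(rotation[r][k] * vector[k] for k in range(3)) for r in range(3))


def compose(frame, update):
    """Apply an update that is expressed in the frame's own local axes."""
    rotation, translation = frame
    d_rotation, d_translation = update
    moved = rotate(rotation, d_translation)
    return (
        matmul(rotation, d_rotation),
        tuple(t + m for t, m in zip(translation, moved)),
    )


# "Residue gas": every residue starts at the origin with identity orientation.
frames = [(IDENTITY, ORIGIN) for _ in range(4)]

# Hypothetical update: step 1 angstrom along the current local x axis,
# and turn the frame 90 degrees about its z axis.
QUARTER_TURN_Z = ((0.0, -1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
update = (QUARTER_TURN_Z, (1.0, 0.0, 0.0))

frame = frames[0]
for _ in range(2):
    frame = compose(frame, update)
print(frame[1])  # position after two updates
```

Because each step is taken along the frame's own axes, the second step goes in a different global direction than the first; this is what lets updates be expressed without reference to any global orientation.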

The architecture in one picture

Amino acid sequence
       ↓
  Database search (JackHMMER / HHblits)
       ↓
  Multiple Sequence Alignment (MSA)
       ↓
  ┌─────────────────────────┐
  │      EVOFORMER           │  ← 48 blocks of attention
  │  MSA repr ↔ Pair repr   │     (the core innovation)
  │  (rows × cols attention)  │
  └─────────────────────────┘
       ↓
  ┌─────────────────────────┐
  │   STRUCTURE MODULE       │  ← Invariant Point Attention
  │   “Residue gas” → 3D   │     (rotations/translations)
  └─────────────────────────┘
       ↓
  3D protein structure + confidence (pLDDT)

What is pLDDT?

AlphaFold 2 doesn’t just predict a structure — it tells you how confident it is for each residue. The predicted Local Distance Difference Test (pLDDT) ranges from 0 to 100:

pLDDT	Meaning
> 90	Very high confidence — trust this prediction
70–90	Good confidence — backbone reliable, some side-chain uncertainty
50–70	Low confidence — treat with caution
< 50	Very low — may be intrinsically disordered (no fixed structure)
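The bands in the table translate directly into a small lookup, which is handy when summarising a prediction. This is a sketch: the band names are shortened from the table, and the handling of scores that fall exactly on 50, 70, or 90 is an assumption, since the table does not say which side they belong to.

```python
from collections import Counter


def plddt_band(score):
    """Map a per-residue pLDDT score (0-100) to a confidence band."""
    if not 0 <= score <= 100:
        raise ValueError(f"pLDDT must be in [0, 100], got {score}")
    if score > 90:
        return "very high"
    if score >= 70:
        return "good"
    if score >= 50:
        return "low"
    return "very low"


# Hypothetical per-residue scores for a short stretch of a prediction.
scores = [95.2, 88.0, 71.5, 64.0, 45.0, 38.7]
print(Counter(plddt_band(s) for s in scores))
```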

Key takeaway

AlphaFold 2 largely solved single-chain structure prediction by treating it as an information extraction problem (mining co-evolutionary signals from related sequences found across databases of millions) rather than a physics simulation. The key was building the right attention architecture (Evoformer) to decode that evolutionary signal into 3D coordinates.

Quiz — Level 1

1. AlphaFold 2 uses Multiple Sequence Alignments (MSAs) as a primary input. What kind of information do MSAs contain?

MSAs are purely linear sequence data — rows of amino acid letters aligned across species. They contain no 3D information whatsoever. The co-evolutionary patterns in these alignments are what AlphaFold 2 uses to infer 3D structure.

2. Co-evolution in an MSA reveals that when position 10 mutates, position 50 always mutates too. What does this pattern most likely indicate?

Co-evolution means compensatory mutations: when one position changes, another must also change to maintain the physical contact between them. This is the primary signal AlphaFold 2 uses to infer which residues are spatially close.

3. The Structure Module starts with residues as a “gas” floating freely in space. Why does AlphaFold 2 use this approach instead of folding the chain step by step?

Sequential folding accumulates errors — if an early step is wrong, everything downstream breaks. The residue gas approach lets the network refine all positions at once, using global context from the Evoformer to guide each residue toward its correct location.

4. A researcher gets an AlphaFold 2 prediction with pLDDT of 45 for a stretch of 30 residues. What is the most reasonable interpretation?

pLDDT below 50 means very low confidence. This often indicates the region is intrinsically disordered (genuinely has no fixed structure), or that AlphaFold 2 lacks enough evolutionary data to predict it reliably.

5. Before AlphaFold 2, determining a single protein’s structure experimentally cost $50K–$100K and took months. What was AlphaFold 2’s most transformative impact?

AlphaFold 2 didn’t replace experiments (they’re still needed for validation and dynamics), but it democratized structural knowledge. The AlphaFold Protein Structure Database covers virtually all known protein sequences, turning structure prediction from a bottleneck into a commodity.

Level 2 — Intermediate


The Evoformer: where the magic happens

The Evoformer is a stack of 48 blocks with identical architecture but separate (unshared) weights, each updating two representations in tandem:

Representation	Shape	What it encodes
MSA representation	N_seq × N_res × 256	Per-sequence, per-position features — what each sequence “knows” about each position
Pair representation	N_res × N_res × 128	Pairwise relationship between every residue pair — distance, orientation, co-evolution signals

These two representations talk to each other every block through specific information pathways:

Outer product mean: MSA → Pair

This is the mechanism that converts evolutionary information into pairwise structural information:

For each pair of residue positions (i, j):
  1. Take column i from the MSA representation  (N_seq vectors)
  2. Take column j from the MSA representation  (N_seq vectors)
  3. For each sequence s, compute the outer product of its vector at i with its vector at j
  4. Average across all sequences
  5. Project to update pair[i][j]

Intuition: “What do all the sequences collectively say
about the relationship between position i and position j?”

This is where co-evolution gets directly injected into the pair representation. If positions i and j co-evolve strongly, their MSA columns will have correlated patterns that produce a distinctive outer product signature.
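The core of the operation fits in a few lines. The sketch below implements only steps 1 to 4 (gather, outer product, average) on a tiny made-up MSA representation with 2 channels; the learned projections, normalisation, and the final projection into the 128-channel pair representation are omitted, and the numbers are purely illustrative.

```python
def outer_product_mean(msa_repr, i, j):
    """Average over sequences of outer(msa_repr[s][i], msa_repr[s][j]).

    msa_repr is indexed as [sequence][residue][channel].
    Returns a channels x channels matrix for the residue pair (i, j).
    """
    n_seq = len(msa_repr)
    channels = len(msa_repr[0][i])
    mean = [[0.0] * channels for _ in range(channels)]
    for s in range(n_seq):
        a, b = msa_repr[s][i], msa_repr[s][j]
        for p in range(channels):
            for q in range(channels):
                mean[p][q] += a[p] * b[q] / n_seq
    return mean


# Hypothetical features: 2 sequences x 2 residues x 2 channels.
# The two positions always "disagree" in the same coordinated way.
toy_msa_repr = [
    [[1.0, 0.0], [0.0, 1.0]],  # sequence 0
    [[0.0, 1.0], [1.0, 0.0]],  # sequence 1
]
print(outer_product_mean(toy_msa_repr, 0, 1))
```

The result has weight only in the off-diagonal entries: whenever channel 0 is active at position 0, channel 1 is active at position 1, and vice versa. That coordinated pattern is the kind of signature the pair representation can pick up.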

Triangle multiplicative updates: enforcing geometric consistency

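Pairwise distances in any real 3D structure obey the triangle inequality: if residue A is close to B, and B is close to C, then A cannot be far from C. Standard attention over pairs has no built-in notion of this. Triangle updates build it into the architecture, pushing the pair representation toward geometrically consistent values: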

To update pair(i, j), consider ALL intermediate residues k:

“Outgoing edges”:  pair(i,k) × pair(j,k)  →  update pair(i,j)
“Incoming edges”:  pair(k,i) × pair(k,j)  →  update pair(i,j)

Intuition: “What does the rest of the protein tell me
about the relationship between i and j?”

   i ─── j
    \   /
     \ /
      k    ← intermediate residue provides geometric constraint

This is computationally O(N³) per block — for each pair (i,j), you sum over all k — which is expensive but essential for geometric consistency.
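Stripped of gating, normalisation, and learned projections, the two updates are sums over the third residue k. The sketch below uses a single scalar channel and a made-up contact map, so it shows the data flow and the cubic cost rather than the real layer.

```python
def triangle_update_outgoing(pair):
    """new[i][j] = sum over k of pair[i][k] * pair[j][k]."""
    n = len(pair)
    return [
        [sum(pair[i][k] * pair[j][k] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def triangle_update_incoming(pair):
    """new[i][j] = sum over k of pair[k][i] * pair[k][j]."""
    n = len(pair)
    return [
        [sum(pair[k][i] * pair[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical single-channel "contact" features for a 3-residue chain:
# residue 1 touches both residue 0 and residue 2.
pair = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
update = triangle_update_outgoing(pair)
print(update[0][2])  # evidence linking residues 0 and 2 via residue 1
```

Residues 0 and 2 have no direct entry in the input, yet the update gives them a nonzero value because they share the neighbour k = 1. The three nested loops over i, j, and k are where the O(N³) cost comes from.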

Template embedding

When homologous structures exist in PDB, AlphaFold 2 can use them as templates. Template features (backbone distances, torsion angles) are projected and added as a bias to the pair representation. Two of the five models use templates; three don’t.

Why not always use templates?

Templates from close homologs are very helpful. But templates from distant homologs can actually mislead the network, anchoring it to an incorrect fold. The template-free models avoid this risk entirely. Omitting templates also prevents the model from becoming dependent on template availability and helps when the target has a truly novel fold.

The FAPE loss: learning in 3D

The Frame Aligned Point Error (FAPE) is AlphaFold 2’s primary loss function, measuring structural accuracy in a rotation/translation invariant way:

For each pair of residues (i, j):
  1. Look at the predicted structure from residue i’s reference frame
  2. Look at the true structure from residue i’s reference frame
  3. Compute the distance between predicted and true position of j
  4. Average over all (i, j) pairs

Why frames? Two structures can be identical but rotated
differently in space. FAPE compares local geometry,
not global orientation, making it invariant to rigid-body
transformations.
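The invariance claim is easy to check numerically. The following is a simplified sketch of the FAPE idea: express every point in every residue's local frame, then average the distance between predicted and true local positions. It leaves out the clamping, the small epsilon, and the length scaling used in training, and the coordinates are invented for the example.

```python
from math import sqrt

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matvec(matrix, vector):
    return [sum(matrix[r][k] * vector[k] for k in range(3)) for r in range(3)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]


def to_local(frame, point):
    """Coordinates of a global point seen from frame (R, t): R^T (x - t)."""
    rotation, translation = frame
    shifted = [p - t for p, t in zip(point, translation)]
    transposed = [[rotation[k][r] for k in range(3)] for r in range(3)]
    return matvec(transposed, shifted)


def fape(pred_frames, pred_points, true_frames, true_points):
    """Mean local-frame position error over all (frame i, point j) pairs."""
    total = 0.0
    for pred_frame, true_frame in zip(pred_frames, true_frames):
        for pred_point, true_point in zip(pred_points, true_points):
            local_pred = to_local(pred_frame, pred_point)
            local_true = to_local(true_frame, true_point)
            total += sqrt(sum((a - b) ** 2 for a, b in zip(local_pred, local_true)))
    return total / (len(pred_frames) * len(pred_points))


def move_rigidly(frames, points, rotation, shift):
    """Apply one global rotation and translation to every frame and point."""
    new_points = [[m + s for m, s in zip(matvec(rotation, p), shift)] for p in points]
    new_frames = [
        (matmul(rotation, r), [m + s for m, s in zip(matvec(rotation, t), shift)])
        for r, t in frames
    ]
    return new_frames, new_points


# Toy "true" structure: three points, each carrying an identity-oriented frame.
true_points = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0]]
true_frames = [(IDENTITY, p) for p in true_points]

# "Prediction" = the same structure rotated 90 degrees about z and shifted.
quarter_turn_z = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
moved_frames, moved_points = move_rigidly(
    true_frames, true_points, quarter_turn_z, [5.0, -2.0, 1.0]
)
print(fape(moved_frames, moved_points, true_frames, true_points))
```

The rigidly moved copy scores (numerically) zero even though no global superposition was performed, whereas displacing a single point while leaving the frames alone gives a nonzero error.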

Recycling: iterative refinement

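AlphaFold 2 runs the entire Evoformer + Structure Module pipeline several times, feeding the output of each cycle back as input to the next. The published inference setup uses an initial pass followed by three recycling iterations, so four cycles in total:

Cycle 1: MSA + pair repr → Evoformer → Structure → 3D coords (draft 1)
              ↑                                              │
              └──────────── feed back pair + coords ─────────┘
Cycle 2: improved input → Evoformer → Structure → 3D coords (draft 2)
              ↑                                              │
              └──────────── feed back pair + coords ─────────┘
Cycle 3: further refined → Evoformer → Structure → 3D coords (draft 3)
              ↑                                              │
              └──────────── feed back pair + coords ─────────┘
Cycle 4: further refined → Evoformer → Structure → FINAL structure

Each cycle refines the structure using the same weights. During training, the loss is computed only on the final cycle and gradients are stopped between cycles, so only that last cycle is backpropagated. This keeps training cost close to that of a single pass.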
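The control flow of recycling is a plain loop in which each pass receives the previous pass's output. The sketch below shows only that loop; `toy_pass` is a hypothetical stand-in for the whole Evoformer + Structure Module, and the number of passes is left as a parameter.

```python
def run_with_recycling(features, model_pass, num_passes):
    """Run model_pass repeatedly, feeding each output into the next pass."""
    if num_passes < 1:
        raise ValueError("need at least one pass")
    previous = None  # nothing to recycle on the first pass
    for _ in range(num_passes):
        previous = model_pass(features, previous)
    return previous


def toy_pass(target, previous):
    """Hypothetical model: moves the current estimate halfway to the target."""
    estimate = 0.0 if previous is None else previous
    return estimate + 0.5 * (target - estimate)


for passes in (1, 2, 4):
    print(passes, run_with_recycling(8.0, toy_pass, passes))
```

With this toy model each extra pass halves the remaining error, which mirrors the qualitative behaviour of recycling: later passes start from a better draft and mostly fix details.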

Key takeaway

The Evoformer’s genius is the bidirectional information flow between MSA and pair representations. The outer product mean converts evolutionary signals to structural signals; triangle updates enforce geometric consistency; and recycling lets the network iteratively refine its predictions across multiple passes.

Quiz — Level 2

1. The outer product mean operation converts MSA information into pair information. What specific role does it play?

The outer product mean takes MSA columns for positions i and j, computes their outer product, and averages across sequences. This directly converts co-evolutionary correlations in the MSA into pairwise structural features.

2. Triangle multiplicative updates enforce geometric consistency in the pair representation. How do they update pair(i, j)?

Triangle updates use all intermediate residues k to update pair(i,j), enforcing the triangle inequality: if A is close to B, and B is close to C, the network must keep A–C distances consistent. This is O(N³) per block.

3. AlphaFold 2 initializes the Structure Module with residues as a “gas” (all at origin with identity orientations) rather than using template coordinates. What advantage does this provide?

Starting from a uniform “gas” means the network must derive the structure entirely from the Evoformer’s learned representations, avoiding any bias from potentially misleading template coordinates.

4. pLDDT scores correlate well with actual model quality. A region with consistently very high pLDDT (>90) means:

pLDDT >90 indicates very high confidence, and crucially, this confidence is well-calibrated — high pLDDT genuinely corresponds to high prediction accuracy. This makes pLDDT a reliable quality indicator for downstream use.

5. Why did conventional CNNs struggle with protein structure prediction compared to AlphaFold 2’s attention-based Evoformer?

Protein contacts can be hundreds of residues apart in sequence but adjacent in 3D. CNNs need many stacked layers to grow their receptive field that far. Attention relates any two positions in a single operation, making it naturally suited for long-range structural contacts.

Level 3 — Expert


Training curriculum: a three-stage process

AlphaFold 2 was trained in carefully staged phases:

Stage	Crop Size	Details
1. Initial training	256 residues	~170K PDB structures (clustered at 40% seq identity), 128-seq MSA clusters, ~300K steps. Learn basic fold recognition and attention patterns.
2. Fine-tuning	384 residues	Larger crops, structure violation loss added. Handle longer proteins, enforce physical constraints.
3. Self-distillation	384 residues	“Noisy student”: use the trained model to predict structures for sequences with no experimental data, then retrain on real + predicted structures.

Self-distillation safeguard

Only high-confidence predictions (pLDDT > 70) were used as pseudo-labels. Low-confidence predictions were discarded to prevent training on garbage. This expanded the effective training set well beyond the ~170K PDB structures, adding several hundred thousand predicted structures.

The full loss function

AlphaFold 2’s training uses six loss terms working together:

Loss	What It Teaches	Detail
FAPE backbone	Global fold accuracy	Frame Aligned Point Error on Cα atoms; clamped at 10Å to prevent outlier residues from dominating gradients
FAPE sidechain	Local rotamer accuracy	Same metric on all-atom positions, using side-chain reference frames
Distogram	Pairwise distances	Cross-entropy on binned Cβ–Cβ distances (Cα for glycine; 64 bins, 2–22Å); regularizer for pair representation
Masked MSA	Evolutionary understanding	BERT-like: mask 15% of MSA positions, predict the amino acids. Forces genuine evolutionary pattern learning.
Violation	Physical realism	Penalizes bond length/angle violations, steric clashes, chain breaks. Added only in Stage 2.
Experimentally resolved	Confidence calibration	Per-residue prediction of whether each atom has experimental coordinates in the PDB entry.
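As an illustration of the distogram row, the sketch below turns a distance into one of 64 equal-width bins between 2 and 22 Å and scores a predicted distribution with cross-entropy. The exact bin edges and edge-case handling here are assumptions made for the example and may differ from the released code.

```python
from math import log

NUM_BINS = 64
MIN_DIST = 2.0   # angstroms
MAX_DIST = 22.0  # angstroms


def distance_to_bin(distance):
    """Index of the equal-width bin holding this distance.

    Distances outside [MIN_DIST, MAX_DIST) fall into the first or last bin.
    """
    width = (MAX_DIST - MIN_DIST) / NUM_BINS
    index = int((distance - MIN_DIST) // width)
    return max(0, min(NUM_BINS - 1, index))


def cross_entropy(predicted_probs, target_bin):
    """Negative log-probability assigned to the true bin."""
    return -log(predicted_probs[target_bin])


# Hypothetical example: true distance 5.0 angstroms, uninformed uniform prediction.
target = distance_to_bin(5.0)
uniform = [1.0 / NUM_BINS] * NUM_BINS
print(target, round(cross_entropy(uniform, target), 3))
```

A uniform prediction costs log(64) ≈ 4.16 nats per pair, so any value below that means the pair representation already carries some distance information.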

FAPE clamping: a subtle design choice

FAPE = mean over pairs (i, j) of min(d_ij, 10Å), where d_ij is the per-pair error

Without clamping:
  One badly predicted residue 50Å away generates huge loss
  → Gradient dominated by one outlier
  → Network optimizes that residue at expense of everything else

With clamping at 10Å:
  Outliers contribute at most 10Å of loss each
  → Balanced gradients across all residues
  → Network improves overall structure, not just worst cases

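This is similar in spirit to robust regression losses such as Huber loss: outliers cannot dominate the total. One difference is that a clamped pair contributes no gradient at all, whereas Huber loss keeps a constant gradient for large errors.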
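A few lines of arithmetic show how much one outlier distorts the average. The per-pair errors below are hypothetical.

```python
def mean_error(errors, clamp=None):
    """Average per-pair error, optionally clamping each term first."""
    if clamp is not None:
        errors = [min(e, clamp) for e in errors]
    return sum(errors) / len(errors)


# Hypothetical per-pair errors in angstroms: three good pairs, one 50 A outlier.
errors = [1.0, 1.0, 1.0, 50.0]
print(mean_error(errors))              # unclamped
print(mean_error(errors, clamp=10.0))  # clamped at 10 angstroms
```

Unclamped, the mean is 13.25 Å and the single outlier accounts for 50 of the 53 Å total. Clamped, the mean is 3.25 Å and the outlier's share drops to 10 of 13 Å.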

Masked MSA prediction: the BERT connection

Original MSA:     A  L  G  V  D
                  A  L  D  V  D
                  S  L  D  I  E
                  A  M  G  V  D

Masked (15%):     A  L  [M] V  D
                  A  [M] D  V  [M]
                  S  L  [M] I  E
                  A  M  G  [M] D

Task: predict masked amino acids from context

This BERT-like auxiliary loss forces the Evoformer to genuinely understand evolutionary patterns rather than simply passing MSA features through without processing them.
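The data side of this task is simple to sketch: hide a random subset of positions and keep the hidden residues as prediction targets. The version below is a simplification that assumes plain independent masking at a fixed rate; the published training procedure is more elaborate. The function name and the `[M]` token are illustrative choices.

```python
import random

MASK = "[M]"


def mask_msa(msa, rate=0.15, seed=0):
    """Randomly mask MSA positions; return (masked rows, {(row, col): residue})."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for r, row in enumerate(msa):
        new_row = []
        for c, residue in enumerate(row):
            if rng.random() < rate:
                targets[(r, c)] = residue  # what the model must recover
                new_row.append(MASK)
            else:
                new_row.append(residue)
        masked.append(new_row)
    return masked, targets


toy_msa = ["ALGVD", "ALDVD", "SLDIE", "AMGVD"]
masked_rows, answers = mask_msa(toy_msa, rate=0.3, seed=1)
for row in masked_rows:
    print(" ".join(row))
print(len(answers), "positions to predict")
```

To fill in a masked position the model has to use the other rows (what do related species have here?) and the other columns (what usually accompanies these neighbours?), which is the evolutionary reasoning the loss is meant to encourage.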

The five-model ensemble

AlphaFold 2 trains five separate models with different configurations:

Models 1 & 2: Use template structures from PDB as additional pair bias
Models 3, 4, 5: Template-free, rely entirely on MSA + learned patterns

All five are run independently and the highest-confidence prediction (ranked by pLDDT) wins.
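Selecting the winner is a one-liner once each model's per-residue confidences are available. The sketch assumes ranking by mean pLDDT over all residues; the model names and scores are invented.

```python
from statistics import mean


def pick_best_model(plddt_by_model):
    """Return (best model name, {model name: mean pLDDT})."""
    if not plddt_by_model:
        raise ValueError("no predictions to rank")
    means = {name: mean(scores) for name, scores in plddt_by_model.items()}
    best = max(means, key=means.get)
    return best, means


# Hypothetical per-residue pLDDT from three of the five models.
predictions = {
    "model_1": [90, 80, 70],
    "model_3": [95, 85, 90],
    "model_5": [60, 75, 90],
}
best, means = pick_best_model(predictions)
print(best, means)
```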

MSA depth sensitivity

MSA Depth vs Accuracy:
  >1000 sequences:    Median GDT > 90  (near-experimental)
  100–1000:           Median GDT 70–85  (good, some details wrong)
  30–100:             Median GDT 50–70  (rough fold)
  <30:                Often fails completely
  Single sequence:    Near-random for most proteins

Co-evolution is the primary signal. With few sequences, the outer product mean has no data to extract — the pair representation stays uninformative. Orphan proteins (~10–15% of known families) remain AlphaFold 2’s biggest failure mode.

The “sophisticated fold recognition” argument

Skolnick (2021) argued AF2 is fundamentally a very sophisticated fold recognition algorithm: the library of single-domain protein folds in PDB is essentially complete — all possible domain topologies are already represented. AF2 has learned to map any sequence to the correct existing fold, then refine local details. This explains both its success (single domains) and its limitations (truly novel folds).

Known limitations

Limitation	Detail
Intrinsically disordered regions	~30% of human proteome has no fixed 3D structure. AF2 predicts one arbitrary conformation with low pLDDT, but cannot distinguish “genuinely disordered” from “insufficient data.”
Conformational states	Proteins often switch between states (e.g., active/inactive kinase). AF2 predicts the single dominant conformation in PDB training data. Alternative states are invisible.
Protein complexes	Designed for single chains. Multi-chain prediction requires AlphaFold-Multimer (2021) or AlphaFold 3 (2024).
Mutations & stability	Wild-type and mutant sequences often produce identical structures. AF2 is not a thermodynamic stability predictor.

Computational complexity

Pair representation:  N_res × N_res × 128  (quadratic in protein length)
Triangle attention:   O(N³) per Evoformer block
48 blocks × 4 cycles: 192 Evoformer block evaluations (initial pass + 3 recycles)

~1000 residues: ~16 GB GPU, minutes
~2000 residues: ~64 GB GPU, hours
>2500 residues: typically split into domains and predicted separately
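The scaling in the first two lines can be worked out directly. The sketch below assumes 4-byte (float32) values and counts a single copy of the pair representation; real memory use is much higher because many intermediate activations of that shape are alive at once.

```python
def pair_repr_bytes(n_res, channels=128, bytes_per_value=4):
    """Size of one N_res x N_res x channels tensor (float32 assumed)."""
    return n_res * n_res * channels * bytes_per_value


def triangle_terms(n_res):
    """Number of (i, j, k) triples summed in one triangle update, per channel."""
    return n_res ** 3


for n in (256, 1000, 2000):
    gib = pair_repr_bytes(n) / 2**30
    print(f"N={n}: pair tensor {gib:.2f} GiB, triangle terms {triangle_terms(n):.1e}")
```

Doubling the length from 1000 to 2000 residues quadruples the pair tensor (512 MB to about 2 GB for one copy) and multiplies the triangle work by eight, which is why very long proteins are usually split into domains.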

Key takeaway

AlphaFold 2’s training is a masterclass in engineering: staged curriculum, robust loss functions with FAPE clamping, BERT-like auxiliary losses for representation quality, self-distillation for data augmentation, and a five-model ensemble for robustness. The system’s accuracy depends critically on MSA depth, and its fundamental limitation is predicting static snapshots of dynamic proteins.

Quiz — Level 3

1. AlphaFold 2’s training uses a BERT-like masked MSA prediction loss alongside the structural FAPE loss. What does the masked MSA loss specifically enforce?

The masked MSA loss is a BERT-style self-supervised task: randomly mask 15% of MSA positions and predict them. This forces the Evoformer to learn genuine evolutionary patterns, not just pass features through.

2. AlphaFold 2’s self-distillation stage uses the trained model to predict structures for sequences without experimental data, then retrains on those predictions. What safeguard prevents error amplification?

The confidence threshold (pLDDT > 70) acts as a quality filter. Only predictions the model is already confident about become pseudo-labels, preventing low-quality predictions from corrupting the training set.

3. AlphaFold 2 trains five separate models — two with templates and three without. Why not always use templates?

Distant-homolog templates can actually hurt performance by anchoring the network to incorrect structural hypotheses. Template-free models avoid this risk entirely and are better for novel folds.

4. Skolnick (2021) argues AlphaFold 2 is fundamentally a “sophisticated fold recognition algorithm.” What evidence supports this?

Skolnick’s argument: the library of single-domain folds is complete — all topologies are already in PDB. AF2 has learned a sophisticated mapping from sequence to the correct existing fold, then refines local structure. This explains its strength (known folds) and weakness (truly novel topologies).

5. A researcher predicts a kinase structure with pLDDT > 90 everywhere, then discovers the kinase has both active and inactive conformations. What is the most accurate assessment?

AF2 predicts the single dominant conformation in PDB training data. Since ~80% of kinase structures are inactive, it likely predicts the inactive state with high confidence. The active state is invisible — high pLDDT reflects confidence in one state, not awareness of all states.

Level 4 — Frontier


AlphaFold 3: the paradigm shift

AlphaFold 3 (May 2024, Nature) isn’t an incremental update — it’s a fundamental architectural redesign co-developed by DeepMind and Isomorphic Labs.

Dimension	AlphaFold 2 (2020)	AlphaFold 3 (2024)
Scope	Single protein chains	Proteins + DNA + RNA + ligands + ions
Input tokens	Per-residue (one token = one residue)	One token per standard residue or nucleotide; one token per atom for ligands and modified residues
Trunk	Evoformer (MSA + pair repr)	Pairformer (pair + single repr; MSA processed separately upstream)
Structure module	IPA — deterministic, one output	Diffusion — denoises from random noise, can sample multiple structures
Confidence	pLDDT + pTM	pLDDT + pTM + PAE + PDE (predicted distance error)

Why diffusion? The key architectural insight

AlphaFold 2 Structure Module:
  Input: pair repr → IPA → ONE deterministic output
  Problem: one input → one structure, no structural uncertainty

AlphaFold 3 Diffusion Module:
  Input: pair repr + NOISE → denoise → predicted coordinates
  
  Different noise seeds → different structures
  Run 5 times → 5 candidates → rank by confidence
  
  Same paradigm shift as deterministic image encoders → 
  Stable Diffusion: one input, many possible outputs

What AF3 gained and lost

Gained: Protein-ligand docking (50% better than prior best), protein-nucleic acid complexes, antibody-antigen interactions, multiple structure samples per input.

Lost: Single-chain protein accuracy slightly worse than AF2 (traded monomer accuracy for generality); hallucination risk from diffusion; ~4.4% chirality errors in predicted ligand poses.

ESMFold: the language model approach (Meta FAIR)

While DeepMind built AF2 around MSA + co-evolution, Meta’s FAIR team asked: what if a protein language model already encodes structural information, and you don’t need MSA at all?

AlphaFold 2:  Sequence → MSA (minutes-hours) → Evoformer → Structure
ESMFold:      Sequence → ESM-2 (3B params) → Structure Module → 3D

ESMFold: no MSA, no database search, just the raw sequence
         ~60× faster than AlphaFold 2

ESM-2 was trained on 250M protein sequences with masked language modeling (exactly like BERT). During training, it implicitly learns co-evolutionary patterns, structural motifs, and long-range contacts.

Model	Monomer Accuracy	Speed	Orphan Proteins
AlphaFold 2	88%	Minutes–hours	Fails (needs MSA)
ESMFold	76%	Seconds	Works (no MSA needed)

ESMFold predicted 617 million metagenomic protein structures (the ESM Metagenomic Atlas), including many proteins with no close match among experimentally determined structures.

The inverse problem: from prediction to design

AlphaFold solves the forward problem: sequence → structure. The more valuable problem is the inverse: design a sequence that folds into a desired structure.

RFdiffusion (Baker Lab, 2023)

David Baker’s lab adapted diffusion models for protein backbone design:

1. Start with random noise in 3D coordinate space
2. Denoise using a fine-tuned RoseTTAFold network → novel backbone
3. Use ProteinMPNN to design a sequence for that backbone
4. Use AlphaFold 2 to VERIFY the sequence folds correctly
5. Synthesize in lab

Applications demonstrated:
  • De novo binders (therapeutic antibodies)
  • Symmetric nanocages (drug delivery)
  • Custom enzyme active sites (RFdiffusion2, April 2025)

AlphaFold 2 serves as the verification step in this pipeline — closing the design loop by predicting whether designed sequences actually fold into the intended structures.
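The verification step usually boils down to a self-consistency check: does the structure predicted for the designed sequence match the backbone that was designed, and is the predictor confident? The sketch below assumes the two backbones are already superimposed and uses hypothetical thresholds (2 Å RMSD, mean pLDDT of 80); real pipelines choose their own cutoffs and perform the superposition first.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return sqrt(squared / len(coords_a))


def passes_self_consistency(designed, predicted, plddt,
                            max_rmsd=2.0, min_mean_plddt=80.0):
    """True if the prediction is both close to the design and confident."""
    mean_plddt = sum(plddt) / len(plddt)
    return rmsd(designed, predicted) <= max_rmsd and mean_plddt >= min_mean_plddt


# Hypothetical two-residue backbones (C-alpha positions in angstroms).
designed = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0)]
print(rmsd(designed, predicted))
print(passes_self_consistency(designed, predicted, plddt=[90.0, 85.0]))
```

Designs that fail either test are discarded before any lab work, which is where most of the cost saving in this pipeline comes from.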

The open-source ecosystem

Model	Lab	Year	Key Feature
AlphaFold 3	DeepMind	2024	Gold standard; initially closed, later opened
Boltz-1	MIT	2024	Fully open-source, AF3-level accuracy, “Boltz-steering”
Chai-1	Chai Discovery	2024	Commercial; claims higher accuracy than AF3
OpenFold	Columbia	2022	Open AF2 reimplementation
RoseTTAFold	Baker Lab	2021–24	Independent architecture, extended to all-atom
ESMFold	Meta FAIR	2022	No MSA; 60× faster

Remaining open problems

Status	Problem
✅ Solved	Single-domain structure prediction; large-scale structural annotation (200M+ predictions)
🟡 Partial	Multi-domain proteins; stable protein complexes; antibody CDR-H3 loops
❌ Unsolved	Conformational ensembles (snapshot vs. movie); intrinsically disordered proteins; allosteric mechanisms; protein function prediction; folding pathways; membrane protein environments; post-translational modifications

The Nobel Prize (2024)

Laureate	Contribution
Demis Hassabis + John Jumper	Protein structure prediction (AlphaFold)
David Baker	Computational protein design (Rosetta, RFdiffusion)

The split is telling: Hassabis/Jumper = understanding (prediction), Baker = creation (design). Together they represent the full loop of programmable biology: predict structure → design new proteins → verify the designs by prediction.

Meta-lesson for AI

AlphaFold 2 wasn’t just a better model — it was a better formulation. The key innovations (treating structure as a graph with frames, using attention over evolution, learning to refine iteratively) came from deeply understanding the domain and finding the right inductive biases. Raw scale alone wouldn’t have worked. This lesson applies across all of AI — the best models come from understanding what the data fundamentally is, not just throwing more compute at it.

Quiz — Level 4

1. AlphaFold 3 replaced AF2’s deterministic Structure Module with a diffusion-based module. What is the most significant capability this enables?

Diffusion is generative: different noise seeds produce different structure samples. This lets AF3 explore structural uncertainty and produce multiple candidate conformations, unlike AF2’s single deterministic output. AF3 still uses MSA (processed upstream) and has a 4.4% chirality error rate.

2. ESMFold achieves ~76% monomer accuracy vs AlphaFold 2’s ~88%, but is 60× faster. What explains this performance gap?

ESMFold’s language model implicitly learns co-evolutionary patterns during pre-training, but this encoding is less precise than AF2’s explicit computation via MSA outer products and dedicated co-evolution attention mechanisms. The trade-off is speed (no MSA search needed) for accuracy.

3. RFdiffusion generates novel protein backbones, then ProteinMPNN designs sequences. Why is AlphaFold 2 still needed in this pipeline?

AF2 acts as the “oracle” to verify that designed sequences fold correctly. The pipeline is: RFdiffusion (design backbone) → ProteinMPNN (design sequence) → AlphaFold 2 (verify it folds into intended structure) → synthesize in lab.

4. Despite the Nobel Prize and AF3, the protein problem is considered “not solved.” Which limitation is fundamental?

The “snapshot vs movie” problem: proteins are dynamic molecular machines that constantly change shape. Current models produce a single static structure, missing conformational ensembles, allosteric transitions, and intrinsically disordered dynamics.

5. Boltz-1 (MIT, 2024) matches AF3 accuracy while being fully open-source and introduces “Boltz-steering.” What does this technique solve?

Boltz-steering guides the diffusion sampling process at inference time to improve physical plausibility — reducing steric clashes and chirality errors — without requiring retraining. It’s an inference-time intervention, not a training change.
