RFdiffusion is a generative diffusion model for protein structure. It was released in 2023 by the Baker Lab at the University of Washington and adapts the denoising-diffusion framework used in image generation to the problem of building new protein backbones from scratch. You give it a target protein and a few hotspot residues, and it iteratively denoises random 3D coordinates into a plausible binder backbone.

What is the difference between RFdiffusion and RFdiffusion2?

RFdiffusion1 was trained primarily on structure-only objectives and works well for monomer and binder generation. RFdiffusion2 extends the model with better conditioning, improved enzyme active-site scaffolding, and stronger performance on multi-chain targets. RFdiffusion2 is the current state of the art for practical binder design workflows.

How does RFdiffusion differ from AlphaFold?

AlphaFold predicts a structure from a sequence. RFdiffusion goes the other direction: it generates a backbone without any starting sequence, then a partner model like ProteinMPNN writes a sequence that will fold into that backbone. The two tools are complementary, not competitors.

What are hotspot residues?

Hotspots are residues on the target protein that you want the new binder to make contact with. In practice you pick 3 to 6 residues that sit on the surface you want to block, such as an interface, an active site, or a drug pocket, and RFdiffusion biases the generated backbone to place atoms near those residues.

What are typical wet-lab success rates for RFdiffusion?

In the 2023 Nature paper, RFdiffusion plus ProteinMPNN plus AlphaFold filtering yielded binders with single-digit to low-double-digit percentage wet-lab success rates across several hard targets, a dramatic jump over prior de novo methods. More recent pipelines like BindCraft report higher hit rates by combining diffusion, co-folding, and multi-stage filtering.

Can I run RFdiffusion on SciRouter?

Yes. SciRouter hosts RFdiffusion as a managed GPU endpoint. You send a JSON payload with your target PDB, hotspot residues, and length range, and the API returns backbone coordinates. From there you can chain ProteinMPNN for sequence design and Boltz-2 for validation.

What is RFdiffusion? The Ultimate Guide to AI Protein Binder Design

Designing a protein that binds a new target used to take years of biophysics, combinatorial library screening, and crystal trays full of luck. In 2023, RFdiffusion changed that. It brought the same denoising-diffusion machinery that powers modern image generators to the problem of building protein backbones, and it did so with wet-lab success rates that were an order of magnitude higher than prior methods.

This guide walks through what RFdiffusion actually does, how the math works at a conceptual level, the jump from RFdiffusion1 to RFdiffusion2, how it compares to DeepMind's AlphaProteo, and how to try it yourself inside SciRouter's Binder Design studio.

Diffusion models, but for protein backbones

A denoising diffusion model learns to reverse a noise process. During training, the model is shown real examples (images in Stable Diffusion, protein structures in RFdiffusion) and watches them get gradually corrupted into pure Gaussian noise. Its job is to learn how to step backwards, turning noise back into something recognizable.

For proteins, the “image” is a set of 3D backbone coordinates. At the start of generation, every residue is placed at a random point in space. At each denoising step, the model predicts where each residue should actually live, and moves the structure a little closer to a plausible protein. After around 50 steps, you have a clean backbone that looks like something a ribosome might actually produce.
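To make the loop concrete, here is a toy sketch in plain Python. It is not RFdiffusion: the `predict_clean` function is a stand-in for the trained network and simply returns an ideal helix (an assumption made for illustration), the real model works on residue frames rather than bare points, and a real sampler also re-injects a small amount of noise at each step, which is omitted here. The sketch only shows the shape of the process: start from Gaussian noise, ask for a prediction, and move a little closer on each of 50 steps.

```python
import math
import random

def ideal_helix(n_residues):
    """Hypothetical 'clean' structure: C-alpha positions along an ideal alpha helix."""
    coords = []
    for i in range(n_residues):
        angle = math.radians(100.0 * i)  # about 100 degrees of rotation per residue
        coords.append((2.3 * math.cos(angle), 2.3 * math.sin(angle), 1.5 * i))
    return coords

def predict_clean(noisy_coords):
    """Stand-in for the trained denoiser (assumption): always predicts the helix."""
    return ideal_helix(len(noisy_coords))

def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))

def toy_denoise(n_residues=20, n_steps=50, seed=0):
    rng = random.Random(seed)
    # Start from pure Gaussian noise: every residue at a random point in space.
    coords = [tuple(rng.gauss(0.0, 10.0) for _ in range(3)) for _ in range(n_residues)]
    start_error = rmsd(coords, ideal_helix(n_residues))
    for step in range(n_steps, 0, -1):
        predicted = predict_clean(coords)
        # Move a fraction 1/step of the way toward the prediction;
        # the final step (step == 1) lands on the prediction itself.
        frac = 1.0 / step
        coords = [tuple(c + frac * (p - c) for c, p in zip(cur, pred))
                  for cur, pred in zip(coords, predicted)]
    return start_error, rmsd(coords, ideal_helix(n_residues))

start_error, end_error = toy_denoise()
print(f"RMSD to clean structure: {start_error:.1f} A before, {end_error:.2e} A after")
```

In the real model the prediction changes at every step because the network sees the current noisy structure, which is what lets conditioning information steer the result.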

Note

The magic of diffusion models is that you can steer the generation. Instead of producing any plausible protein, you can condition the model on a target and ask: give me a backbone that folds into a binder for this specific surface. That is exactly what RFdiffusion does for binder design.

How RFdiffusion generates a binder

A typical binder-design run has three inputs:

The target protein structure in PDB format. This is the thing you want to bind.
Hotspot residues. Usually 3 to 6 residues on the target surface where you want the binder to make contact.
A length range. Typically 60 to 120 residues. Short enough to express easily, long enough to form a stable fold.
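For readers who run the open-source release locally instead of a hosted endpoint, the three inputs map onto command-line overrides roughly as sketched below. The override names (`inference.input_pdb`, `contigmap.contigs`, `ppi.hotspot_res`, `inference.num_designs`) follow the binder-design examples in the public RFdiffusion repository, but treat them as assumptions and check them against the README of the version you install. The helper name `build_rfdiffusion_args`, the file name, and the residue numbers are purely illustrative.

```python
def build_rfdiffusion_args(target_pdb, target_chain, target_range,
                           hotspots, min_len=60, max_len=120, num_designs=10):
    """Assemble command-line overrides for a binder-design run (illustrative helper).

    target_range: (first, last) residue numbers of the target segment to keep.
    hotspots: residue numbers on the target chain that the binder should contact.
    """
    first, last = target_range
    if not first <= last:
        raise ValueError("target_range must be (first, last) with first <= last")
    if not 0 < min_len <= max_len:
        raise ValueError("binder length range must satisfy 0 < min_len <= max_len")
    outside = [r for r in hotspots if not first <= r <= last]
    if outside:
        raise ValueError(f"hotspots outside the target range: {outside}")

    # Contig string: keep the target segment, '/0' marks a chain break,
    # then request a new chain of min_len to max_len residues.
    contigs = f"[{target_chain}{first}-{last}/0 {min_len}-{max_len}]"
    hotspot_res = "[" + ",".join(f"{target_chain}{r}" for r in hotspots) + "]"
    return [
        f"inference.input_pdb={target_pdb}",
        f"contigmap.contigs={contigs}",
        f"ppi.hotspot_res={hotspot_res}",
        f"inference.num_designs={num_designs}",
    ]

# Hypothetical target: chain A, residues 1-150, three hotspots, 70-100 residue binder.
args = build_rfdiffusion_args("target.pdb", "A", (1, 150), [59, 83, 91], 70, 100)
print(" ".join(args))
```

When passed to the repository's inference script from a shell, the bracketed values need quoting so the shell does not interpret them.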

Given those inputs, RFdiffusion generates a new backbone that packs against the target surface at the hotspots without clashing with it. It does not produce a sequence. That is a separate step.

Step 1: Generate a backbone

RFdiffusion runs the diffusion process with the target structure frozen in place. The new chain starts as noise and gets progressively sculpted into a fold that matches the geometry of the hotspot region. The output is backbone atom coordinates only. No side chains.

Step 2: Design a sequence with ProteinMPNN

ProteinMPNN is a graph neural network that answers the inverse question: given a backbone, what sequence would fold into it? You feed it the RFdiffusion backbone and it writes several candidate sequences. These sequences are not guesses. They are the output of a model trained on tens of thousands of real PDB structures.

Step 3: Validate with AlphaFold or Boltz-2

Not every designed sequence will actually fold back into the intended backbone. The final step is to feed each candidate to a structure predictor like AlphaFold2 or Boltz-2 and measure how well the predicted structure matches the designed binder backbone, typically as a backbone RMSD alongside the predictor's own confidence scores. Candidates that fold correctly and show a plausible binding pose against the target survive into wet-lab testing.
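A minimal sketch of that filtering step, assuming the predictor's outputs have already been reduced to three numbers per design. The cutoffs (pLDDT above 80, interface pAE below 10, backbone RMSD below 1 angstrom) are in the spirit of thresholds commonly used for this kind of check, but they are assumptions here and should be tuned per target. The `Candidate` fields, the function names, and the scores are all made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    plddt: float          # predictor confidence for the binder, 0-100
    interface_pae: float  # predicted aligned error across the interface, in angstroms
    backbone_rmsd: float  # designed vs. predicted binder backbone, in angstroms

def passes_filter(c, min_plddt=80.0, max_interface_pae=10.0, max_rmsd=1.0):
    """Keep a design only if it clears all three (assumed) thresholds."""
    return (c.plddt > min_plddt
            and c.interface_pae < max_interface_pae
            and c.backbone_rmsd < max_rmsd)

def select_for_wet_lab(candidates, top_n=10):
    """Filter, then rank survivors by interface pAE (lower is better)."""
    survivors = [c for c in candidates if passes_filter(c)]
    survivors.sort(key=lambda c: c.interface_pae)
    return survivors[:top_n]

# Made-up scores for four hypothetical designs.
candidates = [
    Candidate("design_001", plddt=91.2, interface_pae=6.4, backbone_rmsd=0.7),
    Candidate("design_002", plddt=72.5, interface_pae=5.1, backbone_rmsd=0.6),   # low pLDDT
    Candidate("design_003", plddt=88.0, interface_pae=14.3, backbone_rmsd=0.9),  # high pAE
    Candidate("design_004", plddt=85.7, interface_pae=4.9, backbone_rmsd=0.8),
]
for c in select_for_wet_lab(candidates):
    print(c.name, c.interface_pae)
```

Ranking by interface pAE is one reasonable choice; other pipelines rank on a combined score or on interface energy instead.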

RFdiffusion1 vs RFdiffusion2

The original RFdiffusion paper established the method. RFdiffusion2, released in 2025, improved on it in several practical ways:

Better multi-chain conditioning. RFdiffusion2 is much better at respecting target geometry when the binder has to slot between multiple chains or contact a flexible loop.
Enzyme active-site scaffolding. The model can regenerate a catalytic triad or metal-binding pocket inside a new backbone, which is a huge step toward de novo enzyme design.
Improved in-silico success rates. More designs pass the AlphaFold self-consistency check, which usually translates into more designs that also work in the wet lab.
Better support for partial diffusion. You can keep part of an existing structure fixed and only diffuse a new region, which is useful for grafting motifs or humanizing scaffolds.

For most practical binder projects, RFdiffusion2 is the right starting point. SciRouter's managed RFdiffusion endpoint runs the latest version with the common presets already configured.

RFdiffusion vs AlphaProteo

AlphaProteo is DeepMind's 2024 entry into de novo binder design. It takes a different technical approach, it is closed source, and it is only available through DeepMind's hosted offering. A fair comparison looks like this:

Access. RFdiffusion is open source and freely available. AlphaProteo is not.
Reported hit rates. Both report high double-digit in-silico success rates and low double-digit wet-lab success rates, though direct head-to-head numbers are hard to pin down because the benchmarks differ.
Transparency. RFdiffusion publishes weights, training data, and code. You can fine-tune it on your own targets. AlphaProteo is a closed API.
Tooling ecosystem. RFdiffusion has the largest community, the most tutorials, and the most open-source follow-up work. BindCraft, ProteinMPNN, and Boltz-2 all integrate natively.

Wet-lab success rates in practice

The original RFdiffusion paper reported success rates between about 1 and 20 percent depending on target difficulty, where success means a designed protein that shows specific, high-affinity binding to the target by biolayer interferometry or similar assays. Targets with flat, featureless surfaces were harder. Targets with a well-defined pocket or concave interface were easier.

Modern pipelines stack several filters on top of raw RFdiffusion output, and the published results are much higher. BindCraft, the one-shot pipeline that SciRouter exposes directly, combines structure generation, co-folding validation, and ranking, and reports hit rates approaching 30 percent on some targets. You can read a hands-on tutorial in our BindCraft guide.

Warning

Wet-lab success rates depend heavily on the target. A surface with a deep, pre-formed pocket will always outperform a flat, exposed-loop target. Quoted hit rates are meaningful as method comparisons but should not be treated as a guarantee for your specific protein.
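A quoted hit rate can still be turned into a rough planning number. If each tested design is assumed to succeed independently with probability p (a simplification, since designs against the same target are correlated), the chance that at least one of n designs binds is 1 - (1 - p)^n. The sketch below works through that arithmetic; the function names are illustrative.

```python
import math

def prob_at_least_one_hit(hit_rate, n_tested):
    """P(at least one binder) among n_tested designs, assuming independent outcomes."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if n_tested < 0:
        raise ValueError("n_tested must be non-negative")
    return 1.0 - (1.0 - hit_rate) ** n_tested

def designs_needed(hit_rate, confidence=0.95):
    """Smallest number of designs with P(at least one hit) >= confidence."""
    if not 0.0 < hit_rate < 1.0:
        raise ValueError("hit_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_rate))

for rate in (0.01, 0.05, 0.20):
    print(f"hit rate {rate:.0%}: "
          f"P(hit in 10 designs) = {prob_at_least_one_hit(rate, 10):.2f}, "
          f"designs for 95% = {designs_needed(rate)}")
```

Under these assumptions, a 1 percent hit rate calls for about 299 tested designs to reach a 95 percent chance of at least one binder, while a 20 percent hit rate needs only 14.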

What to try next

If you have a target in mind and want to generate your first binder without running a GPU yourself, the fastest path is:

Open the Binder Design studio and pick your target from the PDB or upload a structure.
Mark 3 to 6 hotspot residues on the surface you want to block.
Run the managed RFdiffusion endpoint to produce 40 to 100 candidate backbones.
Chain ProteinMPNN for sequence design and Boltz-2 or AlphaFold for filtering.
Order the top ~10 sequences as gene fragments and express them.

Bottom line

RFdiffusion is the model that made generative protein design practical. It is open, well documented, battle-tested in real wet-lab campaigns, and available through hosted endpoints so you do not need your own A100 to use it. If you are getting started with binder design in 2026, this is the model to learn first.
