ProteinMPNN vs RFdiffusion vs Chroma: AI Protein Design Tools Compared

Compare ProteinMPNN, RFdiffusion, and Chroma for AI protein design. Inverse folding vs backbone generation vs generative design: architecture, use cases, and when to use each.

Ryan Bethencourt
May 6, 2026

Why AI Protein Design Tools Matter

Designing proteins with specific structures and functions is one of the most impactful applications of machine learning in biology. Three tools have defined the current landscape: ProteinMPNN for inverse folding, RFdiffusion for backbone generation, and Chroma for generative protein design. Each solves a different part of the design problem, and understanding when to use which tool can save weeks of experimental effort.

This guide provides a practical comparison of all three tools across architecture, use cases, compute requirements, and accessibility. Whether you are designing binders, engineering enzymes, or exploring novel folds, this comparison will help you choose the right tool for your project.

What Each Tool Does

ProteinMPNN: Inverse Folding (Sequence Design)

Published by the Baker Lab in 2022, ProteinMPNN solves the inverse folding problem: given a 3D protein backbone, it designs amino acid sequences predicted to fold into that structure. It uses a message-passing neural network that operates on the protein graph (residues as nodes, spatial contacts as edges) to predict sequences with high recovery rates.

  • Input: Protein backbone coordinates (PDB file or coordinates)
  • Output: Designed amino acid sequences with confidence scores
  • Architecture: Message-passing neural network on protein graphs
  • Published: September 2022 (Dauparas et al., Science)
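
The protein graph described above can be sketched in a few lines. This is a minimal illustration, assuming one C-alpha coordinate per residue and a k-nearest-neighbor rule for edges; the function name `knn_graph` and the value of `k` are chosen for this toy example and are not taken from the ProteinMPNN codebase.

```python
import numpy as np

def knn_graph(ca_coords: np.ndarray, k: int = 3) -> np.ndarray:
    """Return an (N, k) array; row i lists the k residues closest to residue i."""
    # Pairwise Euclidean distances between all C-alpha atoms, shape (N, N).
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Exclude self-edges by treating each residue as infinitely far from itself.
    np.fill_diagonal(dist, np.inf)
    # Indices of the k smallest distances in each row.
    return np.argsort(dist, axis=1)[:, :k]

# Toy backbone (hypothetical): five residues spaced 3.8 angstroms apart on a line.
ca = np.array([[3.8 * i, 0.0, 0.0] for i in range(5)])
neighbors = knn_graph(ca, k=2)
print(neighbors)
```

Each row of `neighbors` defines the edges along which a message-passing network would exchange information for that residue.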

RFdiffusion: Backbone Generation

Published by the Baker Lab in 2023, RFdiffusion generates novel protein backbone structures using denoising diffusion. Built on the RoseTTAFold architecture, it can create backbones from scratch or conditioned on specific constraints such as binding a target protein, scaffolding a functional motif, or incorporating symmetry.

  • Input: Design constraints (target protein, motif, symmetry, or unconditional)
  • Output: Novel protein backbone 3D coordinates
  • Architecture: Denoising diffusion on RoseTTAFold structure prediction network
  • Published: July 2023 (Watson et al., Nature)
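
To make the denoising diffusion idea concrete, here is a minimal sketch of the forward (noising) half of a standard Gaussian diffusion process applied to 3D coordinates. It is a simplification under stated assumptions: it noises positions only, whereas RFdiffusion also handles residue orientations, and the linear schedule and its values are illustrative rather than taken from the RFdiffusion code.

```python
import numpy as np

def make_alpha_bar(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> np.ndarray:
    """Cumulative signal-retention factors for a linear noise schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_coords(x0: np.ndarray, t: int, alpha_bar: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Sample x_t given x_0: sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((50, 3))   # stand-in for centered, scaled backbone coordinates
alpha_bar = make_alpha_bar(1000)

x_early = noise_coords(x0, 10, alpha_bar, rng)    # still very close to x0
x_late = noise_coords(x0, 999, alpha_bar, rng)    # essentially pure noise
```

A generative model is trained to reverse this process: starting from something like `x_late`, it removes noise step by step until a plausible backbone remains, optionally steered by the design constraints.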

Chroma: Generative Protein Design

Published by Generate Biomedicines in 2023, Chroma is a generative model that produces novel protein structures conditioned on high-level properties. It generates backbones with a diffusion process and then assigns sequence and side chains with a companion design network, so a single run yields both structure and sequence. Conditioners steer generation toward desired symmetry groups, shape constraints, or functional annotations.

  • Input: Property constraints (symmetry, substructure, natural language prompts)
  • Output: Full protein structures (backbone and sequence together)
  • Architecture: Score-based diffusion with a graph neural network denoiser
  • Published: November 2023 (Ingraham et al., Nature)

Head-to-Head Comparison

Problem Solved

  • ProteinMPNN: Sequence design for a fixed backbone. You already have the structure you want; you need a sequence that folds into it.
  • RFdiffusion: Backbone generation with structural constraints. You know the function (e.g., bind this target) but need a new structure that achieves it.
  • Chroma: Exploratory structure generation. You want to sample diverse protein architectures with optional property conditioning.

Experimental Validation

  • ProteinMPNN: Extensively validated. The original paper reports about 52% native sequence recovery on native backbones, compared with about 33% for Rosetta, and designed sequences have repeatedly folded as predicted in the lab. Widely adopted in dozens of published experimental studies.
  • RFdiffusion: Strong experimental results for binder design and motif scaffolding. De novo binders to targets like the insulin receptor and PD-L1 have been validated experimentally.
  • Chroma: Early experimental validation shows generated proteins express and fold, though fewer independent experimental studies have been published compared to ProteinMPNN and RFdiffusion.
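
Sequence recovery, cited above for ProteinMPNN, is the fraction of positions where the designed residue matches the native one. A minimal sketch follows; the function name and the two example sequences are hypothetical.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native residue."""
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)

# Hypothetical 10-residue example with two substitutions (T->S and I->L).
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
recovery = sequence_recovery(designed, native)
print(f"Recovery: {recovery:.0%}")
```

Recovery is a computational metric. A high value suggests the model has learned native-like sequence preferences, but it does not replace experimental confirmation that a design folds.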

Compute Requirements

  • ProteinMPNN: Lightweight. Runs on a single GPU in seconds or even on CPU. Batch design of hundreds of sequences is fast and inexpensive.
  • RFdiffusion: Moderate to heavy. Requires a GPU with 16GB+ VRAM. Each diffusion trajectory takes 1 to 5 minutes depending on protein size and number of steps.
  • Chroma: Heavy. Requires a high-end GPU (A100 recommended). Generation is slower than RFdiffusion for equivalent-sized proteins because each run produces both structure and sequence.
Tip: ProteinMPNN is the most compute-efficient of the three tools. If you already have a backbone structure (from crystallography, RFdiffusion, or another source), ProteinMPNN can design sequences in seconds via SciRouter's API with no GPU setup required.
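
To put the RFdiffusion figure of 1 to 5 minutes per trajectory in perspective, the short calculation below estimates total GPU time for a design campaign. It is a back-of-the-envelope sketch that assumes trajectories run one after another on a single GPU; the function name and the campaign size are illustrative.

```python
def gpu_hours_range(num_trajectories: int,
                    min_minutes: float = 1.0,
                    max_minutes: float = 5.0) -> tuple[float, float]:
    """Lower and upper bound on GPU-hours for sequential diffusion trajectories."""
    if num_trajectories < 0:
        raise ValueError("num_trajectories must be non-negative")
    low = num_trajectories * min_minutes / 60.0
    high = num_trajectories * max_minutes / 60.0
    return low, high

# Hypothetical campaign: 200 backbones for one target.
low, high = gpu_hours_range(200)
print(f"200 trajectories: {low:.1f} to {high:.1f} GPU-hours")
```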

Open Source and Licensing

  • ProteinMPNN: Fully open source (MIT license). Code and weights available on GitHub.
  • RFdiffusion: Open source under a BSD-style license, with code and weights available on GitHub. Check the LICENSE file in the repository for the current terms before commercial use.
  • Chroma: Code is open source under the Apache 2.0 license. Model weights are released by Generate Biomedicines under their own terms, so review them before commercial use.

When to Use Each Tool

Choose ProteinMPNN When:

  • You have a backbone structure and need sequences that fold into it
  • You are redesigning an existing protein for improved stability or expression
  • You need to design sequences for a scaffold from RFdiffusion or other backbone generators
  • You want fast turnaround: seconds per design, not minutes
  • You need experimentally reliable results with strong published validation

Choose RFdiffusion When:

  • You need a completely new protein backbone that binds a specific target
  • You are scaffolding a functional motif (e.g., placing a catalytic site in a new protein)
  • You want to design symmetric assemblies (dimers, trimers, cages)
  • You have a clear structural constraint but no starting backbone

Choose Chroma When:

  • You want to explore diverse protein architectures without specific structural constraints
  • You are conditioning generation on high-level properties (symmetry, shape class)
  • You need joint backbone-and-sequence generation in a single step
  • You are running exploratory research to discover novel protein topologies

Typical Protein Design Workflow

In practice, these tools are most powerful when used together. A common workflow chains backbone generation with sequence design and then validates the result with structure prediction:

  • Step 1: Generate a backbone with RFdiffusion (or start from an existing PDB structure)
  • Step 2: Design sequences for that backbone with ProteinMPNN
  • Step 3: Validate designed sequences by folding them with ESMFold or AlphaFold2
  • Step 4: Compare the predicted structure to the design target (RMSD check)
  • Step 5: Send top candidates to the lab for experimental validation
Note: SciRouter provides both ProteinMPNN and ESMFold as hosted API endpoints, covering Steps 2 and 3 of this workflow with no infrastructure to manage.
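
The RMSD check in Step 4 is usually computed after optimally superimposing the predicted structure onto the design target. Below is a minimal NumPy sketch of that calculation using the Kabsch algorithm. It assumes both structures are given as matched (N, 3) arrays of C-alpha coordinates in the same residue order; the function name and the test data are illustrative.

```python
import numpy as np

def kabsch_rmsd(P: np.ndarray, Q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    # Remove translation by centering both point sets on their centroids.
    P_c = P - P.mean(axis=0)
    Q_c = Q - Q.mean(axis=0)
    # The SVD of the covariance matrix gives the optimal rotation.
    H = P_c.T @ Q_c
    U, _, Vt = np.linalg.svd(H)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the remaining deviation.
    P_rot = P_c @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Q_c) ** 2, axis=1))))

# Synthetic check: a rotated and translated copy should give an RMSD near zero.
rng = np.random.default_rng(0)
design = rng.standard_normal((30, 3))
theta = np.deg2rad(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
predicted = design @ rot_z.T + np.array([5.0, -2.0, 1.0])
rmsd = kabsch_rmsd(design, predicted)
print(f"RMSD after superposition: {rmsd:.6f}")
```

In a real pipeline, `design` would hold the target backbone coordinates and `predicted` the coordinates parsed from the folded model. The cutoff for a "good" RMSD depends on the project and is not fixed by this sketch.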

Using ProteinMPNN via SciRouter API

SciRouter hosts ProteinMPNN as a managed API endpoint. You can design sequences for any backbone structure with a single API call, no GPU setup, no model download, and no dependency management.

Design sequences with ProteinMPNN via SciRouter
from scirouter import SciRouter

client = SciRouter(api_key="sk-sci-your-api-key")

# Design sequences for a backbone structure
result = client.design.proteinmpnn(
    pdb_id="1QYS",           # PDB ID or upload coordinates
    num_sequences=8,          # Number of sequences to design
    temperature=0.1,          # Lower = more conservative designs
    chain="A"                 # Target chain
)

for seq in result.sequences:
    print(f"Sequence: {seq.sequence[:40]}...")
    print(f"Score: {seq.score:.3f}")
    print(f"Recovery: {seq.recovery:.1%}")
    print()

You can chain ProteinMPNN with ESMFold to validate designs in the same script:

Design + validate pipeline
from scirouter import SciRouter

client = SciRouter(api_key="sk-sci-your-api-key")

# Step 1: Design sequences
designs = client.design.proteinmpnn(
    pdb_id="1QYS",
    num_sequences=4,
    temperature=0.1
)

# Step 2: Validate each design with ESMFold
for seq in designs.sequences:
    fold = client.proteins.fold(
        sequence=seq.sequence,
        model="esmfold"
    )
    print(f"Score: {seq.score:.3f} | pLDDT: {fold.mean_plddt:.1f}")
    if fold.mean_plddt > 80:
        print("  -> High confidence fold. Good candidate.")
Tip: This two-step design-and-validate pipeline runs entirely through SciRouter's API. No local GPU, no model installation, and no dependency conflicts. Start with the free tier to test your designs.
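
The `temperature` argument in the calls above controls how conservative the designs are. A minimal sketch of the usual mechanism, temperature-scaled softmax sampling over per-residue scores, is shown below. The logits are made-up numbers for four candidate amino acids, and whether the hosted endpoint maps its parameter in exactly this way is an assumption.

```python
import numpy as np

def residue_probabilities(logits: np.ndarray, temperature: float) -> np.ndarray:
    """Softmax over amino-acid logits after dividing by the sampling temperature."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled = scaled - scaled.max()   # subtract the max for numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum()

# Hypothetical scores for four candidate residues at one position.
logits = np.array([2.0, 1.0, 0.5, 0.0])
p_cold = residue_probabilities(logits, temperature=0.1)   # sharply peaked
p_warm = residue_probabilities(logits, temperature=1.0)   # more spread out
print("T=0.1:", np.round(p_cold, 3))
print("T=1.0:", np.round(p_warm, 3))
```

At low temperature nearly all probability mass sits on the top-scoring residue, so repeated sampling returns similar sequences. Raising the temperature spreads the mass and increases diversity at the cost of picking lower-scoring residues more often.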

Architecture Comparison at a Glance

Understanding the architectural differences helps explain why each tool excels at different tasks:

  • ProteinMPNN uses a message-passing neural network that propagates information along the edges of a protein structure graph. This graph-based approach is naturally suited to reasoning about local structural contacts and designing sequences that satisfy spatial constraints.
  • RFdiffusion adapts the RoseTTAFold structure prediction network for generative use via denoising diffusion. The model starts from random noise and iteratively refines it into a valid protein backbone, guided by optional conditioning signals like a binding target.
  • Chroma uses a score-based diffusion framework with a graph neural network denoiser to generate backbones, paired with a design network that assigns sequence and side chains. A single run therefore returns structure and sequence without a separate inverse folding tool, but it requires more compute.
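
The message-passing idea in the first bullet can be illustrated with a toy update rule. This is a sketch under simplifying assumptions: it uses fixed averaging where a real network such as ProteinMPNN uses learned functions and edge features, and the four-residue neighbor table is made up.

```python
import numpy as np

def message_passing_step(node_feats: np.ndarray, neighbors: np.ndarray) -> np.ndarray:
    """One round: each node mixes its own features with the mean of its neighbors' features."""
    gathered = node_feats[neighbors]    # shape (N, k, F): features of each node's neighbors
    messages = gathered.mean(axis=1)    # shape (N, F): aggregated incoming message
    return 0.5 * node_feats + 0.5 * messages

# Hypothetical 4-residue graph; row i lists the two neighbors of residue i.
neighbors = np.array([[1, 2],
                      [0, 2],
                      [1, 3],
                      [2, 1]])
# Put a signal on residue 0 only and watch it spread.
feats = np.array([[1.0], [0.0], [0.0], [0.0]])

after_one = message_passing_step(feats, neighbors)
after_two = message_passing_step(after_one, neighbors)
print(after_one.ravel())
print(after_two.ravel())
```

Residue 2 does not list residue 0 as a neighbor, so it sees nothing after one round and only picks up the signal in the second round through residue 1. Stacking rounds is how information about local contacts reaches more distant positions.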

Summary: Choosing the Right Tool

The choice between ProteinMPNN, RFdiffusion, and Chroma depends on where you are in the design process:

  • Have a backbone, need a sequence? Use ProteinMPNN.
  • Need a new backbone for a specific function? Use RFdiffusion.
  • Want to explore diverse protein architectures? Use Chroma.
  • Building a full pipeline? Use RFdiffusion for backbone generation, ProteinMPNN for sequence design, and ESMFold for validation.

ProteinMPNN is available now on SciRouter with free credits to get started. Read our ProteinMPNN tutorial for a step-by-step walkthrough, or explore the ProteinMPNN tool page to see full API documentation.

Sign up for a free API key and design your first protein sequence in under a minute. No GPU setup, no model downloads, no dependency management.

Frequently Asked Questions

What is the difference between ProteinMPNN and RFdiffusion?

ProteinMPNN is an inverse folding model: given a 3D backbone structure, it designs amino acid sequences that fold into that shape. RFdiffusion is a backbone generation model: it creates entirely new protein backbone structures from scratch using denoising diffusion. They solve complementary problems and are often used together in a pipeline where RFdiffusion generates the backbone and ProteinMPNN designs a sequence for it.

Which AI protein design tool is best for beginners?

ProteinMPNN is the most accessible starting point. It has clear inputs (a PDB structure) and outputs (designed sequences), runs on a single GPU or via API, and has the most published experimental validation. RFdiffusion and Chroma require more computational resources and expertise to use effectively.

Can I use ProteinMPNN without a GPU?

Yes. ProteinMPNN is lightweight enough to run on CPU, though GPU accelerates batch processing. Via SciRouter's API, you can run ProteinMPNN with a single API call and no local hardware at all.

What is Chroma used for in protein design?

Chroma is a generative model for protein design developed by Generate Biomedicines. It can create novel protein structures conditioned on properties like symmetry, shape, and function. It is best suited for exploratory design where you want to sample diverse protein architectures.

How do RFdiffusion and Chroma compare for backbone generation?

Both generate novel protein backbones using diffusion-based approaches, but they differ in conditioning. RFdiffusion excels at targeted tasks like binder design and motif scaffolding with specific structural constraints. Chroma is stronger at unconditional generation and property-conditioned sampling across diverse folds.
