What are CDR regions and why do they matter?

Complementarity-determining regions (CDRs) are the hypervariable loops of an antibody that directly contact the antigen. There are six CDR loops: three on the heavy chain (H1, H2, H3) and three on the light chain (L1, L2, L3). CDR-H3 is the most variable and usually the most important for binding specificity, so redesigning CDR sequences is the main lever for engineering antibodies with better binding.

What is the difference between ImmuneBuilder and AntiFold?

ImmuneBuilder predicts the 3D structure of an antibody from its heavy and light chain sequences. AntiFold does the inverse: given an antibody structure, it designs new CDR sequences that are likely to fold into that structure. Together they form a design-validate loop — use AntiFold to generate CDR candidates, then ImmuneBuilder to check if they fold correctly.

Can I design antibodies against a specific antigen?

The tools shown here design CDR sequences for a given antibody scaffold; they do not evaluate binding to an antigen. For full antigen-specific design, you would need to combine them with docking or complex structure prediction (DiffDock or Boltz-2) to evaluate binding to your target antigen. SciRouter's Antibody Design Lab automates this pipeline.

How long does antibody structure prediction take?

ImmuneBuilder predictions typically complete in 10 to 30 seconds. AntiFold CDR design takes 5 to 15 seconds per set of sequences. Both run on GPU in the cloud and return results via the async job API.

Do I need paired heavy and light chain sequences?

ImmuneBuilder works best with paired heavy and light chain sequences. If you only have a single chain (for example, a VHH nanobody), ImmuneBuilder supports nanobody mode. AntiFold requires a structure as input, so you need to fold the antibody first.

How to Design Antibodies Programmatically with AI

Why Computational Antibody Design?

Therapeutic antibodies are among the most successful drug modalities in modern medicine, treating cancer, autoimmune diseases, and infectious diseases. Traditionally, discovering antibodies requires immunizing animals, screening hybridomas, and iterating through rounds of affinity maturation — a process that takes months and costs hundreds of thousands of dollars per campaign.

Computational antibody design accelerates this process by generating and evaluating candidate sequences in silico before ever touching a lab bench. AI models can predict antibody structures from sequence, design new CDR loops for better binding, and evaluate developability properties like aggregation and immunogenicity. Early-stage candidate generation and screening that used to take weeks of wet-lab work can now be done in minutes of compute time.

The Two-Step Pipeline: Fold, Then Design

The core workflow for programmatic antibody design uses two complementary AI models:

ImmuneBuilder: Predicts the 3D structure of an antibody from its heavy and light chain amino acid sequences. This gives you the scaffold structure that AntiFold needs as input.
AntiFold: An inverse folding model trained specifically on antibodies. Given a 3D antibody structure, it designs new CDR sequences that are predicted to fold into that structure. Because the backbone is held fixed, designs are meant to preserve the shape of the binding site, but binding to a target still has to be evaluated separately.

Together, these models form a design loop: predict structure with ImmuneBuilder, design new CDRs with AntiFold, then validate the new sequences by folding them again with ImmuneBuilder.

Prerequisites

You need Python 3.7+ and a SciRouter API key. Sign up at scirouter.ai/register for 500 free credits per month.

Install the SDK

pip install scirouter

Set your API key

export SCIROUTER_API_KEY="sk-sci-your-api-key-here"

Understanding Antibody Structure

Before writing code, it helps to understand the basic anatomy of an antibody. An IgG antibody has two heavy chains and two light chains. The variable regions at the tips of each chain contain the antigen-binding site, formed by six CDR loops:

CDR-H1, CDR-H2, CDR-H3: On the heavy chain variable domain (VH). CDR-H3 is the most diverse and typically most critical for antigen binding specificity.
CDR-L1, CDR-L2, CDR-L3: On the light chain variable domain (VL). These contribute to binding but are generally less variable than the heavy chain CDRs.

The framework regions between CDRs provide structural scaffolding. Effective antibody design modifies the CDR sequences while preserving the framework, maintaining the overall fold while tuning binding specificity.
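
To make the CDR/framework split concrete, here is a minimal sketch that locates the third CDR loop in a variable-domain sequence and swaps in a new loop while leaving the framework untouched. It relies on a simple heuristic rather than proper antibody numbering: the loop is assumed to sit between the last cysteine before the J-region motif and the motif itself (W-G-x-G in heavy chains, F-G-x-G in light chains), which corresponds to the IMGT CDR3 definition. The function names `find_cdr3` and `graft_cdr3` are hypothetical helpers, not part of the SciRouter SDK.

```python
import re
from typing import Tuple


def find_cdr3(seq: str, anchor: str = "W") -> Tuple[int, int]:
    """Return (start, end) slice indices of the CDR3 loop.

    Heuristic only: the loop is taken to lie between the last Cys before the
    J-region motif and the motif itself (anchor + "G.G").
    Use anchor="W" for heavy chains and anchor="F" for light chains.
    """
    motifs = list(re.finditer(f"{anchor}G.G", seq))
    if not motifs:
        raise ValueError("J-region motif not found")
    end = motifs[-1].start()
    cys = seq.rfind("C", 0, end)
    if cys == -1:
        raise ValueError("conserved cysteine not found")
    return cys + 1, end


def graft_cdr3(seq: str, new_loop: str, anchor: str = "W") -> str:
    """Replace the CDR3 loop with new_loop, keeping the framework unchanged."""
    start, end = find_cdr3(seq, anchor)
    return seq[:start] + new_loop + seq[end:]


# Trastuzumab variable regions, as used in this tutorial
heavy = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKG"
    "RFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS"
)
light = (
    "DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSR"
    "SGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIK"
)

h_start, h_end = find_cdr3(heavy, anchor="W")
l_start, l_end = find_cdr3(light, anchor="F")
print(heavy[h_start:h_end])  # SRWGGDGFYAMDY
print(light[l_start:l_end])  # QQHYTTPPT
```

The heuristic breaks when the loop itself contains a cysteine, which happens in some nanobodies, so treat it as an illustration only. A numbering tool such as ANARCI is the reliable way to delimit CDRs.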

Step 1: Predict Antibody Structure with ImmuneBuilder

Start with heavy and light chain sequences. Here we use trastuzumab (Herceptin), a well-characterized anti-HER2 antibody, as our starting scaffold:

Fold an antibody with ImmuneBuilder

from scirouter import SciRouter

client = SciRouter()

# Trastuzumab variable region sequences
heavy_chain = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKG"
    "RFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS"
)
light_chain = (
    "DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSR"
    "SGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIK"
)

# Predict the 3D structure
print("Folding antibody with ImmuneBuilder...")
structure = client.antibodies.fold(
    heavy_chain=heavy_chain,
    light_chain=light_chain,
)

print(f"Structure predicted. Confidence: {structure.mean_plddt:.1f}")
with open("trastuzumab.pdb", "w") as f:
    f.write(structure.pdb)
print("Saved to trastuzumab.pdb")

Note

ImmuneBuilder is specifically trained on antibody structures and is designed to model CDR loop conformations more accurately than general-purpose protein folding tools like ESMFold. It provides dedicated models for paired antibodies (ABodyBuilder2) and for nanobodies (NanoBodyBuilder2).

Step 2: Design New CDR Sequences with AntiFold

Now feed the predicted structure into AntiFold to generate new CDR sequences. AntiFold uses inverse folding — given the 3D backbone coordinates, it predicts amino acid sequences that are likely to fold into that structure. By restricting design to CDR regions, you get new binding loop sequences on the same structural scaffold:

Design CDR sequences with AntiFold

# Design new CDR sequences using the predicted structure
print("Designing CDR sequences with AntiFold...")
designs = client.antibodies.design(
    pdb=structure.pdb,
    num_sequences=10,        # generate 10 candidate sequences
    regions=["CDR-H3"],       # focus on CDR-H3 (most important for binding)
    temperature=0.2,          # lower = more conservative mutations
)

print(f"Generated {len(designs.sequences)} CDR-H3 variants:\n")
for i, seq in enumerate(designs.sequences):
    print(f"  Variant {i+1}: {seq.cdr_h3}")
    print(f"    Recovery: {seq.sequence_recovery:.1%}")
    print(f"    Log-likelihood: {seq.log_likelihood:.2f}")
    print()

Understanding the Output

cdr_h3: The designed CDR-H3 amino acid sequence.
sequence_recovery: The fraction of positions that match the original sequence. Higher recovery means more conservative design.
log_likelihood: The log-likelihood the model assigns to this sequence given the target backbone. Higher (less negative) values indicate better predicted structural compatibility.
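
As a rough illustration of what sequence_recovery measures, the sketch below computes the fraction of matching positions between an original loop and a designed loop of the same length. This assumes that fixed-backbone design keeps the loop length unchanged, so positions align one to one. The exact definition used by the API is not documented here, and the designed sequence in the example is hypothetical.

```python
def sequence_recovery(original: str, designed: str) -> float:
    """Fraction of positions where the designed residue equals the original."""
    if not original or len(original) != len(designed):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(original, designed))
    return matches / len(original)


original_h3 = "SRWGGDGFYAMDY"   # trastuzumab CDR-H3 (IMGT definition)
designed_h3 = "SRWGGDGFYVMDY"   # hypothetical design with one substitution
print(f"{sequence_recovery(original_h3, designed_h3):.1%}")  # 92.3%
```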

Step 3: Validate Designs by Re-Folding

A critical step in any computational design pipeline is validation. Fold each designed variant with ImmuneBuilder to verify that the new CDR sequences produce a well-folded structure:

Validate designed variants

# Validate top 3 designs by re-folding
top_designs = sorted(designs.sequences, key=lambda s: s.log_likelihood, reverse=True)[:3]

for i, design in enumerate(top_designs):
    # Construct full heavy chain with new CDR-H3
    new_heavy = design.full_heavy_chain

    print(f"Validating variant {i+1}...")
    validation = client.antibodies.fold(
        heavy_chain=new_heavy,
        light_chain=light_chain,  # keep light chain unchanged
    )

    print(f"  pLDDT: {validation.mean_plddt:.1f}")
    with open(f"variant_{i+1}.pdb", "w") as f:
        f.write(validation.pdb)

    # Flag designs with low overall confidence (mean pLDDT over the whole structure, not only the CDRs)
    if validation.mean_plddt < 70:
        print("  WARNING: Low mean pLDDT; the design may not fold correctly")
    else:
        print("  PASS: Good structural confidence")

Tip

Designs with mean pLDDT above 80 and CDR-H3 pLDDT above 70 are strong candidates for experimental validation. Discard any designs where the CDR region has pLDDT below 50.
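
The validation code above only checks the mean pLDDT of the whole structure. One way to apply a per-region threshold is to average the per-residue values over the CDR residues yourself. The sketch below does this by reading the B-factor column of C-alpha ATOM records, under the assumption that the returned PDB stores per-residue confidence in that column (a common convention, but not confirmed for this API). The function name `region_confidence` is hypothetical.

```python
def region_confidence(pdb_text: str, chain_id: str, first_res: int, last_res: int) -> float:
    """Mean B-factor of C-alpha atoms for residues first_res..last_res of one chain.

    Assumes the B-factor column (PDB columns 61-66) holds per-residue confidence.
    """
    values = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        # Fixed-width PDB fields: atom name 13-16, chain 22, residue number 23-26
        if line[12:16].strip() != "CA" or line[21] != chain_id:
            continue
        res_num = int(line[22:26])
        if first_res <= res_num <= last_res:
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no C-alpha atoms found in the requested range")
    return sum(values) / len(values)
```

For example, `region_confidence(validation.pdb, "H", 105, 117)` would cover CDR-H3 if the heavy chain is labeled `H` and residues are IMGT-numbered (IMGT places CDR3 at positions 105 to 117). Both the chain label and the numbering scheme are assumptions to check against your own output files.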

Complete Pipeline: From Sequence to Designed Variants

Here is the full end-to-end script that takes antibody sequences, folds the structure, designs CDR variants, validates them, and saves the results:

Full antibody design pipeline

import os
import sys
import json
from scirouter import SciRouter
from scirouter.exceptions import SciRouterError

api_key = os.environ.get("SCIROUTER_API_KEY")
if not api_key:
    print("Error: Set SCIROUTER_API_KEY")
    sys.exit(1)

client = SciRouter(api_key=api_key)

# Input sequences (trastuzumab VH/VL)
heavy = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFNIKDTYIHWVRQAPGKGLEWVARIYPTNGYTRYADSVKG"
    "RFTISADTSKNTAYLQMNSLRAEDTAVYYCSRWGGDGFYAMDYWGQGTLVTVSS"
)
light = (
    "DIQMTQSPSSLSASVGDRVTITCRASQDVNTAVAWYQQKPGKAPKLLIYSASFLYSGVPSRFSGSR"
    "SGTDFTLTISSLQPEDFATYYCQQHYTTPPTFGQGTKVEIK"
)

# Step 1: Fold
print("=== Step 1: Fold antibody ===")
try:
    structure = client.antibodies.fold(heavy_chain=heavy, light_chain=light)
except SciRouterError as e:
    print(f"Folding failed: {e}")
    sys.exit(1)
print(f"Folded. pLDDT: {structure.mean_plddt:.1f}")

# Step 2: Design CDRs
print("\n=== Step 2: Design CDR-H3 variants ===")
try:
    designs = client.antibodies.design(
        pdb=structure.pdb,
        num_sequences=10,
        regions=["CDR-H3"],  # this step designs CDR-H3 only, matching the label and the saved results
        temperature=0.2,
    )
except SciRouterError as e:
    print(f"Design failed: {e}")
    sys.exit(1)
print(f"Generated {len(designs.sequences)} variants")

# Step 3: Validate top candidates
print("\n=== Step 3: Validate top 5 candidates ===")
ranked = sorted(designs.sequences, key=lambda s: s.log_likelihood, reverse=True)[:5]

results = []
for i, design in enumerate(ranked):
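    try:
        validation = client.antibodies.fold(
            heavy_chain=design.full_heavy_chain,
            light_chain=light,
        )
    except SciRouterError as e:
        # Skip a variant that fails to fold instead of aborting the whole run
        print(f"  Variant {i+1}: folding failed ({e})")
        continue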
    passed = validation.mean_plddt >= 70
    results.append({
        "variant": i + 1,
        "cdr_h3": design.cdr_h3,
        "log_likelihood": design.log_likelihood,
        "plddt": validation.mean_plddt,
        "passed": passed,
    })
    status = "PASS" if passed else "FAIL"
    print(f"  Variant {i+1}: pLDDT={validation.mean_plddt:.1f} [{status}]")

    with open(f"antibody_variant_{i+1}.pdb", "w") as f:
        f.write(validation.pdb)

# Save summary
with open("design_results.json", "w") as f:
    json.dump(results, f, indent=2)

passed_count = sum(1 for r in results if r["passed"])
print(f"\nDone. {passed_count}/{len(results)} variants passed validation.")
print("Results saved to design_results.json")
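
Once the results are saved, a small post-processing step can turn them into a shortlist. The sketch below is one possible approach, not part of the SDK: it keeps records at or above a mean pLDDT cutoff (80 by default, following the tip above), ranks them by pLDDT and then log-likelihood, and drops repeated CDR-H3 sequences, since sampling at low temperature can return the same loop more than once. It assumes the record fields written by the pipeline script (`cdr_h3`, `log_likelihood`, `plddt`); the function name `shortlist` is hypothetical.

```python
from typing import Dict, List


def shortlist(results: List[Dict], min_plddt: float = 80.0) -> List[Dict]:
    """Rank validated designs and keep one record per unique CDR-H3."""
    ranked = sorted(
        (r for r in results if r["plddt"] >= min_plddt),
        key=lambda r: (r["plddt"], r["log_likelihood"]),
        reverse=True,
    )
    seen = set()
    unique = []
    for record in ranked:
        # Keep only the best-ranked record for each distinct loop sequence
        if record["cdr_h3"] not in seen:
            seen.add(record["cdr_h3"])
            unique.append(record)
    return unique
```

To use it on the pipeline output, load `design_results.json` with `json.load` and pass the resulting list to `shortlist`.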

Designing Multiple CDR Regions

The examples above focus on CDR-H3, but you can design multiple CDR regions simultaneously. This is useful when you want to optimize the entire binding interface:

Multi-CDR design

# Design all six CDR regions simultaneously
designs = client.antibodies.design(
    pdb=structure.pdb,
    num_sequences=20,
    regions=["CDR-H1", "CDR-H2", "CDR-H3", "CDR-L1", "CDR-L2", "CDR-L3"],
    temperature=0.15,  # more conservative for multi-region design
)

for seq in designs.sequences[:5]:
    print(f"H1={seq.cdr_h1} H2={seq.cdr_h2} H3={seq.cdr_h3}")
    print(f"L1={seq.cdr_l1} L2={seq.cdr_l2} L3={seq.cdr_l3}")
    print(f"Log-likelihood: {seq.log_likelihood:.2f}")
    print()

Note

Multi-CDR design explores a much larger sequence space. Use lower temperatures (0.1 to 0.2) to keep designs close to the original scaffold, or higher temperatures (0.3 to 0.5) for more diverse exploration. Always validate with re-folding.
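
To see why lower temperatures give more conservative designs, it helps to look at how temperature reshapes a probability distribution over amino acids at one position. The sketch below is a generic illustration of temperature-scaled softmax, not AntiFold's actual sampling code, and the scores are made-up numbers for three hypothetical residue choices.

```python
import math
from typing import List


def temperature_softmax(logits: List[float], temperature: float) -> List[float]:
    """Convert scores to probabilities, sharpening them as temperature drops."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical per-residue scores at one position (illustrative values only)
scores = [2.0, 1.0, 0.5]
for t in (0.1, 0.2, 0.5):
    probs = temperature_softmax(scores, t)
    print(t, [round(p, 3) for p in probs])
```

At low temperature almost all probability mass lands on the top-scoring residue, so sampled sequences stay close to the model's preferred choice. Raising the temperature spreads mass over the alternatives and yields more diverse designs.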

Nanobody Design

ImmuneBuilder also supports nanobodies (VHH domains), which are single-domain antibodies derived from camelid heavy-chain antibodies. Nanobodies are smaller than conventional antibodies and are often more stable and easier to produce:

Design nanobody CDRs

# Fold a nanobody (single heavy chain, no light chain)
nanobody_seq = (
    "QVQLVESGGGLVQAGGSLRLSCAASGRTFSSYAMGWFRQAPGKEREFVAAINWSSGSTYYADSVKG"
    "RFTISRDNAKNTVYLQMNSLKPEDTAVYYCAADSTIYASYYECGHGLSTGGYDYWGQGTQVTVSS"
)

nanobody_structure = client.antibodies.fold(
    heavy_chain=nanobody_seq,
    mode="nanobody",  # single-domain mode
)

# Design CDR variants for the nanobody
nanobody_designs = client.antibodies.design(
    pdb=nanobody_structure.pdb,
    num_sequences=10,
    regions=["CDR-H3"],
    temperature=0.2,
)

for i, seq in enumerate(nanobody_designs.sequences[:5]):
    print(f"Nanobody variant {i+1}: CDR-H3 = {seq.cdr_h3}")

What Running These Tools Locally Requires

For context, here is what you would need to set up ImmuneBuilder and AntiFold locally:

ImmuneBuilder: PyTorch, OpenMM, pdbfixer, ANARCI for numbering, ~2 GB model weights, GPU recommended
AntiFold: PyTorch Geometric, ESM library, custom trained weights, CUDA-compatible GPU required
Both require specific Python versions and careful dependency management
ANARCI (antibody numbering) has its own installation complexities with HMMER
Total setup time: 1 to 3 hours for an experienced engineer

The SciRouter API eliminates all of this. Both models are pre-deployed on GPU instances and accessible through a unified SDK.

Next Steps

Now that you can design antibodies programmatically, explore related capabilities on SciRouter. Use ImmuneBuilder for structure prediction and AntiFold for CDR design. To evaluate binding to a specific antigen, dock your designed antibodies with DiffDock or predict complex structures with Boltz-2.

For a fully automated pipeline that goes from antigen to ranked antibody candidates, check out SciRouter Labs — our end-to-end antibody discovery workflow.
