How to Dock Molecules with DiffDock API in Python

Step-by-step tutorial for molecular docking with DiffDock via the SciRouter API. Prepare protein and ligand inputs, submit a docking job, and interpret binding poses — all in Python.

Ryan Bethencourt
March 20, 2026

What is Molecular Docking?

Molecular docking predicts how a small molecule (a drug candidate) binds to a protein target. The output is a 3D binding pose — the position, orientation, and conformation of the ligand within the protein binding site. This is a fundamental step in computational drug discovery, used to screen compounds, understand binding mechanisms, and guide lead optimization.

Traditional docking tools like AutoDock Vina require you to define a search box, prepare receptor files in PDBQT format, and install complex software stacks. DiffDock replaces this entire pipeline with a diffusion generative model that predicts binding poses end-to-end, without a predefined search box.

Why Use the DiffDock API?

Running DiffDock locally requires cloning the repository, setting up a specific conda environment with PyTorch Geometric, torch-scatter, torch-sparse, and dozens of other dependencies, plus a GPU with at least 8 GB of VRAM. The environment setup alone takes 30 to 60 minutes and is notoriously fragile across different CUDA versions.

The SciRouter API eliminates all of this. You send a protein structure and a SMILES string over HTTPS, and you get back predicted binding poses with confidence scores. The inference runs on A100 GPUs in the cloud.

Prerequisites

You need Python 3.7 or later and a SciRouter API key. Sign up at scirouter.ai/register to get 500 free credits per month. Then install the SDK and export your key as an environment variable:

Install dependencies
pip install scirouter
Set your API key
export SCIROUTER_API_KEY="sk-sci-your-api-key-here"
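
Before running the examples, it can help to fail fast when the key is missing. The following is a minimal sketch that assumes only what the export line shows: the variable name SCIROUTER_API_KEY and an sk-sci- prefix. The helper name load_api_key is hypothetical and is not part of the SDK.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment or raise a clear error.

    `load_api_key` is a hypothetical helper, not part of the scirouter SDK.
    """
    env = os.environ if env is None else env
    key = env.get("SCIROUTER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the SCIROUTER_API_KEY environment variable")
    # Assumption: keys start with "sk-sci-", as in the export example.
    if not key.startswith("sk-sci-"):
        raise RuntimeError("SCIROUTER_API_KEY does not look like a SciRouter key")
    return key
```

The returned value can be passed as SciRouter(api_key=...), which is the constructor form used in the full example later in this guide.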

Step 1: Prepare Your Inputs

DiffDock needs two inputs: a protein structure in PDB format and a ligand as a SMILES string. If you already have a PDB file from experiment or another prediction tool, read it from disk. If you only have a protein sequence, fold it first with ESMFold:

Prepare protein and ligand inputs
from scirouter import SciRouter

client = SciRouter()

# Use ONE of the two options below; both assign protein_pdb.

# Option A: Load PDB from file
with open("target_protein.pdb") as f:
    protein_pdb = f.read()

# Option B: Fold the protein first (if you only have a sequence)
fold_result = client.proteins.fold(
    sequence="MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSY"  # wild-type KRAS, residues 1-40
)
protein_pdb = fold_result.pdb

# Define the ligand as a SMILES string
# Illustrative acrylamide-containing ligand. To dock sotorasib (Lumakras),
# a KRAS G12C inhibitor, copy its exact SMILES from PubChem.
ligand_smiles = "C=CC(=O)N1CCN(CC1)c1c(F)cc(NC(=O)c2cc(OC)c(N3CCN(C)CC3)nc2)c(F)c1"

Tip
You can find SMILES strings for known drugs on PubChem, ChEMBL, or DrugBank. For custom molecules, draw them in a structure editor and export as SMILES.
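
A malformed input still costs a job submission, so a cheap local check before calling the API may save credits. The sketch below is a rough pre-check, not a chemistry-aware validator: looks_like_pdb and smiles_is_balanced are hypothetical helper names, and they only catch common copy-paste damage such as a truncated string. For real validation a cheminformatics toolkit such as RDKit is the usual choice.

```python
def looks_like_pdb(text):
    """True if the text has at least one ATOM record (a cheap sanity check)."""
    return any(line.startswith("ATOM") for line in text.splitlines())


def smiles_is_balanced(smiles):
    """Cheap syntax check: brackets balanced and ring-closure labels paired.

    This is not a SMILES parser; it only catches obvious truncation or typos.
    """
    if not smiles or any(ch.isspace() for ch in smiles):
        return False
    depth = 0            # nesting depth of branch parentheses
    in_bracket = False   # inside a [...] atom, where digits are not ring labels
    open_rings = set()   # ring-closure labels seen an odd number of times
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if in_bracket:
            if ch == "[":
                return False
            in_bracket = ch != "]"
        elif ch == "[":
            in_bracket = True
        elif ch == "]":
            return False
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit() or ch == "%":
            if ch == "%":
                label = smiles[i + 1:i + 3]  # two-digit label such as %12
                if len(label) != 2 or not label.isdigit():
                    return False
                i += 2
            else:
                label = ch
            open_rings ^= {label}  # toggle: opened on first sight, closed on second
        i += 1
    return depth == 0 and not in_bracket and not open_rings
```

A passing check does not mean the molecule is chemically valid; it only means the string is unlikely to have been cut off mid-copy.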

Step 2: Submit the Docking Job

With inputs prepared, submit the docking job. The SDK handles the async job lifecycle internally — it submits the request, polls for completion, and returns the result:

Submit docking job
# Run DiffDock docking
docking_result = client.docking.diffdock(
    protein_pdb=protein_pdb,
    ligand_smiles=ligand_smiles,
    num_poses=5,  # number of binding pose predictions to return
)

print(f"Returned {len(docking_result.poses)} binding poses")
print(f"Top pose confidence: {docking_result.poses[0].confidence:.3f}")

Step 3: Interpret the Results

DiffDock returns multiple predicted binding poses, each with a confidence score and a ligand PDB structure. The poses are ranked by confidence, with the most likely binding mode listed first.

Process and save docking results
# Examine each predicted pose
for i, pose in enumerate(docking_result.poses):
    print(f"Pose {i+1}: confidence = {pose.confidence:.3f}")

    # Save the docked ligand pose as a PDB file
    with open(f"pose_{i+1}.pdb", "w") as f:
        f.write(pose.ligand_pdb)

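# Save the top-ranked pose alongside the protein.
# Drop the protein's END record so viewers keep reading into the ligand block.
protein_body = [line for line in protein_pdb.splitlines() if line.strip() != "END"]
with open("complex_top_pose.pdb", "w") as f:
    f.write("\n".join(protein_body) + "\n")
    f.write(docking_result.poses[0].ligand_pdb)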

print("Docking complete. Open complex_top_pose.pdb in PyMOL or ChimeraX.")

Note
Higher confidence scores indicate the model is more certain about the binding pose. In practice, examine the top 3 poses — DiffDock may place the ligand in alternative binding modes that are all physically reasonable.
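
One quick way to tell whether two poses are alternative binding modes or small variations of the same one is to compare where the ligand sits. The sketch below measures the distance between pose centroids. It assumes that each pose's ligand_pdb is standard fixed-column PDB text, and the function names are hypothetical helpers rather than SDK calls.

```python
import math


def ligand_coords(pdb_text):
    """Extract (x, y, z) tuples from ATOM/HETATM records using fixed PDB columns."""
    coords = []
    for line in pdb_text.splitlines():
        # x, y, z occupy columns 31-38, 39-46, 47-54 (1-based) in the PDB format.
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 54:
            coords.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return coords


def centroid(coords):
    """Arithmetic mean position of a non-empty list of (x, y, z) tuples."""
    if not coords:
        raise ValueError("no atom coordinates found")
    n = len(coords)
    return tuple(sum(c[k] for c in coords) / n for k in range(3))


def centroid_distance(pdb_a, pdb_b):
    """Distance in angstroms between the centroids of two ligand poses."""
    a = centroid(ligand_coords(pdb_a))
    b = centroid(ligand_coords(pdb_b))
    return math.sqrt(sum((a[k] - b[k]) ** 2 for k in range(3)))
```

With the result from Step 2, centroid_distance(docking_result.poses[0].ligand_pdb, docking_result.poses[1].ligand_pdb) gives a rough separation. A value of many angstroms suggests the two poses occupy different sites, while a small value suggests the same pocket with a different orientation or conformation.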

Full Working Example: End-to-End Docking Pipeline

Here is a complete script that takes a protein sequence and a ligand SMILES, folds the protein, docks the ligand, and saves the results:

Complete fold-then-dock pipeline
import os
import sys
from scirouter import SciRouter
from scirouter.exceptions import SciRouterError

api_key = os.environ.get("SCIROUTER_API_KEY")
if not api_key:
    print("Error: Set SCIROUTER_API_KEY environment variable")
    sys.exit(1)

client = SciRouter(api_key=api_key)

# Step 1: Fold the target protein
print("Folding protein...")
try:
    fold = client.proteins.fold(
        sequence="MTEYKLVVVGAGGVGKSALTIQLIQNHFVDEYDPTIEDSY",
        model="esmfold",
    )
except SciRouterError as e:
    print(f"Folding failed: {e}")
    sys.exit(1)
print(f"Fold complete. pLDDT: {fold.mean_plddt:.1f}")

# Step 2: Dock a ligand
print("Docking ligand...")
try:
    dock = client.docking.diffdock(
        protein_pdb=fold.pdb,
        ligand_smiles="C=CC(=O)N1CCN(CC1)c1c(F)cc(NC(=O)c2cc(OC)c(N3CCN(C)CC3)nc2)c(F)c1",
        num_poses=5,
    )
except SciRouterError as e:
    print(f"Docking failed: {e}")
    sys.exit(1)

# Step 3: Save results
print(f"Got {len(dock.poses)} poses")
for i, pose in enumerate(dock.poses):
    with open(f"pose_{i+1}.pdb", "w") as f:
        f.write(pose.ligand_pdb)
    print(f"  Pose {i+1}: confidence = {pose.confidence:.3f}")

with open("protein.pdb", "w") as f:
    f.write(fold.pdb)

print("Done. Visualize protein.pdb + pose_1.pdb together in PyMOL.")
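
The script prints the mean pLDDT but does not act on it. Docking into a poorly predicted structure is unlikely to be informative, so a confidence gate before the docking step may be worthwhile. The sketch below recomputes a mean pLDDT from the PDB text itself, which also works for structures loaded from disk. It assumes the file stores pLDDT in the B-factor column, as ESMFold and AlphaFold outputs usually do, and that the scale is 0 to 100; some tools write 0 to 1 instead. The function name and the cutoff of 70 (a commonly used threshold for a confident prediction) are illustrative choices.

```python
def mean_bfactor_ca(pdb_text):
    """Mean of the B-factor column over C-alpha atoms.

    For ESMFold/AlphaFold-style files this column usually carries pLDDT.
    """
    values = []
    for line in pdb_text.splitlines():
        # Atom name is in columns 13-16, B-factor in columns 61-66 (1-based).
        if line.startswith("ATOM") and line[12:16].strip() == "CA" and len(line) >= 66:
            values.append(float(line[60:66]))
    if not values:
        raise ValueError("no C-alpha ATOM records with a B-factor found")
    return sum(values) / len(values)


def confident_enough(pdb_text, cutoff=70.0):
    """True if the mean C-alpha B-factor (assumed to be pLDDT, 0-100) meets the cutoff."""
    return mean_bfactor_ca(pdb_text) >= cutoff
```

In the pipeline above, a check such as confident_enough(fold.pdb) could sit between Step 1 and Step 2 and skip docking when it returns False.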

Virtual Screening: Docking Multiple Ligands

The real power of API-based docking is throughput. Instead of docking one molecule at a time, you can screen hundreds of compounds against the same target using concurrent requests:

Screen multiple ligands in parallel
from concurrent.futures import ThreadPoolExecutor, as_completed
from scirouter import SciRouter

client = SciRouter()

# Load protein structure
with open("target.pdb") as f:
    protein_pdb = f.read()

# List of candidate ligands
ligands = {
    "aspirin": "CC(=O)Oc1ccccc1C(=O)O",
    "ibuprofen": "CC(C)Cc1ccc(cc1)C(C)C(=O)O",
    "caffeine": "Cn1c(=O)c2c(ncn2C)n(C)c1=O",
    "acetaminophen": "CC(=O)Nc1ccc(O)cc1",
}

def dock_one(name, smiles):
    result = client.docking.diffdock(
        protein_pdb=protein_pdb,
        ligand_smiles=smiles,
        num_poses=3,
    )
    return name, result.poses[0].confidence

results = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {pool.submit(dock_one, n, s): n for n, s in ligands.items()}
    for future in as_completed(futures):
        name, confidence = future.result()
        results[name] = confidence
        print(f"{name}: top-pose confidence = {confidence:.3f}")

# Rank by confidence
ranked = sorted(results.items(), key=lambda x: x[1], reverse=True)
print("\nRanking:")
for rank, (name, conf) in enumerate(ranked, 1):
    print(f"  {rank}. {name} ({conf:.3f})")
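
In the loop above, a single failed job raises inside future.result() and stops the whole screen. For larger libraries it may be preferable to record failures and keep going. The sketch below shows one way to do that. The screen function is a hypothetical helper, and dock_fn stands for any callable that takes a SMILES string and returns a score, for example a thin wrapper around client.docking.diffdock.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def screen(ligands, dock_fn, max_workers=4):
    """Dock every ligand with dock_fn(smiles) -> score; collect failures separately.

    Returns (ranked, failures): ranked is a list of (name, score) sorted from
    highest to lowest score, failures maps ligand name to the error message.
    """
    scores, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(dock_fn, smi): name for name, smi in ligands.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                scores[name] = future.result()
            except Exception as exc:  # one bad ligand should not stop the screen
                failures[name] = str(exc)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked, failures
```

Keeping the failures in their own dictionary makes it easy to retry only those ligands later instead of rerunning the full list.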

Comparison: Local DiffDock vs API

Local Setup

  • Clone the DiffDock repository and resolve submodules
  • Create a conda environment with Python 3.9 and specific PyTorch version
  • Install PyTorch Geometric, torch-scatter, torch-sparse, torch-cluster (version-locked to CUDA)
  • Download pretrained model weights
  • Prepare input files in the exact expected directory structure
  • GPU with 8 GB+ VRAM required
  • 30 to 60 minutes of setup time

API Setup

  • pip install scirouter (30 seconds)
  • One environment variable for authentication
  • Send JSON, receive JSON — no file format gymnastics
  • Works from any machine, including serverless functions

Visualizing Docking Results

The best way to evaluate docking results is visual inspection. Open the protein PDB and the top ligand pose PDB together in PyMOL:

Open in PyMOL
pymol protein.pdb pose_1.pdb

In PyMOL, show the protein as a cartoon representation and the ligand as sticks. Check whether the ligand sits in a plausible binding pocket and whether key interactions (hydrogen bonds, hydrophobic contacts) are present. For Jupyter notebooks, use NGLview or py3Dmol for inline 3D rendering.
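
The representation steps described above can also be scripted so that every pose opens with the same view. The sketch below writes a PyMOL command script from Python. The function name, the object names target and ligand, and the output file name view_pose.pml are arbitrary choices for illustration; the PyMOL commands themselves (load, hide, show, distance, zoom) are standard.

```python
def pymol_script(protein_file="protein.pdb", ligand_file="pose_1.pdb"):
    """Build a PyMOL command script: protein as cartoon, ligand as sticks."""
    commands = [
        "load {}, target".format(protein_file),   # name the protein object "target"
        "load {}, ligand".format(ligand_file),    # name the pose object "ligand"
        "hide everything",
        "show cartoon, target",
        "show sticks, ligand",
        # mode=2 asks PyMOL to draw polar contacts between the two selections.
        "distance polar_contacts, target, ligand, mode=2",
        "zoom ligand, 8",                         # frame the ligand with an 8 angstrom margin
    ]
    return "\n".join(commands) + "\n"


if __name__ == "__main__":
    with open("view_pose.pml", "w") as f:
        f.write(pymol_script())
```

Running pymol view_pose.pml then opens both files with the view already set up. The polar contacts PyMOL draws are a geometric heuristic, so they are a prompt for inspection rather than proof of a hydrogen bond.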

Next Steps

Now that you can dock molecules programmatically, combine docking with other SciRouter tools for a complete drug discovery workflow: screen compounds for drug-likeness with ADMET screening, or predict protein structures with ESMFold before docking.

Sign up at scirouter.ai/register for 500 free credits and start docking molecules in minutes.

Frequently Asked Questions

What inputs does DiffDock need?

DiffDock requires a protein structure (PDB format) and a ligand molecule (SMILES string). The API accepts both as part of a single JSON request. If you do not have a PDB file, you can first predict the structure with ESMFold via SciRouter.
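
To make the single-request idea concrete, here is a sketch of how such a JSON body could be assembled. The field names are an assumption: they simply mirror the SDK keyword arguments used in this guide (protein_pdb, ligand_smiles, num_poses), and the actual wire format may differ. The function name is hypothetical.

```python
import json


def build_docking_payload(protein_pdb, ligand_smiles, num_poses=5):
    """Serialize one docking request as a JSON string.

    Assumption: field names mirror the SDK keyword arguments; the real
    request schema is not documented here and may differ.
    """
    if num_poses < 1:
        raise ValueError("num_poses must be at least 1")
    return json.dumps({
        "protein_pdb": protein_pdb,      # full PDB text; newlines are escaped by json
        "ligand_smiles": ligand_smiles,
        "num_poses": num_poses,
    })
```

The point of the sketch is that the multi-line PDB text travels as an ordinary JSON string, so no separate file upload is needed.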

How is DiffDock different from AutoDock Vina?

AutoDock Vina uses physics-based scoring functions and requires you to define a search box around the binding site. DiffDock uses a diffusion generative model that predicts the binding pose directly, without a predefined search box. DiffDock is often more accurate for blind docking where the binding site is unknown.

How long does a DiffDock prediction take?

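Most predictions complete in 15 to 45 seconds on an A100 GPU; complex protein-ligand pairs may take up to 90 seconds. At the HTTP level the API returns a job ID immediately, so your application is not blocked during inference. The SDK call shown in Step 2 wraps this for you: it polls the job and returns once the result is ready.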
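
If you work with job IDs directly rather than through the SDK, the waiting logic is a plain polling loop. The sketch below shows the general pattern only. get_status is a hypothetical callable that you would supply, and the status values pending, done, and failed are assumed names, not the documented API. The default timeout of 120 seconds leaves some margin over the 90 second upper bound mentioned above.

```python
import time


def wait_for_job(get_status, job_id, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll get_status(job_id) until the job finishes or the timeout passes.

    `get_status` is a hypothetical callable returning a dict with a "status"
    key of "pending", "done", or "failed"; real field names may differ.
    """
    waited = 0.0
    while True:
        job = get_status(job_id)
        if job["status"] == "done":
            return job
        if job["status"] == "failed":
            raise RuntimeError("job {} failed".format(job_id))
        if waited >= timeout:
            raise TimeoutError("job {} not finished after {:.0f} s".format(job_id, timeout))
        sleep(interval)
        waited += interval
```

Passing sleep as a parameter keeps the loop easy to test, since a test can substitute a function that returns immediately.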

Can I dock multiple ligands against the same protein?

Yes. Submit separate docking jobs for each ligand, each with the same PDB structure and a different SMILES string. Use concurrent requests to parallelize across ligands. This is the standard virtual screening workflow.

What does the confidence score mean?

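DiffDock returns a confidence score for each predicted binding pose. Higher scores indicate the model is more confident that the pose is close to the true binding mode. Use confidence to rank poses of the same ligand when multiple are returned. The score is not a binding affinity estimate, so treat rankings across different ligands as a rough triage rather than a measure of binding strength.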
