
How to Predict Protein Structure via API (No GPU Needed)

Predict protein 3D structure from amino acid sequence using the SciRouter ESMFold API. No GPU, no Docker, no model weights — just pip install and three lines of Python.

Ryan Bethencourt
March 20, 2026
10 min read

The Problem: Protein Folding Requires Serious Hardware

Running ESMFold locally means downloading a multi-billion-parameter model (ESMFold is built on the 3-billion-parameter ESM-2 language model), installing PyTorch with CUDA support, provisioning a GPU with at least 16 GB of VRAM, and debugging dependency conflicts between torch, fair-esm, and openfold. On a typical workstation without an NVIDIA A100 or equivalent, inference either fails outright or takes minutes per sequence instead of seconds.

For researchers who just want a PDB file from a sequence, this setup overhead is a barrier. For software engineers integrating structure prediction into a pipeline, it is a maintenance burden. And for AI agents that need to call protein folding as a tool, it is simply not practical.

The Solution: Call ESMFold as an API

SciRouter hosts ESMFold on dedicated A100 GPUs and exposes it as a simple REST endpoint. You send an amino acid sequence, the server runs inference, and you get back a PDB structure file with per-residue confidence scores. No GPU provisioning, no model downloads, no dependency management.

Here is what the full workflow looks like: install the SDK, set your API key, and call one function. Three lines of meaningful code.

Prerequisites

You need Python 3.7 or later and a SciRouter API key. Sign up at scirouter.ai/register to get 500 free credits per month with no credit card required.

Step 1: Install the SciRouter SDK

The SDK is a thin wrapper around the REST API that handles authentication, polling for async job results, and type-safe response parsing.

Install the SDK
pip install scirouter

Step 2: Set Your API Key

Store your API key as an environment variable. The SDK reads it automatically from SCIROUTER_API_KEY so you never need to hardcode it.

Export your API key
export SCIROUTER_API_KEY="sk-sci-your-api-key-here"

Step 3: Fold a Protein in Three Lines

This is the minimal example. It sends a hemoglobin alpha chain fragment to ESMFold and saves the predicted structure as a PDB file.

Minimal protein folding (3 lines)
from scirouter import SciRouter

client = SciRouter()  # reads SCIROUTER_API_KEY from env
result = client.proteins.fold(sequence="MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH")

print(f"Confidence (pLDDT): {result.mean_plddt:.1f}")
with open("structure.pdb", "w") as f:
    f.write(result.pdb)

Tip
The SDK handles job submission and polling internally. The fold method blocks until the prediction is complete and returns a typed result object.

What You Get Back

The result object contains everything you need to work with the predicted structure:

  • result.pdb: A PDB-format string with 3D atomic coordinates for every atom in the predicted structure.
  • result.mean_plddt: The average predicted Local Distance Difference Test score across all residues, ranging from 0 to 100.
  • result.plddt_per_residue: A list of per-residue confidence scores, useful for identifying well-folded regions versus disordered loops.
  • result.job_id: A unique identifier you can use to retrieve the result later.

Note
A mean pLDDT above 70 generally indicates a reliable fold. Scores above 90 suggest the prediction is accurate at near-experimental resolution. Scores below 50 often indicate intrinsically disordered regions that do not form stable structures.
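
The thresholds in the note above can be turned into a small helper. This is a minimal sketch and not part of the SciRouter SDK: confidence_band and low_confidence_regions are hypothetical names, and sample_scores is made-up data standing in for result.plddt_per_residue. The note does not name the band between 50 and 70, so labeling it "low" is an assumption.

```python
from typing import List, Tuple


def confidence_band(plddt: float) -> str:
    """Map a pLDDT score (0-100) to a coarse label using the thresholds above."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt >= 50:
        return "low"  # assumed label for the unnamed 50-70 band
    return "likely disordered"


def low_confidence_regions(
    scores: List[float], cutoff: float = 50.0
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) residue ranges scoring below cutoff."""
    regions = []
    start = None
    for position, score in enumerate(scores, start=1):
        if score < cutoff:
            if start is None:
                start = position  # a low-confidence stretch begins here
        elif start is not None:
            regions.append((start, position - 1))
            start = None
    if start is not None:  # the sequence ended inside a low-confidence stretch
        regions.append((start, len(scores)))
    return regions


# Made-up sample scores standing in for result.plddt_per_residue
sample_scores = [42.0, 45.5, 71.2, 88.0, 93.4, 91.0, 65.3, 48.9, 40.1, 39.7]
mean_score = sum(sample_scores) / len(sample_scores)
print(f"Mean pLDDT {mean_score:.1f}: {confidence_band(mean_score)}")
print("Residue ranges below 50:", low_confidence_regions(sample_scores))
```

Reporting ranges rather than individual residues makes it easier to spot whole loops or termini that may be worth trimming before downstream analysis.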

Comparison: Local ESMFold vs API

To put the convenience in perspective, here is what running ESMFold locally requires compared to the API approach:

Local Setup

  • NVIDIA GPU with 16 GB+ VRAM (A100 recommended, consumer GPUs may run out of memory on longer sequences)
  • CUDA 11.7 or 12.x installed and configured
  • PyTorch 2.x with matching CUDA version
  • fair-esm library with ESMFold dependencies (openfold, biopython, einops)
  • Several gigabytes of model weights downloaded on first run
  • Docker or conda environment to isolate dependencies
  • 15 to 60 minutes of setup time for an experienced engineer

API Setup

  • Python 3.7+ on any machine (no GPU needed)
  • One pip install command
  • One environment variable
  • Under 2 minutes from start to first prediction

Production-Ready Example with Error Handling

The minimal example works for quick experiments, but production code should validate input, handle errors, and manage timeouts. Here is a complete example:

Production-ready protein folding
import os
import sys
from scirouter import SciRouter
from scirouter.exceptions import SciRouterError, ValidationError, TimeoutError

# Validate environment
api_key = os.environ.get("SCIROUTER_API_KEY")
if not api_key:
    print("Error: Set the SCIROUTER_API_KEY environment variable")
    sys.exit(1)

client = SciRouter(api_key=api_key)

VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")
MAX_LENGTH = 1024

def fold_protein(sequence: str) -> dict:
    """Fold a protein sequence and return structured results."""
    # Validate input
    sequence = sequence.strip().upper()
    invalid = set(sequence) - VALID_AA
    if invalid:
        raise ValueError(f"Invalid amino acids: {sorted(invalid)}")
    if len(sequence) > MAX_LENGTH:
        raise ValueError(f"Sequence too long: {len(sequence)} > {MAX_LENGTH}")
    if len(sequence) < 10:
        raise ValueError("Sequence too short: minimum 10 residues")

    # Call the API
    try:
        result = client.proteins.fold(
            sequence=sequence,
            model="esmfold",
            timeout=120,
        )
    except ValidationError as e:
        print(f"Input rejected by API: {e}")
        raise
    except TimeoutError:
        print("Prediction timed out — try a shorter sequence")
        raise
    except SciRouterError as e:
        print(f"API error: {e}")
        raise

    return {
        "pdb": result.pdb,
        "mean_plddt": result.mean_plddt,
        "residue_scores": result.plddt_per_residue,
        "job_id": result.job_id,
    }

if __name__ == "__main__":
    seq = "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH"
    output = fold_protein(seq)
    print(f"Mean pLDDT: {output['mean_plddt']:.1f}")

    with open("prediction.pdb", "w") as f:
        f.write(output["pdb"])
    print("Structure saved to prediction.pdb")

Batch Folding: Multiple Sequences

For batch processing, submit multiple fold requests concurrently. The API handles each as an independent job. Here is a pattern using Python's concurrent.futures module:

Batch fold multiple sequences
from concurrent.futures import ThreadPoolExecutor, as_completed
from scirouter import SciRouter

client = SciRouter()

sequences = {
    "hemoglobin_alpha": "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH",
    "insulin_b_chain": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",
    "lysozyme_fragment": "KVFGRCELAAALKRHGLDNYRGYSLGNWVCAAK",
}

def fold_one(name, seq):
    result = client.proteins.fold(sequence=seq)
    return name, result

results = {}
with ThreadPoolExecutor(max_workers=5) as pool:
    futures = {pool.submit(fold_one, n, s): n for n, s in sequences.items()}
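    for future in as_completed(futures):
        name = futures[future]
        try:
            _, result = future.result()
        except Exception as e:
            # One failed job should not abort the rest of the batch
            print(f"{name}: failed ({e})")
            continue
        results[name] = result
        print(f"{name}: pLDDT = {result.mean_plddt:.1f}")
        with open(f"{name}.pdb", "w") as f:
            f.write(result.pdb)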

print(f"Folded {len(results)} proteins")
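
In practice the sequences dictionary is often built from a FASTA file rather than typed by hand. The sketch below shows one way to do that with the standard library only. parse_fasta is a hypothetical helper, not an SDK function, and using the first word of each header as the record name is an assumption about how you want to label outputs.

```python
from typing import Dict


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}.

    The record ID is the first whitespace-delimited word of each header line.
    """
    records: Dict[str, str] = {}
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:].split()
            if not header:
                raise ValueError("FASTA header without an ID")
            current = header[0]
            if current in records:
                raise ValueError(f"Duplicate FASTA ID: {current}")
            records[current] = ""
        elif current is None:
            raise ValueError("Sequence data found before the first FASTA header")
        else:
            # Sequence lines may be wrapped, so append to the current record
            records[current] += line.upper()
    return records


# Inline example; in a real pipeline, read the text from a file instead
fasta_text = """>insulin_b_chain example record
FVNQHLCGSHLVEALY
LVCGERGFFYTPKT
>lysozyme_fragment
KVFGRCELAAALKRHGLDNYRGYSLGNWVCAAK
"""

sequences = parse_fasta(fasta_text)
for name, seq in sequences.items():
    print(f"{name}: {len(seq)} residues")
```

The resulting dictionary has the same shape as the sequences mapping in the batch example, so it can be passed to the same thread pool loop.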

Visualizing the Output

The PDB file you receive can be opened in any molecular viewer. For quick inspection in a Jupyter notebook, NGLview renders the structure inline:

Visualize in Jupyter
# pip install nglview
import nglview as nv

view = nv.show_file("prediction.pdb")
view.add_representation("cartoon", color="bfactor")  # color by pLDDT
view

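Coloring by B-factor maps the per-residue pLDDT values, which predicted-structure PDB files conventionally store in the B-factor column, to a color gradient. In the common AlphaFold-style convention, blue regions are high-confidence and red or orange regions are low-confidence; the exact colors depend on the viewer's color scale. Either way, you get immediate visual feedback on which parts of the structure are reliable.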
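
If you only have the PDB file and want the confidence scores back as numbers, you can read them from the B-factor column. The sketch below assumes the server writes pLDDT into that column, as is conventional for predicted structures; plddt_from_pdb is a hypothetical helper, and the three ATOM lines are made-up coordinates used only to illustrate the fixed-width layout. Check whether your file stores scores on a 0 to 100 or a 0 to 1 scale before applying thresholds.

```python
from typing import Dict


def plddt_from_pdb(pdb_text: str) -> Dict[int, float]:
    """Read the B-factor of each C-alpha atom, keyed by residue number."""
    scores: Dict[int, float] = {}
    for line in pdb_text.splitlines():
        # PDB ATOM records are fixed-width: atom name in columns 13-16,
        # residue number in columns 23-26, B-factor in columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residue_number = int(line[22:26])
            scores[residue_number] = float(line[60:66])
    return scores


# Made-up ATOM records for illustration only
sample_pdb = """\
ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00 85.20           N
ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 85.20           C
ATOM      3  CA  VAL A   2      23.112  27.001   4.105  1.00 42.75           C
"""

per_residue = plddt_from_pdb(sample_pdb)
for residue_number, score in per_residue.items():
    print(f"Residue {residue_number}: pLDDT {score:.2f}")
```

Taking one value per C-alpha atom gives a single score per residue, which lines up with a per-residue list such as result.plddt_per_residue.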

Using the REST API Directly

If you prefer not to use the SDK, you can call the REST endpoint directly with any HTTP client. This is useful for non-Python environments or when integrating into existing infrastructure:

Direct REST API call
import os
import time

import requests

API_KEY = os.environ["SCIROUTER_API_KEY"]
BASE = "https://api.scirouter.ai/v1"
headers = {"Authorization": f"Bearer {API_KEY}"}

# Submit the folding job
resp = requests.post(
    f"{BASE}/proteins/fold",
    headers=headers,
    json={"sequence": "MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSH", "model": "esmfold"},
    timeout=30,
)
resp.raise_for_status()  # fail early on 4xx/5xx responses such as a 422 validation error
job_id = resp.json()["job_id"]

# Poll until complete, giving up after two minutes
deadline = time.monotonic() + 120
while True:
    poll = requests.get(f"{BASE}/proteins/fold/{job_id}", headers=headers, timeout=30)
    poll.raise_for_status()
    result = poll.json()
    if result["status"] == "completed":
        break
    if result["status"] == "failed":
        raise RuntimeError(result.get("error", "Unknown error"))
    if time.monotonic() > deadline:
        raise TimeoutError(f"Job {job_id} did not finish within 120 seconds")
    time.sleep(3)

print(f"pLDDT: {result['mean_plddt']:.1f}")
with open("structure.pdb", "w") as f:
    f.write(result["pdb"])

When to Use ESMFold vs Other Models

ESMFold is the right choice when you need fast, single-chain structure prediction and do not require multi-chain complex modeling. Here is a quick decision guide:

  • ESMFold: Single chains, fast turnaround (5-15s), no MSA needed. Best for screening, pipelines, and quick checks.
  • Boltz-2: Multi-chain complexes, protein-ligand interactions, protein-DNA/RNA. Slower but handles complex inputs.
  • AlphaFold2: Highest accuracy single-chain prediction when you can wait for MSA computation. Not available via SciRouter yet.

Next Steps

Now that you can predict protein structures programmatically, explore related tools on SciRouter. Use ESMFold for structure prediction, then feed the PDB output into molecular docking with DiffDock or design optimized sequences with ProteinMPNN.

Sign up for a free SciRouter API key at scirouter.ai/register and start predicting protein structures in under two minutes. No GPU required.

Frequently Asked Questions

Do I need a GPU to predict protein structure with this API?

No. SciRouter runs ESMFold on A100 GPUs in the cloud. Your code runs on any machine — a laptop, a CI server, or a Raspberry Pi. You send the sequence over HTTPS and get the structure back.

How accurate is ESMFold compared to AlphaFold2?

ESMFold achieves a median GDT-TS within 5 points of AlphaFold2 on CASP15 targets for single-chain proteins. It is less accurate on multi-domain proteins and complexes, but it can be up to 60x faster on short sequences because it does not require a multiple sequence alignment step.

What is the maximum sequence length the API accepts?

The SciRouter ESMFold endpoint accepts sequences up to 1024 amino acid residues. Sequences longer than 1024 residues will receive a 422 validation error. For longer proteins, consider splitting into domains or using Boltz-2.
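
When no domain annotation is available, one fallback is to cut the sequence into overlapping fixed-size windows that each fit under the limit. The sketch below is a crude stand-in for real domain splitting: split_into_windows is a hypothetical helper, the 128-residue overlap is an arbitrary assumption, and fragments folded independently lose any contacts with the rest of the protein. Prefer known domain boundaries when you have them.

```python
from typing import List, Tuple

MAX_LENGTH = 1024  # endpoint limit stated above


def split_into_windows(
    sequence: str, window: int = MAX_LENGTH, overlap: int = 128
) -> List[Tuple[int, str]]:
    """Split a sequence into overlapping windows.

    Returns (1-based start position, fragment) pairs, each fragment at most
    `window` residues long.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("Require window > 0 and 0 <= overlap < window")
    if len(sequence) <= window:
        return [(1, sequence)]  # already short enough, no split needed

    step = window - overlap
    windows = []
    start = 0
    while True:
        end = min(start + window, len(sequence))
        windows.append((start + 1, sequence[start:end]))
        if end == len(sequence):
            break
        start += step
    return windows


# Synthetic 2500-residue placeholder, not a real protein
long_sequence = "ACDEFGHIKLMNPQRSTVWY" * 125
fragments = split_into_windows(long_sequence)
for start, fragment in fragments:
    print(f"Window starting at residue {start}: {len(fragment)} residues")
```

Keeping the start position alongside each fragment lets you map per-residue scores from each prediction back onto the numbering of the full-length protein.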

How much does it cost?

Free tier accounts receive 500 credits per month with no credit card required. Each ESMFold prediction costs 1 credit. Pro tier accounts have higher limits and priority queuing.

Can I use this in a production pipeline?

Yes. The API returns standard PDB files and JSON metadata. It supports concurrent requests, has a 99.9 percent uptime SLA on the Pro tier, and returns results in 5 to 30 seconds depending on sequence length.
