
How to Use SciRouter with LangChain: AI Agent Drug Discovery

Build an AI drug discovery agent with LangChain and SciRouter. Fold proteins, dock molecules, and screen compounds — all orchestrated by an LLM.

Ryan Bethencourt
March 27, 2026
10 min read

Why AI Agents Need Access to Science Tools

Large language models can reason about biology, interpret research papers, and propose hypotheses. But they cannot compute. An LLM cannot fold a protein, calculate a binding affinity, or predict ADMET properties from first principles. To bridge this gap, you need to give your agent access to real scientific computing tools — and that is exactly what SciRouter provides.

By combining LangChain's agent framework with SciRouter's unified science API, you can build agents that accept natural language drug discovery queries and autonomously orchestrate protein folding, molecular property calculation, and molecular docking across multiple API calls.

Setting Up LangChain with SciRouter

First, install the required packages. You need LangChain for the agent framework, an LLM provider, and the requests library for calling the SciRouter API.

Install dependencies
pip install langchain langchain-openai requests

Next, define your SciRouter API tools as LangChain tool functions. Each tool wraps a SciRouter API endpoint and handles the submit-poll pattern for async jobs like protein folding and molecular docking.

Step 1: Define SciRouter Tools for LangChain

Define SciRouter tools as LangChain tools
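import time

import requests
from langchain_core.tools import tool

API_KEY = "sk-sci-your-api-key"  # placeholder: substitute your own key
BASE = "https://api.scirouter.ai/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
POLL_INTERVAL_S = 3   # seconds between status checks
MAX_POLLS = 100       # stop after about five minutes instead of polling forever

@tool
def fold_protein(sequence: str) -> dict:
    """Predict the 3D structure of a protein from its amino acid sequence using ESMFold."""
    # Submit the folding job; the response carries a job ID to poll.
    resp = requests.post(f"{BASE}/proteins/fold",
        headers=HEADERS,
        json={"sequence": sequence, "model": "esmfold"},
        timeout=30)
    resp.raise_for_status()
    job_id = resp.json()["job_id"]
    for _ in range(MAX_POLLS):
        result = requests.get(f"{BASE}/proteins/fold/{job_id}",
            headers=HEADERS, timeout=30).json()
        if result["status"] == "completed":
            # Return only the first 200 characters of the PDB text to keep the tool output short.
            return {"pdb": result["pdb"][:200] + "...", "mean_plddt": result["mean_plddt"]}
        if result["status"] == "failed":
            return {"error": result["error"]}
        time.sleep(POLL_INTERVAL_S)
    return {"error": "folding job did not finish before the polling limit"}

@tool
def calculate_properties(smiles: str) -> dict:
    """Calculate molecular properties (weight, logP, TPSA, etc.) from a SMILES string."""
    # This endpoint answers synchronously, so no polling is needed.
    resp = requests.post(f"{BASE}/chemistry/properties",
        headers=HEADERS, json={"smiles": smiles}, timeout=30)
    resp.raise_for_status()
    return resp.json()

@tool
def dock_molecule(smiles: str, protein_pdb: str) -> dict:
    """Dock a small molecule (SMILES) against a protein structure (PDB) using DiffDock."""
    # Submit the docking job; the response carries a job ID to poll.
    resp = requests.post(f"{BASE}/docking/diffdock",
        headers=HEADERS,
        json={"ligand_smiles": smiles, "protein_pdb": protein_pdb},
        timeout=30)
    resp.raise_for_status()
    job_id = resp.json()["job_id"]
    for _ in range(MAX_POLLS):
        result = requests.get(f"{BASE}/docking/{job_id}",
            headers=HEADERS, timeout=30).json()
        if result["status"] == "completed":
            # Summarize the result: overall confidence and the number of poses found.
            return {"confidence": result["confidence"], "poses": len(result["poses"])}
        if result["status"] == "failed":
            return {"error": result["error"]}
        time.sleep(POLL_INTERVAL_S)
    return {"error": "docking job did not finish before the polling limit"}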
Tip
The fold_protein and dock_molecule tools include polling loops because these are GPU-intensive operations that run asynchronously. The agent waits for completion before proceeding to the next step.
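
If you want to exercise the polling logic without calling the API, one option is to factor it into a helper that receives the status-fetching function as an argument. The sketch below is an illustration under assumptions, not SciRouter client code: `poll_until_done`, `fetch_status`, and the default limits are hypothetical names and values. Only the `completed` and `failed` status strings are taken from the tools above.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval_s: float = 3.0,
    max_wait_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the job completes or fails, or max_wait_s is used up.

    fetch_status is any zero-argument callable returning a dict with a
    "status" key (a hypothetical interface mirroring the tools above).
    """
    waited = 0.0
    while True:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        if waited + interval_s > max_wait_s:
            return {"status": "failed", "error": f"timed out after {waited:.0f} s"}
        sleep(interval_s)
        waited += interval_s


# Offline demonstration: a fake job that completes on the third status check.
_responses = iter([{"status": "running"}, {"status": "running"}, {"status": "completed"}])
demo = poll_until_done(lambda: next(_responses), sleep=lambda s: None)
print(demo["status"])
```

Passing `sleep` as a parameter keeps the demonstration instant. In a real tool you would leave the default and pass a `fetch_status` that wraps the `requests.get` call.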

Step 2: Build the Drug Discovery Agent

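With tools defined, create a LangChain agent that can reason about which tools to call and in what order. The create_tool_calling_agent helper relies on the model's native tool-calling interface rather than a ReAct text prompt, but AgentExecutor runs the same act-observe loop: the agent calls a tool, reads the result, and decides the next step.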

Create a drug discovery agent
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [fold_protein, calculate_properties, dock_molecule]

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a drug discovery research assistant with access
to scientific computing tools. Break complex requests into steps:
1. Identify what computations are needed
2. Call the appropriate tools in logical order
3. Interpret results and provide actionable insights"""),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Run a drug discovery query
result = executor.invoke({
    "input": "Fold the insulin B-chain (FVNQHLCGSHLVEALYLVCGERGFFYTPKT) "
             "and tell me about its predicted structure quality."
})
print(result["output"])
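
Because the LLM supplies the tool arguments, it can help to check a sequence before a GPU job is submitted. The following is a minimal sketch of such a check. The helper name `clean_sequence` is hypothetical, and the sketch assumes only the 20 standard one-letter amino acid codes should be accepted, which may be stricter than what the folding endpoint allows.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes


def clean_sequence(raw: str) -> str:
    """Normalize a protein sequence and reject non-standard residue codes."""
    seq = "".join(raw.split()).upper()  # drop whitespace and line breaks
    if not seq:
        raise ValueError("sequence is empty")
    invalid = sorted(set(seq) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residue codes: {', '.join(invalid)}")
    return seq


# The insulin B-chain from the query above has 30 residues.
print(len(clean_sequence("FVNQHLCGSHLVEALYLVCGERGFFYTPKT")))  # 30
```

Calling a helper like this at the top of fold_protein would let the tool return a readable error to the agent instead of submitting a job that is likely to fail.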

Step 3: Full Pipeline — Multi-Step Drug Discovery

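The real power emerges when the agent orchestrates multiple tool calls in sequence. Here is an example that takes a natural language query and runs the first stages of a screening workflow: property calculation for each candidate, followed by a drug-likeness comparison. Docking the preferred compound would be a further step using the dock_molecule tool.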

Full multi-step pipeline
result = executor.invoke({
    "input": """I'm researching COX-2 inhibitors. Please:
1. Calculate molecular properties for these candidates:
   - CC(=O)Oc1ccccc1C(=O)O (aspirin)
   - Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(S(N)(=O)=O)cc2)cc1 (celecoxib)
2. Compare their drug-likeness (Lipinski rules)
3. Recommend which compound is more promising and explain why"""
})
print(result["output"])
Note
The agent will automatically call calculate_properties for each compound, compare the results against Lipinski's Rule of Five, and provide a reasoned recommendation — all from a single natural language prompt.
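
For reference, the Rule of Five comparison the agent performs can also be written as a deterministic check. This is a sketch under assumptions: the property keys below are hypothetical and would need to be mapped to whatever field names the properties endpoint actually returns, and the example values are entered by hand rather than taken from SciRouter output.

```python
# Rule of Five limits (Lipinski): each entry is (property key, upper limit).
# The property keys are hypothetical; map them to the API's actual field names.
RULE_OF_FIVE = [
    ("molecular_weight", 500.0),
    ("logp", 5.0),
    ("h_bond_donors", 5),
    ("h_bond_acceptors", 10),
]


def lipinski_violations(props: dict) -> list[str]:
    """Return the names of the Rule of Five criteria that props exceeds."""
    return [key for key, limit in RULE_OF_FIVE if props[key] > limit]


# Approximate hand-entered values for aspirin, for illustration only.
aspirin = {"molecular_weight": 180.16, "logp": 1.2, "h_bond_donors": 1, "h_bond_acceptors": 4}
# A made-up compound that exceeds every limit.
hypothetical = {"molecular_weight": 650.0, "logp": 6.2, "h_bond_donors": 6, "h_bond_acceptors": 12}

print(lipinski_violations(aspirin))       # []
print(lipinski_violations(hypothetical))  # all four keys
```

A compound is commonly treated as drug-like when it has at most one violation. Running a check like this alongside the agent gives you a way to verify the LLM's reading of the numbers.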

Agent Architectures: ReAct vs Plan-and-Execute

LangChain supports multiple agent architectures. For science workflows, two patterns stand out:

ReAct (Reasoning + Acting)

The agent alternates between thinking and tool calls. It observes each result before deciding the next action. This works well for exploratory workflows where later steps depend on earlier results — for example, choosing which compounds to dock based on property calculations.

Plan-and-Execute

The agent creates a complete plan upfront, then executes each step. This is more efficient for well-defined pipelines where the steps are known in advance. However, it is less flexible when intermediate results should change the plan.

  • Use ReAct for open-ended discovery tasks where the agent needs to adapt based on results
  • Use Plan-and-Execute for standardized screening pipelines with predictable steps
  • Use ReAct with tool retries when dealing with async jobs that may fail or time out
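
The difference between the two control flows can be sketched without an LLM. The toy below is not LangChain code: the stub tools, their made-up outputs, and the `decide` function standing in for the model are all hypothetical, chosen only to show where each pattern makes its decisions.

```python
from typing import Callable, Optional

# Stub tools standing in for real API calls (hypothetical names, fake outputs).
TOOLS: dict[str, Callable[[str], dict]] = {
    "properties": lambda smiles: {"smiles": smiles, "mw": 12.0 * len(smiles)},
    "dock": lambda smiles: {"smiles": smiles, "docked": True},
}

Step = tuple[str, str]  # (tool name, argument)


def plan_and_execute(plan: list[Step]) -> list[dict]:
    """Run a fixed list of steps in order; results never change the plan."""
    return [TOOLS[name](arg) for name, arg in plan]


def react(decide: Callable[[list[dict]], Optional[Step]], max_steps: int = 10) -> list[dict]:
    """Ask decide for the next step after every observation until it returns None."""
    observations: list[dict] = []
    for _ in range(max_steps):
        step = decide(observations)
        if step is None:
            break
        name, arg = step
        observations.append(TOOLS[name](arg))
    return observations


def decide(observations: list[dict]) -> Optional[Step]:
    """Stand-in for the LLM: dock only if the first result is under a made-up cutoff."""
    if not observations:
        return ("properties", "CCO")
    if len(observations) == 1 and observations[0]["mw"] < 500:
        return ("dock", "CCO")
    return None


fixed = plan_and_execute([("properties", "CCO"), ("dock", "CCO")])
adaptive = react(decide)
print(len(fixed), len(adaptive))
```

In `plan_and_execute` the docking step runs regardless of the property result, while in `react` it runs only because `decide` inspected the first observation and chose to continue.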

Available Science Tools

SciRouter exposes a growing catalog of scientific computing tools that you can wire into your LangChain agents:

  • ESMFold — Protein structure prediction from sequence (5-30 seconds)
  • DiffDock — AI-powered molecular docking without predefined search boxes
  • Molecular Properties — Calculate drug-likeness, logP, TPSA, and more from SMILES

See the full list of available tools on the SciRouter tools catalog.

Next Steps

You now have the building blocks to create AI agents that can reason about and execute scientific computing tasks. To go further, consider adding memory so the agent retains context across sessions, integrating with vector databases for literature search, or connecting SciRouter via MCP for seamless tool discovery.

For agent-native tool discovery without writing wrapper code, see our guide on connecting SciRouter to Claude via MCP. Or sign up for a free SciRouter API key to start building today.

Frequently Asked Questions

What LangChain version do I need?

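You need LangChain v0.2 or later. The examples in this guide use the langchain-core and langchain packages, plus langchain-openai for the model and requests for the API calls. Install with pip install langchain langchain-openai requests. Note that LangChain 1.0 moved AgentExecutor and create_tool_calling_agent into the separate langchain-classic package, so on 1.x you may need to install that package and import both names from langchain_classic.agents.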

Can I use other LLMs besides OpenAI?

Yes. LangChain supports dozens of LLM providers including Anthropic Claude, Google Gemini, Mistral, and local models via Ollama. Swap the ChatOpenAI class for your preferred provider — the SciRouter tool definitions remain the same.

How do I handle async folding jobs?

Protein folding and molecular docking are GPU-intensive and return asynchronously. The fold_protein and dock_molecule wrappers in this guide each include a polling loop that waits for job completion before returning results to the agent. You can adjust the polling interval to suit your needs, and you should cap the total wait so that a stuck job cannot block the agent indefinitely.

Is this production-ready?

The patterns shown here are suitable for research prototypes and internal tools. For production deployments, add error handling, retry logic, structured logging, and rate limit awareness. Consider using LangChain's callback system for observability.

What does a typical agent workflow cost?

A typical drug discovery workflow that folds one protein, calculates properties for 10 compounds, and docks 3 candidates uses roughly 15 SciRouter API credits. With the free tier providing 500 credits per month, you can run about 30 full workflows at no cost.
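
The budget arithmetic is simple enough to check directly. This snippet uses only the two figures quoted above and treats "roughly 15" as exactly 15, which is an approximation; actual per-call credit costs are not specified here.

```python
CREDITS_PER_WORKFLOW = 15      # "roughly 15" from the estimate above
FREE_CREDITS_PER_MONTH = 500   # free tier allowance

workflows = FREE_CREDITS_PER_MONTH // CREDITS_PER_WORKFLOW
leftover = FREE_CREDITS_PER_MONTH % CREDITS_PER_WORKFLOW
print(workflows, leftover)  # 33 5
```

That gives 33 full workflows with 5 credits to spare, consistent with the estimate of about 30 once some headroom for failed or repeated jobs is allowed.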
