Frequently Asked Questions

What LangChain version do I need?

You need LangChain v0.2 or later. The examples in this guide use the langchain-core and langchain packages. Install with pip install langchain langchain-openai.

Can I use other LLMs besides OpenAI?

Yes. LangChain supports dozens of LLM providers including Anthropic Claude, Google Gemini, Mistral, and local models via Ollama. Swap the ChatOpenAI class for your preferred provider — the SciRouter tool definitions remain the same.

How do I handle async folding jobs?

Protein folding and molecular docking are GPU-intensive and return asynchronously. The SciRouter tool wrapper in this guide includes a polling loop that waits for job completion before returning results to the agent. You can adjust the polling interval and timeout to suit your needs.
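
As a rough sketch of what adjusting the polling interval and timeout could look like, the helper below polls a status callable until the job finishes or a deadline passes. The names poll_until_done and fetch_status and the default values are hypothetical, chosen for illustration; they are not part of SciRouter or LangChain. The status strings "completed" and "failed" follow the tool code in this guide.

```python
import time
from typing import Callable


def poll_until_done(fetch_status: Callable[[], dict],
                    interval: float = 3.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status() until the job completes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        # Stop if waiting one more interval would pass the deadline.
        if time.monotonic() + interval > deadline:
            return {"status": "failed",
                    "error": f"timed out after {timeout} seconds"}
        sleep(interval)


# Simulated job that reports "running" twice and then completes.
responses = iter([{"status": "running"},
                  {"status": "running"},
                  {"status": "completed"}])
print(poll_until_done(lambda: next(responses), interval=0.01, timeout=5.0))
```

In a real tool, fetch_status would wrap the GET request against the job URL. Passing it in as a callable keeps the waiting logic testable without network access.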

Is this production-ready?

The patterns shown here are suitable for research prototypes and internal tools. For production deployments, add error handling, retry logic, structured logging, and rate limit awareness. Consider using LangChain's callback system for observability.
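
One possible shape for the retry logic and logging mentioned above is sketched here, using only the standard library. The helper call_with_retries, the logger name, and the default attempt count and delays are assumptions made for illustration, not a SciRouter or LangChain API.

```python
import logging
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger("scirouter_agent")  # hypothetical logger name


def call_with_retries(fn: Callable[[], T],
                      max_attempts: int = 4,
                      base_delay: float = 1.0,
                      retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run fn(), retrying transient failures with exponential backoff plus jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                logger.error("giving up after %d attempts: %s", attempt, exc)
                raise
            # Delay doubles on each attempt; jitter spreads out concurrent retries.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            logger.warning("attempt %d failed (%s); retrying in %.1fs",
                           attempt, exc, delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

When the wrapped call uses the requests library, retry_on would typically be set to requests.ConnectionError and requests.Timeout. Rate-limit responses (HTTP 429) could be mapped to a retryable exception in the same way.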

How much does a typical agent workflow cost?

A typical drug discovery workflow that folds one protein, calculates properties for 10 compounds, and docks 3 candidates uses roughly 15 SciRouter API credits. With the free tier providing 500 credits per month, you can run about 30 full workflows at no cost.
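
The arithmetic behind that estimate can be checked in a couple of lines. The two constants are the figures quoted above; the function name is a hypothetical helper. Because 15 credits is itself a rough figure, the exact quotient is best read as "about 30".

```python
CREDITS_PER_WORKFLOW = 15   # approximate figure quoted above
FREE_TIER_CREDITS = 500     # free-tier monthly allowance quoted above


def workflows_per_month(monthly_credits: int, credits_per_workflow: int) -> int:
    """Number of complete workflows that fit in a monthly credit budget."""
    if credits_per_workflow <= 0:
        raise ValueError("credits_per_workflow must be positive")
    return monthly_credits // credits_per_workflow


print(workflows_per_month(FREE_TIER_CREDITS, CREDITS_PER_WORKFLOW))  # 33
```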

How to Use SciRouter with LangChain: AI Agent Drug Discovery

Why AI Agents Need Access to Science Tools

Large language models can reason about biology, interpret research papers, and propose hypotheses. But they cannot compute. An LLM cannot fold a protein, calculate a binding affinity, or predict ADMET properties from first principles. To bridge this gap, you need to give your agent access to real scientific computing tools — and that is exactly what SciRouter provides.

By combining LangChain's agent framework with SciRouter's unified science API, you can build agents that accept natural language drug discovery queries and autonomously orchestrate protein folding, molecular property calculation, and molecular docking across multiple API calls.

Setting Up LangChain with SciRouter

First, install the required packages. You need LangChain for the agent framework, an LLM provider, and the requests library for calling the SciRouter API.

Install dependencies

pip install langchain langchain-openai requests

Next, define your SciRouter API tools as LangChain tool functions. Each tool wraps a SciRouter API endpoint and handles the submit-poll pattern for async jobs like protein folding and molecular docking.

Step 1: Define SciRouter Tools for LangChain

Define SciRouter tools as LangChain tools

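import requests
import time
from langchain_core.tools import tool

API_KEY = "sk-sci-your-api-key"  # replace with your own key
BASE = "https://api.scirouter.ai/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

POLL_INTERVAL = 3    # seconds between status checks
POLL_TIMEOUT = 600   # seconds before giving up on a job (illustrative default; tune as needed)

def wait_for_job(url: str) -> dict:
    """Poll a job status URL until the job completes, fails, or POLL_TIMEOUT elapses."""
    deadline = time.monotonic() + POLL_TIMEOUT
    while time.monotonic() < deadline:
        resp = requests.get(url, headers=HEADERS, timeout=30)
        resp.raise_for_status()  # surface HTTP errors instead of an opaque KeyError
        result = resp.json()
        if result["status"] in ("completed", "failed"):
            return result
        time.sleep(POLL_INTERVAL)
    return {"status": "failed", "error": f"job timed out after {POLL_TIMEOUT} seconds"}

@tool
def fold_protein(sequence: str) -> dict:
    """Predict the 3D structure of a protein from its amino acid sequence using ESMFold."""
    resp = requests.post(f"{BASE}/proteins/fold",
        headers=HEADERS,
        json={"sequence": sequence, "model": "esmfold"},
        timeout=30)
    resp.raise_for_status()
    job_id = resp.json()["job_id"]
    result = wait_for_job(f"{BASE}/proteins/fold/{job_id}")
    if result["status"] == "failed":
        return {"error": result["error"]}
    # Return only a preview of the PDB so the structure does not flood the agent's context
    return {"pdb": result["pdb"][:200] + "...", "mean_plddt": result["mean_plddt"]}

@tool
def calculate_properties(smiles: str) -> dict:
    """Calculate molecular properties (weight, logP, TPSA, etc.) from a SMILES string."""
    # Synchronous endpoint: no polling needed
    resp = requests.post(f"{BASE}/chemistry/properties",
        headers=HEADERS, json={"smiles": smiles}, timeout=30)
    resp.raise_for_status()
    return resp.json()

@tool
def dock_molecule(smiles: str, protein_pdb: str) -> dict:
    """Dock a small molecule (SMILES) against a protein structure (PDB) using DiffDock."""
    resp = requests.post(f"{BASE}/docking/diffdock",
        headers=HEADERS,
        json={"ligand_smiles": smiles, "protein_pdb": protein_pdb},
        timeout=30)
    resp.raise_for_status()
    job_id = resp.json()["job_id"]
    result = wait_for_job(f"{BASE}/docking/{job_id}")
    if result["status"] == "failed":
        return {"error": result["error"]}
    # Summarize the result: overall confidence and the number of poses returned
    return {"confidence": result["confidence"], "poses": len(result["poses"])}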

Tip

The fold_protein and dock_molecule tools include polling loops because these are GPU-intensive operations that run asynchronously. The agent waits for completion before proceeding to the next step.
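
One consequence of the tool design above: fold_protein returns only the first 200 characters of the PDB. That keeps the agent's context small, but it is not enough input for dock_molecule. A possible workaround, sketched here under assumptions, is to keep the full structure in a local store and hand the agent a short handle instead. StructureStore and its methods are hypothetical helpers, not part of SciRouter or LangChain.

```python
import hashlib
from typing import Dict


class StructureStore:
    """In-memory store mapping short handles to full PDB text (hypothetical helper)."""

    def __init__(self) -> None:
        self._structures: Dict[str, str] = {}

    def put(self, pdb_text: str) -> str:
        """Save a structure and return a short, deterministic handle for it."""
        digest = hashlib.sha256(pdb_text.encode("utf-8")).hexdigest()
        handle = "pdb-" + digest[:12]
        self._structures[handle] = pdb_text
        return handle

    def get(self, handle: str) -> str:
        """Return the full PDB text for a handle; raises KeyError if unknown."""
        return self._structures[handle]

    def summary(self, handle: str, preview_chars: int = 200) -> dict:
        """Compact description that is small enough to return to the agent."""
        text = self._structures[handle]
        return {"handle": handle,
                "preview": text[:preview_chars],
                "length": len(text)}
```

With a store like this, fold_protein could call put() on the full PDB and return the summary, while dock_molecule could accept the handle and call get() before submitting the docking job.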

Step 2: Build the Drug Discovery Agent

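With tools defined, create a LangChain agent that can reason about which tools to call and in what order. The agent built here with create_tool_calling_agent relies on the model's native tool calling rather than a text-based ReAct prompt, but it follows the same loop: call a tool, observe the result, then decide the next step of the drug discovery workflow.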

Create a drug discovery agent

from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o", temperature=0)
tools = [fold_protein, calculate_properties, dock_molecule]

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a drug discovery research assistant with access
to scientific computing tools. Break complex requests into steps:
1. Identify what computations are needed
2. Call the appropriate tools in logical order
3. Interpret results and provide actionable insights"""),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Run a drug discovery query
result = executor.invoke({
    "input": "Fold the insulin B-chain (FVNQHLCGSHLVEALYLVCGERGFFYTPKT) "
             "and tell me about its predicted structure quality."
})
print(result["output"])

Step 3: Full Pipeline — Multi-Step Drug Discovery

The real power emerges when the agent orchestrates multiple tool calls in sequence. Here is an example that takes a natural language query and runs a multi-step screening workflow: property calculation for each candidate, a drug-likeness comparison, and a recommendation.

Full multi-step pipeline

result = executor.invoke({
    "input": """I'm researching COX-2 inhibitors. Please:
1. Calculate molecular properties for these candidates:
   - CC(=O)Oc1ccccc1C(=O)O (aspirin)
   - Cc1ccc(-c2cc(C(F)(F)F)nn2-c2ccc(S(N)(=O)=O)cc2)cc1 (celecoxib)
2. Compare their drug-likeness (Lipinski rules)
3. Recommend which compound is more promising and explain why"""
})
print(result["output"])

Note

The agent will automatically call calculate_properties for each compound, compare the results against Lipinski's Rule of Five, and provide a reasoned recommendation — all from a single natural language prompt.
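
For reference, the Rule of Five comparison the agent performs can be written down directly. This is a minimal sketch: the thresholds are the standard Lipinski limits, but the dictionary keys are hypothetical, since the SciRouter response schema is not shown in this guide. The aspirin values are approximate and included only for illustration.

```python
from typing import List


def lipinski_violations(props: dict) -> List[str]:
    """Return the Rule of Five criteria that a compound violates."""
    checks = [
        ("molecular weight > 500 Da", props["molecular_weight"] > 500),
        ("logP > 5", props["logp"] > 5),
        ("H-bond donors > 5", props["h_bond_donors"] > 5),
        ("H-bond acceptors > 10", props["h_bond_acceptors"] > 10),
    ]
    return [label for label, violated in checks if violated]


def is_drug_like(props: dict, max_violations: int = 1) -> bool:
    """The rule is commonly applied as: at most one violation is acceptable."""
    return len(lipinski_violations(props)) <= max_violations


# Approximate values for aspirin; key names are assumptions.
aspirin = {"molecular_weight": 180.16, "logp": 1.2,
           "h_bond_donors": 1, "h_bond_acceptors": 4}
print(lipinski_violations(aspirin), is_drug_like(aspirin))
```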

Agent Architectures: ReAct vs Plan-and-Execute

LangChain supports multiple agent architectures. For science workflows, two patterns stand out:

ReAct (Reasoning + Acting)

The agent alternates between thinking and tool calls. It observes each result before deciding the next action. This works well for exploratory workflows where later steps depend on earlier results — for example, choosing which compounds to dock based on property calculations.

Plan-and-Execute

The agent creates a complete plan upfront, then executes each step. This is more efficient for well-defined pipelines where the steps are known in advance. However, it is less flexible when intermediate results should change the plan.

Use ReAct for open-ended discovery tasks where the agent needs to adapt based on results
Use Plan-and-Execute for standardized screening pipelines with predictable steps
Use ReAct with tool retries when dealing with async jobs that may fail or time out
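
The difference between the two control flows can be illustrated with a toy sketch that uses stub tools and a scripted policy in place of an LLM. Everything here (run_react, run_plan_and_execute, the stub tools, the decide policy) is hypothetical and is not how LangChain implements either architecture.

```python
from typing import Callable, Dict, List, Optional, Tuple

Tool = Callable[[str], str]
Step = Tuple[str, str]  # (tool name, tool input)


def run_react(decide: Callable[[List[str]], Optional[Step]],
              tools: Dict[str, Tool], max_steps: int = 10) -> List[str]:
    """ReAct-style loop: choose one action, observe its result, then decide again."""
    observations: List[str] = []
    for _ in range(max_steps):
        step = decide(observations)   # the policy sees every observation so far
        if step is None:              # the policy decides the task is finished
            break
        name, arg = step
        observations.append(tools[name](arg))
    return observations


def run_plan_and_execute(plan: List[Step], tools: Dict[str, Tool]) -> List[str]:
    """Plan-and-Execute: the full step list is fixed before any tool runs."""
    return [tools[name](arg) for name, arg in plan]


ASPIRIN = "CC(=O)Oc1ccccc1C(=O)O"

# Stub tools returning canned strings; no real chemistry happens here.
stub_tools: Dict[str, Tool] = {
    "properties": lambda smiles: "passes filter",
    "dock": lambda smiles: f"docked {smiles}",
}


def decide(observations: List[str]) -> Optional[Step]:
    """Toy policy: dock only if the property check passed."""
    if not observations:
        return ("properties", ASPIRIN)
    if observations[-1] == "passes filter":
        return ("dock", ASPIRIN)
    return None


print(run_react(decide, stub_tools))
print(run_plan_and_execute([("properties", ASPIRIN), ("dock", ASPIRIN)], stub_tools))
```

If the stub properties tool returned "fails filter" instead, run_react would stop after one step, while run_plan_and_execute would still run the docking step. That is the flexibility trade-off described above.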

Available Science Tools

SciRouter exposes a growing catalog of scientific computing tools that you can wire into your LangChain agents:

ESMFold — Protein structure prediction from sequence (5-30 seconds)
DiffDock — AI-powered molecular docking without predefined search boxes
Molecular Properties — Calculate drug-likeness, logP, TPSA, and more from SMILES

See the full list of available tools in the SciRouter tools catalog.

Next Steps

You now have the building blocks to create AI agents that can reason about and execute scientific computing tasks. To go further, consider adding memory so the agent retains context across sessions, integrating with vector databases for literature search, or connecting SciRouter via MCP for seamless tool discovery.

For agent-native tool discovery without writing wrapper code, see our guide on connecting SciRouter to Claude via MCP. Or sign up for a free SciRouter API key to start building today.
