Agentic Drug Discovery with TxGemma and MCP

How to build an autonomous drug discovery agent using TxGemma through the MCP server, chained with ADMET and Boltz-2.

SciRouter Team
April 11, 2026

Agentic drug discovery is the pattern where an LLM orchestrates a pipeline of scientific tools instead of a human gluing them together. The agent reads a question, decides which tools to call, calls them, reads the outputs, and iterates. For this pattern to work, the agent needs a way to discover and invoke tools without bespoke per-tool glue code. The Model Context Protocol (MCP) provides it.

This tutorial shows how to build an agentic drug-discovery pipeline on top of TxGemma and other SciRouter tools using MCP. The target workflow: given a protein target and a hit list, the agent uses TxGemma to triage, Boltz-2 to validate structurally, and an ADMET model to profile developability, then returns a ranked shortlist.

Note
If you are new to the Model Context Protocol, think of it as function calling that is portable across agent frameworks. You run an MCP server, it advertises a set of tools, and any MCP-aware agent can call them.

The agent architecture

A minimal agentic drug-discovery loop has four parts:

  • The agent. A general-purpose LLM like Claude or GPT that orchestrates tool calls. It is the conductor, not the specialist.
  • The MCP server. A SciRouter MCP endpoint that exposes scientific tools with typed schemas. The agent connects here once.
  • The scientific models. TxGemma, Boltz-2, DiffDock, ESMFold, ADMET predictors, and the rest of the SciRouter catalog. Each is a tool the agent can call.
  • The human. You are still in the loop. The agent proposes, the human reviews, the human decides what goes to wet lab.

The SciRouter MCP server lives at mcp.scirouter.ai and advertises all the scientific tools we host. Your agent only needs to know about that one endpoint.

Connecting an agent to the SciRouter MCP server

The exact connection code depends on which agent framework you are using. The pattern is the same: point the agent at the MCP URL, pass your SciRouter API key, and let the agent discover tools. Here is the shape of a minimal Python connection using the official MCP SDK:

python
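import asyncio
import os

from mcp import ClientSession
from mcp.client.sse import sse_client

# Read the key from the environment rather than hard-coding it.
API_KEY = os.environ["SCIROUTER_API_KEY"]
MCP_URL = "https://mcp.scirouter.ai/sse"


async def connect_and_list_tools():
    # Open the SSE transport, sending the API key as a bearer token.
    async with sse_client(
        MCP_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
    ) as (read, write):
        async with ClientSession(read, write) as session:
            # Complete the MCP handshake before issuing any requests.
            await session.initialize()
            tools = await session.list_tools()
            for t in tools.tools:
                print(f"- {t.name}: {t.description}")


if __name__ == "__main__":
    asyncio.run(connect_and_list_tools())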

Running that gives you the tool catalog. You will see TxGemma listed alongside Boltz-2, DiffDock, ESMFold, and the ADMET endpoints. Each tool has a typed schema so the agent knows exactly what inputs to send.
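Because each tool advertises a JSON Schema for its inputs, a client can sanity-check a draft call before sending it. The sketch below is one way to do that, not SciRouter's own validation. The schema is hypothetical: the field names `smiles` and `prompt` are illustrative and are not taken from the SciRouter catalog. With the MCP Python SDK, the real schema for each tool is available as `tool.inputSchema` on the entries returned by `list_tools()`.

```python
# Hypothetical input schema, shaped like the JSON Schema an MCP tool advertises.
# The field names are assumptions made for illustration only.
EXAMPLE_TXGEMMA_SCHEMA = {
    "type": "object",
    "properties": {
        "smiles": {"type": "string"},
        "prompt": {"type": "string"},
    },
    "required": ["smiles", "prompt"],
}

# Shallow mapping from JSON Schema primitive types to Python types.
JSON_TO_PYTHON = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def check_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks well-formed."""
    problems = []
    # Every required field must be present.
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required field: {name}")
    # Every supplied field must be declared and have the declared primitive type.
    for name, value in arguments.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            # Conservative choice: flag fields the schema does not declare.
            problems.append(f"unexpected field: {name}")
            continue
        expected = JSON_TO_PYTHON.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems
```

Once the list of problems is empty, the arguments can be passed to `session.call_tool(name, arguments)` on a live session. This check covers only required fields and top-level primitive types; a full JSON Schema validator would be the more thorough option.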

The triage loop

Here is the loop at the heart of the pipeline, written in natural language for the agent. In practice you would put this in the agent's system prompt.

text
You are a drug-discovery triage agent. You have access to SciRouter
tools through MCP: txgemma, boltz2, diffdock, admet_panel.

Given a target protein and a list of candidate SMILES:

1. For each candidate, call txgemma with a structured ADMET and hERG
   reasoning prompt. Record the rationale.
2. Filter out candidates with high hERG or poor absorption verdicts.
3. For surviving candidates, call boltz2 with the target and the
   candidate. Record the predicted binding pose and score.
4. For the top 5 by score, call admet_panel to get calibrated
   numeric predictions.
5. Return a ranked shortlist with links to the supporting evidence.

If any call fails, retry once. If it fails twice, skip the
candidate and note it in the final report.

That prompt gives the agent enough structure to drive the pipeline on its own. A good agent will produce a shortlist plus the reasoning trail; a weaker one will need more hand-holding. Either way, the workflow is written down in one place and can be rerun on the next hit list.
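The five steps in the prompt can also be written as ordinary control flow, which is useful for testing the logic against stub tools before handing it to an agent. The sketch below makes several assumptions: each tool is an injected callable that takes a SMILES string (the target is assumed to be bound into the `boltz2` callable already), the verdict fields `herg`, `absorption`, and `rationale` are hypothetical names, and a higher `score` is assumed to be better.

```python
from typing import Callable, Optional


def call_with_retry(tool: Callable[[str], dict], smiles: str) -> Optional[dict]:
    """Call a tool, retrying once on failure; return None if both attempts fail."""
    for _attempt in range(2):
        try:
            return tool(smiles)
        except Exception:
            continue
    return None


def triage(candidates, txgemma, boltz2, admet_panel, top_n=5):
    """Run the prompt's five steps with injected tool callables."""
    skipped, scored = [], []
    for smiles in candidates:
        # Step 1: ask TxGemma for a verdict and rationale.
        verdict = call_with_retry(txgemma, smiles)
        if verdict is None:
            skipped.append(smiles)
            continue
        # Step 2: drop candidates with high hERG or poor absorption verdicts.
        if verdict["herg"] == "high" or verdict["absorption"] == "poor":
            continue
        # Step 3: structural check on the survivors.
        pose = call_with_retry(boltz2, smiles)
        if pose is None:
            skipped.append(smiles)
            continue
        scored.append(
            {"smiles": smiles, "score": pose["score"], "rationale": verdict["rationale"]}
        )
    # Assumption: a higher score means a better predicted binder.
    scored.sort(key=lambda row: row["score"], reverse=True)
    shortlist = []
    # Step 4: numeric ADMET predictions for the top candidates only.
    for row in scored[:top_n]:
        admet = call_with_retry(admet_panel, row["smiles"])
        if admet is None:
            skipped.append(row["smiles"])
            continue
        shortlist.append({**row, "admet": admet})
    # Step 5: ranked shortlist plus the candidates that were skipped after two failures.
    return {"shortlist": shortlist, "skipped": skipped}
```

A hard-coded loop like this gives up the agent's flexibility, but it makes the filter thresholds and the retry policy explicit and easy to unit test.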

Chaining TxGemma and Boltz-2

The most interesting part of this loop is the handoff from TxGemma to Boltz-2. TxGemma reasons about which scaffolds deserve a structural calculation. Boltz-2 then runs the calculation. Neither tool is enough alone; the combination is what makes the triage both affordable and structurally informed.

  • TxGemma is fast and cheap. You can run it on hundreds of candidates in a few minutes.
  • Boltz-2 is slow and expensive. You want to run it on the best candidates only.
  • Letting the agent use TxGemma as a filter keeps Boltz-2 GPU time focused where it matters.

If you try to run Boltz-2 on every molecule in a hit list, you will exhaust your GPU budget quickly. If you run only TxGemma, you will miss the structural information that tells you whether a molecule actually fits the pocket. The combination is the right default.
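The budget argument is simple arithmetic, sketched below. The inputs are placeholders chosen only to show the calculation; they are not measured SciRouter timings or pass rates, and the function name is made up for this example.

```python
def boltz2_gpu_minutes(n_candidates: int, pass_rate: float, minutes_per_run: float) -> float:
    """GPU minutes spent on Boltz-2 when only a fraction of candidates pass the filter."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must be between 0 and 1")
    return n_candidates * pass_rate * minutes_per_run


# Placeholder inputs, not measured values.
unfiltered = boltz2_gpu_minutes(500, 1.0, 2.0)   # every hit goes to Boltz-2
filtered = boltz2_gpu_minutes(500, 0.25, 2.0)    # one in four survives TxGemma triage
print(unfiltered, filtered)
```

Substitute your own hit-list size, observed pass rate, and per-run time to see how much Boltz-2 time the filter saves.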

Hooking the agent up to Claude or GPT

Claude has native MCP client support. You configure it with the SciRouter MCP URL in your Claude client settings and it will discover the tools automatically. From there, any conversation that mentions drug discovery can trigger tool calls without you writing any glue code.

For GPT, use an MCP-to-OpenAI adapter (several exist) that turns the MCP tool schemas into OpenAI function-calling schemas. The net effect is the same: the agent sees the SciRouter tools in its tool list and calls them as part of normal reasoning.
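As a rough sketch of the core of such an adapter: an MCP tool carries a name, a description, and a JSON Schema for its inputs, and the OpenAI function-calling format wraps the same three pieces in a `function` object. The function name and the example tool below are hypothetical. A real adapter also has to route the model's function calls back to the MCP server, which is omitted here.

```python
from typing import Optional


def mcp_tool_to_openai(name: str, description: Optional[str], input_schema: dict) -> dict:
    """Wrap one MCP tool definition in the OpenAI function-calling tool format."""
    return {
        "type": "function",
        "function": {
            "name": name,
            # MCP descriptions are optional, so fall back to an empty string.
            "description": description or "",
            # The MCP input schema is already JSON Schema, so it passes through as-is.
            "parameters": input_schema,
        },
    }


# Hypothetical tool definition; the field names are illustrative only.
example = mcp_tool_to_openai(
    "txgemma",
    "Therapeutics reasoning model",
    {"type": "object", "properties": {"smiles": {"type": "string"}}, "required": ["smiles"]},
)
```

With the MCP Python SDK, the three arguments would come from `tool.name`, `tool.description`, and `tool.inputSchema` for each entry returned by `list_tools()`.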

For LangChain or other agent frameworks, use the MCP client libraries listed in the MCP documentation. All of them expose the same pattern.

Warning
Agentic pipelines can burn through tokens and GPU hours quickly. Always run a new agent against a small test set first. Measure how many tool calls it makes per question. Set rate limits at the SciRouter key level before letting the agent loose on your full hit list.

Cost control

Three controls keep agentic pipelines from blowing up your budget:

  • Per-key rate limits. SciRouter enforces monthly quotas and short-window rate limits. Start restrictive and raise as needed.
  • Tool-level caching. If the same SMILES hits the same tool twice, return the cached result. SciRouter caches common responses automatically.
  • Early exits. Tell the agent in its system prompt to stop as soon as it has enough evidence to answer the question. Many agent loops burn tokens on redundant confirmation calls.
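On the client side, the caching and budget ideas can be sketched as a thin wrapper around any tool callable. This is an illustrative in-memory version and does not describe SciRouter's own cache. `CachedTool` is a hypothetical name, and the cache key is the raw SMILES string, so two different spellings of the same molecule would not share an entry unless you canonicalize them first.

```python
class CachedTool:
    """Wrap a tool callable with an in-memory cache and a hard call budget."""

    def __init__(self, name, func, max_calls):
        self.name = name
        self.func = func
        self.max_calls = max_calls
        self.calls_made = 0
        self._cache = {}

    def __call__(self, smiles: str):
        # A repeated SMILES is served from the cache and does not use budget.
        if smiles in self._cache:
            return self._cache[smiles]
        # Refuse new calls once the budget is spent.
        if self.calls_made >= self.max_calls:
            raise RuntimeError(f"{self.name}: call budget of {self.max_calls} exhausted")
        # Count the attempt before calling, so failed calls also use budget.
        self.calls_made += 1
        result = self.func(smiles)
        self._cache[smiles] = result
        return result
```

Wrapping each tool once, for example `txgemma = CachedTool("txgemma", raw_txgemma, max_calls=200)` with a hypothetical `raw_txgemma` callable, gives a hard ceiling on spend that holds even if the agent ignores the early-exit instruction in its prompt.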

Bottom line

MCP turns SciRouter into an agent-native platform. Any MCP-aware LLM can discover TxGemma and the other tools, call them with typed inputs, and chain them into a full drug-discovery triage loop. You get the reasoning of a chemistry-specialist LLM combined with the quantitative output of physical tools, in a pipeline the agent orchestrates end to end.

Frequently Asked Questions

What is the Model Context Protocol?

MCP is an open protocol, introduced by Anthropic, that lets language-model agents discover and call external tools in a standard way. An MCP server exposes a set of tools with typed schemas, and an MCP-aware client (Claude natively, GPT through an adapter, and several open frameworks) can connect to the server, read the tool list, and invoke tools as part of its reasoning. For drug discovery, that means a single connection gives the agent access to all of SciRouter's scientific models.

Why use MCP for drug discovery instead of direct API calls?

Direct API calls work fine when you control both the code and the model. MCP matters when the reasoning is happening inside an LLM that you did not write. An agent running inside Claude or GPT can discover the SciRouter MCP server, see that TxGemma and Boltz-2 are available, and call them without any manual glue code on your side. The protocol is what makes scientific tools first-class citizens inside generalist agents.

What does an agentic drug discovery loop actually look like?

A simple loop: the agent receives a target and a question, calls TxGemma to reason about what scaffolds might bind, calls Boltz-2 to validate the top candidates structurally, calls an ADMET predictor to profile the survivors, and returns a ranked list. More advanced loops add retrieval, synthesis planning, and human-in-the-loop review. The point is that each call is a discrete tool invocation the agent can chain.

Do I need to run my own MCP server?

No. SciRouter hosts an MCP server at mcp.scirouter.ai that exposes all our scientific tools including TxGemma, Boltz-2, DiffDock, ESMFold, and ADMET predictors. You connect your agent to that endpoint with your API key and you are ready. Running your own server only makes sense if you want to combine SciRouter tools with proprietary internal tools.

Which agents work with MCP today?

Claude is the most mature MCP consumer, with native client support. OpenAI's assistants can call MCP servers through a small adapter. Open frameworks like LangChain, LlamaIndex, and the Anthropic Agent SDK all have MCP clients. If your agent framework does not yet support MCP, you can still call SciRouter endpoints directly through the REST API.

How do I keep agents from burning through my GPU budget?

Rate limit at the gateway, not at the agent. SciRouter enforces per-key rate limits and monthly quotas. For agentic workflows, start with a restrictive tier on a test key, profile how many tool calls the agent actually makes, and scale up. We also recommend caching: if the same SMILES hits TxGemma twice in one run, the second call should be a cache hit.

Is TxGemma fast enough for a real-time agent loop?

TxGemma 9B through SciRouter responds in a few seconds for a typical ADMET prompt. That is fast enough for an interactive agent loop with a handful of tool calls per question. For very latency-sensitive use cases, call TxGemma 2B instead of 9B. For highest quality, call 27B and tolerate the wait.
