Frequently Asked Questions

Can an LLM reliably predict drug binding?

Not as a replacement for a physical docking or free-energy calculation. What a chemistry-literate LLM can do is explain expected binding behavior, flag likely liabilities, and recall known SAR for related scaffolds. Use it alongside a structural predictor like Boltz-2 or DiffDock for the quantitative answer.

What should I ask TxGemma about hERG?

Ask for a structural rationale rather than a probability. A good prompt is: 'Does this scaffold have features associated with hERG liability? If yes, which functional groups are driving the concern and what structural changes would most likely mitigate it?' That prompt shape gets you an explanation you can act on, not a number you cannot verify.

How do I ask about BBB penetration?

The useful question is not 'does this cross the blood-brain barrier' but 'what properties of this molecule determine its BBB behavior and what changes would improve it'. TxGemma responds to the second prompt with specific comments about molecular weight, logP, hydrogen bond donor count, and topological polar surface area. That is what you can actually design against.

Can TxGemma predict off-target effects?

TxGemma can flag similarity to scaffolds known to hit common off-targets like kinases, GPCRs, and ion channels, and it can explain why it thinks so. It cannot give you a quantitative selectivity profile. For that you want a dedicated selectivity model or a panel of docking runs. TxGemma is the triage layer, not the final answer.

Does TxGemma know my target?

It depends on the target. Well-studied target classes such as kinases, nuclear receptors, and common enzymes are likely to be well represented in the training data. Rare or newly described targets may not be. When in doubt, provide context in the prompt: target name, UniProt ID, and a short description of the binding site. TxGemma can use that context to ground its answer.

How should I combine TxGemma with Boltz-2?

Use TxGemma to triage and Boltz-2 to quantify. For each molecule in your hit list, ask TxGemma for an expected binding rationale and any liabilities. Then run Boltz-2 on the molecules TxGemma flagged as promising. The two steps are complementary: the LLM compresses a lot of medicinal chemistry knowledge, the physical model gives you the numbers.

What is the biggest mistake people make prompting TxGemma?

Asking for numeric predictions as if TxGemma were a QSAR model. It is not. Ask it for reasoned verdicts and structural rationales, and pair those with a calibrated numeric model. Prompts like 'what is the Ki of this molecule against kinase X' produce confident-sounding nonsense. Prompts like 'what features of this scaffold would affect its Ki against kinase X' produce useful answers.

Ask an LLM About Drug Binding: The TxGemma Practical Guide

There is a practical difference between “asking an LLM about drug binding” and “using an LLM as a binding predictor.” The first is useful. The second is a mistake. This guide is about the useful version: how to prompt TxGemma so that you get the kind of structured, reasoned answer a medicinal chemist would give you at a design review.

We will cover prompts for four of the questions that come up most often in early-stage drug discovery: hERG liability, blood-brain barrier penetration, off-target effects, and synthesis. For each one there is a good way to ask and a bad way to ask, and the difference is large.

Note

TxGemma is a reasoning model. It compresses a lot of medicinal chemistry knowledge into a chat-shaped interface. It is not a calibrated numeric predictor, and it will not replace docking, alchemical free-energy perturbation, or an ADMET panel. It complements them.

The general shape of a good TxGemma prompt

Before we get to specific questions, here is the template that produces the most useful output:

Give TxGemma the SMILES string of the molecule.
Give it context: target name, assay, any relevant observations.
Ask for a verdict (categorical, not numeric) and a rationale grounded in the structure.
Ask what structural changes would most likely move the verdict in the direction you want.

This template works because it matches how TxGemma was instruction-tuned. The model has seen thousands of prompts in that shape and knows how to respond.
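The four parts of the template can be assembled programmatically. The following is a minimal sketch, not an official client: `PromptSpec` and `build_prompt` are hypothetical names introduced here for illustration, and the layout (SMILES first, context lines next, then the questions) simply mirrors the example prompts in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Inputs for one structured question about one molecule (hypothetical helper)."""
    smiles: str
    question: str          # asks for a categorical verdict plus a rationale
    change_request: str    # asks which structural changes would help
    context: dict[str, str] = field(default_factory=dict)  # target, assay, observations


def build_prompt(spec: PromptSpec) -> str:
    """Assemble the four template parts into a single prompt string."""
    lines = [f"SMILES: {spec.smiles}"]
    # Context lines such as "Primary target: ..." go directly under the SMILES.
    lines += [f"{key}: {value}" for key, value in spec.context.items()]
    lines.append("")  # blank line between the header and the questions
    lines.append(spec.question)
    lines.append(spec.change_request)
    return "\n".join(lines)


prompt = build_prompt(PromptSpec(
    smiles="CCO",  # ethanol, used only as a placeholder molecule
    context={"Primary target": "<target name>"},
    question="Does this scaffold have structural features associated with hERG "
             "liability? Give a categorical verdict (low/moderate/high concern) "
             "and explain which functional groups drive the concern.",
    change_request="Suggest two structural changes that would most likely "
                   "reduce the hERG risk.",
))
print(prompt)
```

Keeping the verdict question and the change request as separate fields makes it harder to forget either half of the template when you write a new prompt.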

Asking about hERG liability

Cardiac hERG channel blockade is one of the most common off-target liabilities that kill drug candidates in preclinical testing. A lot of scaffolds have hERG problems, and a lot of remediation strategies are well known.

Bad prompt

What is the hERG IC50 of this molecule?

Good prompt

SMILES: <your smiles here>

Does this scaffold have structural features associated with hERG
liability? Give a categorical verdict (low/moderate/high concern),
explain which functional groups or physicochemical properties are
driving the concern, and suggest two structural changes that would
most likely reduce the hERG risk.

The bad prompt asks for a number. TxGemma will produce one, and it will be wrong. The good prompt asks for a rationale and a design strategy. TxGemma will return something like: “moderate concern, driven by the basic amine and extended lipophilic core, and mitigation would come from reducing basicity or breaking coplanarity with an ortho substituent.” That answer is actionable.
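Because the good prompt asks for a verdict on a fixed three-level scale, the verdict can be pulled out of the free-text reply with a small parser. This is a rough sketch: `extract_verdict` is a hypothetical helper, and it assumes the model uses the wording "low/moderate/high concern" that the prompt requested.

```python
import re
from typing import Optional

# Matches the categorical scale requested in the hERG prompt.
_VERDICT_RE = re.compile(r"\b(low|moderate|high)\s+concern\b", re.IGNORECASE)


def extract_verdict(response: str) -> Optional[str]:
    """Return 'low', 'moderate', or 'high', or None if no verdict is stated."""
    match = _VERDICT_RE.search(response)
    return match.group(1).lower() if match else None


reply = ("Moderate concern, driven by the basic amine and extended lipophilic "
         "core. Reducing basicity would most likely help.")
print(extract_verdict(reply))            # moderate
print(extract_verdict("IC50 = 3.2 uM"))  # None
```

The parser returns the first match only, so a reply that echoes the scale from the prompt before giving its answer would be misread. Treat a `None` result as a cue to read the reply by hand rather than to retry blindly.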

Asking about BBB penetration

Central nervous system drug discovery lives and dies on the blood-brain barrier. Whether a molecule crosses the barrier is driven by a cluster of physicochemical properties — molecular weight, logP, H-bond donor count, topological polar surface area — and by active efflux via transporters like P-gp.

Bad prompt

Does this molecule cross the blood-brain barrier?

Good prompt

SMILES: <your smiles here>

I need this compound to be CNS-penetrant. Review the physicochemical
properties that drive BBB permeability for this molecule (MW, logP, HBD,
HBA, TPSA) and comment on likely P-gp efflux liability based on
structural motifs. Suggest three structural changes that would most
likely improve CNS exposure without losing target engagement.

The good prompt names the relevant properties. TxGemma then grounds its answer in those properties, tells you which are borderline, and proposes specific changes. You get a medicinal-chemistry briefing, not a yes/no answer that pretends to be certain.
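If you already have values for some of these properties from your own cheminformatics tooling, one option is to pass them in as context so the model comments on known numbers instead of estimating them. That is an assumption of this sketch, not something the guide prescribes. `bbb_prompt` is a hypothetical helper and the property values shown are placeholders.

```python
from typing import Mapping, Optional

# The properties named in the BBB prompt in this guide.
BBB_PROPERTIES = ("MW", "logP", "HBD", "HBA", "TPSA")


def bbb_prompt(smiles: str, known: Optional[Mapping[str, float]] = None) -> str:
    """Build the BBB prompt, optionally listing property values you already have."""
    lines = [f"SMILES: {smiles}"]
    for name in BBB_PROPERTIES:
        # Only include values the caller supplied; never invent a missing one.
        if known and name in known:
            lines.append(f"{name}: {known[name]}")
    lines += [
        "",
        "I need this compound to be CNS-penetrant. Review the physicochemical",
        f"properties that drive BBB permeability for this molecule ({', '.join(BBB_PROPERTIES)})",
        "and comment on likely P-gp efflux liability based on structural motifs.",
        "Suggest three structural changes that would most likely improve CNS",
        "exposure without losing target engagement.",
    ]
    return "\n".join(lines)


# Placeholder values, not measurements for any real compound.
print(bbb_prompt("<your smiles here>", known={"MW": 412.5, "TPSA": 78.0}))
```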

Asking about off-target effects

Off-target promiscuity is what generates most of the selectivity work in late-stage optimization. Common off-target families include kinases, ion channels, nuclear receptors, and CYP isoforms. A scaffold that binds one target can also bind others that recognize a similar shape or pharmacophore.

Good prompt

SMILES: <your smiles here>
Primary target: <target name>

Does this scaffold resemble known pharmacophores for any common
off-targets (kinases, GPCRs, ion channels, nuclear receptors)? If yes,
name the off-target families, explain the structural similarity, and
suggest whether a selectivity problem is likely. Do not give numeric
selectivity scores.

Notice the explicit instruction not to give numeric selectivity scores. This is a useful prompt hygiene habit. If you do not forbid numeric output, TxGemma will sometimes produce confident numbers that look like calibrated predictions but are not. Forbidding them forces the model into the explanation mode it is actually good at.
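The instruction can be backed by a check on the reply, so that a response that slips in numbers anyway is caught before anyone reads them as predictions. The sketch below is a heuristic only: `numeric_claims` is a hypothetical helper, and the unit list in the pattern is illustrative and will miss other phrasings.

```python
import re

# A number followed by a concentration unit, a percent sign, or "fold".
# The unit list is illustrative and does not cover every way to state a value.
_NUMERIC_CLAIM_RE = re.compile(
    r"\d+(?:\.\d+)?\s*(?:nM|uM|µM|μM|mM|pM|%|-?fold\b)"
)


def numeric_claims(response: str) -> list[str]:
    """Return every substring that looks like a numeric potency or selectivity claim."""
    return [m.group(0) for m in _NUMERIC_CLAIM_RE.finditer(response)]


clean = "The scaffold resembles type I kinase hinge binders; a selectivity problem is likely."
leaky = "Expect roughly 50-fold selectivity and an IC50 near 12 nM."

print(numeric_claims(clean))  # []
print(numeric_claims(leaky))  # ['50-fold', '12 nM']
```

A non-empty result is a reason to re-ask with a firmer instruction or to strip the numbers before the reply goes into a report.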

Asking about synthesis

Retrosynthesis is the classic medicinal chemistry exercise: given a target molecule, propose reasonable disconnections backwards to commercially available starting materials. TxGemma was trained on retrosynthesis datasets in the TDC instruction mixture, so it has seen thousands of examples.

Good prompt

SMILES: <your smiles here>

Propose a three-step retrosynthesis for this molecule. For each step,
name the reaction type, identify the bond being disconnected, and
comment on the expected yield and any stereochemistry concerns. Do not
use reactions that require unusual transition metal catalysts.

TxGemma will respond with something that reads like a senior graduate student working at a whiteboard. Treat it as a first draft: you still need a real chemist to confirm that the proposed route is viable, and you still need retrosynthesis software for production planning. But as a thought partner it is fast and surprisingly literate.

Composing the prompts into a triage pipeline

In practice you will want to chain these prompts into a single triage pipeline. For each hit in your list you ask about hERG, BBB, off-targets, and synthesis, and you collect the answers into a profile that your chemists can review.

Loop over your hit list.
For each molecule, fire all four prompts in parallel.
Aggregate the responses into a single profile object per molecule.
Flag the molecules with the most favorable verdicts across the four prompts.
Route the top candidates to Boltz-2 for a binding calculation and to your ADMET predictor for numeric confirmation.

This gives you a fast, cheap first-pass triage that includes a reasoning trail you can show to chemists. It is not a substitute for experimental data, but it meaningfully reduces the number of molecules that need to go to wet lab.
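The loop described above can be sketched with the standard library. Everything named here is hypothetical: `ask` stands for whatever function sends a prompt to TxGemma and returns its reply, the prompt texts are shortened stand-ins for the full prompts in this guide, and the flagging rule (no high-concern hERG verdict) is one simple assumption among many possible ones.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Shortened stand-ins for the four prompts in this guide.
PROMPTS = {
    "herg": "Give a categorical hERG verdict (low/moderate/high concern) with a structural rationale.",
    "bbb": "Review the properties that drive BBB permeability and likely P-gp efflux.",
    "off_target": "Does this scaffold resemble known off-target pharmacophores? No numeric scores.",
    "synthesis": "Propose a three-step retrosynthesis with reaction types.",
}


def profile_molecule(smiles: str, ask: Callable[[str], str]) -> dict[str, str]:
    """Send all four prompts for one molecule concurrently and collect the replies."""
    with ThreadPoolExecutor(max_workers=len(PROMPTS)) as pool:
        futures = {
            name: pool.submit(ask, f"SMILES: {smiles}\n\n{question}")
            for name, question in PROMPTS.items()
        }
        return {name: future.result() for name, future in futures.items()}


def triage(hits: list[str], ask: Callable[[str], str]) -> list[dict]:
    """Build one profile per hit and mark those without a high-concern hERG verdict."""
    profiles = []
    for smiles in hits:
        answers = profile_molecule(smiles, ask)
        for_review = "high concern" not in answers["herg"].lower()
        profiles.append({"smiles": smiles, "answers": answers, "for_review": for_review})
    return profiles


def fake_ask(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "High concern." if "CCN" in prompt else "Low concern."


for profile in triage(["CCO", "CCN"], fake_ask):
    print(profile["smiles"], profile["for_review"])
```

Threads fit here because each call mostly waits on a network response. The `for_review` field only orders the queue for chemists; every profile keeps the full text of all four replies so the reasoning trail stays available.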

Warning

Do not automate decisions off TxGemma output alone. Use it to prioritize human review, not to replace it. The model is confident about answers it should not be confident about, and the cost of being wrong on a promising molecule is higher than the cost of a chemist spending ten minutes reading a rationale.

Bottom line

Asking an LLM about drug binding is a skill. The good version of the question treats TxGemma as a chemistry-literate reasoning partner and asks for rationales and design suggestions. The bad version asks for numbers and gets nonsense. Once you internalize the shape of a good prompt, the model becomes a fast and useful triage layer on top of the physical and quantitative tools you already use.
