There is a practical difference between “asking an LLM about drug binding” and “using an LLM as a binding predictor.” The first is useful. The second is a mistake. This guide is about the useful version: how to prompt TxGemma so that you get the kind of structured, reasoned answer a medicinal chemist would give you at a design review.
We will cover prompts for four of the questions that come up most often in early-stage drug discovery: hERG liability, blood-brain barrier penetration, off-target effects, and synthesis. For each one there is a good way to ask and a bad way to ask, and the difference is large.
The general shape of a good TxGemma prompt
Before we get to specific questions, here is the template that produces the most useful output:
- Give TxGemma the SMILES string of the molecule.
- Give it context: target name, assay, any relevant observations.
- Ask for a verdict (categorical, not numeric) and a rationale grounded in the structure.
- Ask what structural changes would most likely move the verdict in the direction you want.
This template works because it matches how TxGemma was instruction-tuned. The model has seen thousands of prompts in that shape and knows how to respond.
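The template above can be captured in a small helper so every prompt has the same shape. This is a minimal sketch, not part of any TxGemma tooling: `build_prompt`, `HERG_QUESTION`, and the example SMILES (aspirin, chosen arbitrarily) are hypothetical names and values introduced here for illustration.

```python
from typing import Dict, Optional


def build_prompt(smiles: str, question: str,
                 context: Optional[Dict[str, str]] = None) -> str:
    """Assemble a prompt: SMILES first, then context lines, then the question."""
    if not smiles.strip():
        raise ValueError("SMILES string must not be empty")
    lines = [f"SMILES: {smiles.strip()}"]
    # Context is free-form: target name, assay, relevant observations.
    for key, value in (context or {}).items():
        lines.append(f"{key}: {value}")
    lines.append(question.strip())
    return "\n".join(lines)


# The question asks for a categorical verdict, a rationale, and design changes.
HERG_QUESTION = (
    "Does this scaffold have structural features associated with hERG "
    "liability? Give a categorical verdict (low/moderate/high concern), "
    "explain which functional groups or physicochemical properties are "
    "driving the concern, and suggest two structural changes that would "
    "most likely reduce the hERG risk."
)

# Arbitrary example molecule and placeholder context (assumptions).
prompt = build_prompt(
    "CC(=O)Oc1ccccc1C(=O)O",
    HERG_QUESTION,
    {"Primary target": "example target"},
)
print(prompt)
```

Keeping the question text in a named constant makes it easy to reuse the same wording across a whole hit list, which keeps answers comparable between molecules.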
Asking about hERG liability
Cardiac hERG channel blockade is one of the most common off-target liabilities that kills drug candidates in preclinical testing. A lot of scaffolds have hERG problems, and a lot of remediation strategies are well known.
Bad prompt
What is the hERG IC50 of this molecule?
Good prompt
SMILES: <your smiles here>
Does this scaffold have structural features associated with hERG
liability? Give a categorical verdict (low/moderate/high concern),
explain which functional groups or physicochemical properties are
driving the concern, and suggest two structural changes that would
most likely reduce the hERG risk.
The bad prompt asks for a number. TxGemma will produce one, and it will be wrong. The good prompt asks for a rationale and a design strategy. TxGemma will return something like: “moderate concern, driven by the basic amine and extended lipophilic core, and mitigation would come from reducing basicity or breaking coplanarity with an ortho substituent.” That answer is actionable.
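Because the good prompt requests a verdict from a fixed vocabulary, the answer can be pulled out of free text with a simple pattern. The following is a rough sketch under that assumption; `extract_verdict` is a hypothetical helper, and it only recognizes the exact "low/moderate/high concern" wording the prompt asks for.

```python
import re
from typing import Optional

# Matches the categorical vocabulary requested in the prompt.
_VERDICT = re.compile(r"\b(low|moderate|high)\s+concern\b", re.IGNORECASE)


def extract_verdict(response: str) -> Optional[str]:
    """Return the first 'low', 'moderate', or 'high' verdict, or None."""
    match = _VERDICT.search(response)
    return match.group(1).lower() if match else None


example = (
    "Moderate concern, driven by the basic amine and extended "
    "lipophilic core."
)
print(extract_verdict(example))
```

A `None` result is worth treating as a signal to re-ask or to read the response by hand rather than guessing a category. Note that the helper returns the first match only, so a response that discusses several concern levels still needs a human read.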
Asking about BBB penetration
Central nervous system drug discovery lives and dies on the blood-brain barrier. Whether a molecule crosses the barrier is driven by a cluster of physicochemical properties — molecular weight, logP, H-bond donor count, topological polar surface area — and by active efflux via transporters like P-gp.
Bad prompt
Does this molecule cross the blood-brain barrier?
Good prompt
SMILES: <your smiles here>
I need this compound to be CNS-penetrant. Review the physicochemical
properties that drive BBB permeability for this molecule (MW, logP, HBD,
HBA, TPSA) and comment on likely P-gp efflux liability based on
structural motifs. Suggest three structural changes that would most
likely improve CNS exposure without losing target engagement.
The good prompt names the relevant properties. TxGemma then grounds its answer in those properties, tells you which are borderline, and proposes specific changes. You get a medicinal-chemistry briefing, not a yes/no answer that pretends to be certain.
Asking about off-target effects
Off-target promiscuity is what generates most of the selectivity work in late-stage optimization. Common off-targets include kinase panels, ion channels, nuclear receptors, and CYP isoforms. A scaffold that hits one target is likely to hit others if it has the right shape.
Good prompt
SMILES: <your smiles here>
Primary target: <target name>
Does this scaffold resemble known pharmacophores for any common
off-targets (kinases, GPCRs, ion channels, nuclear receptors)? If yes,
name the off-target families, explain the structural similarity, and
suggest whether a selectivity problem is likely. Do not give numeric
selectivity scores.
Notice the explicit instruction not to give numeric selectivity scores. This is a useful prompt hygiene habit. If you do not forbid numeric output, TxGemma will sometimes produce confident numbers that look like calibrated predictions but are not. Forbidding them forces the model into the explanation mode it is actually good at.
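The instruction not to give numbers can also be checked after the fact. Below is a crude heuristic sketch, assuming that leaked predictions usually appear as a number followed by a concentration unit, a percent sign, or "fold"; `find_numeric_claims` and its pattern are hypothetical and will miss other phrasings.

```python
import re
from typing import List

# A number followed by a percent sign, a molar unit, or "fold".
# Units are matched case-sensitively (nM, uM, mM, pM, plus both mu characters).
_NUMERIC_CLAIM = re.compile(r"\d+(?:\.\d+)?\s*(?:%|[numpµμ]M|-?fold)")


def find_numeric_claims(response: str) -> List[str]:
    """Return substrings that look like quantitative predictions."""
    return [m.group(0) for m in _NUMERIC_CLAIM.finditer(response)]


leaky = "Selectivity is roughly 120-fold, with an IC50 of 3.2 uM."
clean = "The scaffold resembles hinge-binding kinase pharmacophores."
print(find_numeric_claims(leaky))
print(find_numeric_claims(clean))
```

Run the check on the model's prose only, not on the SMILES string, since ring-closure digits and `%` notation in SMILES can trigger false positives. A non-empty result is a reasonable cue to discard the numbers or re-ask with the prohibition restated.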
Asking about synthesis
Retrosynthesis is the classic medicinal chemistry exercise: given a target molecule, propose reasonable disconnections backwards to commercially available starting materials. TxGemma was trained on retrosynthesis datasets in the TDC instruction mixture, so it has seen thousands of examples.
Good prompt
SMILES: <your smiles here>
Propose a three-step retrosynthesis for this molecule. For each step,
name the reaction type, identify the bond being disconnected, and
comment on the expected yield and any stereochemistry concerns. Do not
use reactions that require unusual transition metal catalysts.
TxGemma will respond with something that reads like a senior graduate student at a board. Treat it as a first draft: you still need a real chemist to confirm that the proposed route is viable, and you still need retrosynthesis software for production planning. But as a thought partner it is fast and surprisingly literate.
Composing the prompts into a triage pipeline
In practice you will want to chain these prompts into a single triage pipeline. For each hit in your list you ask about hERG, BBB, off-targets, and synthesis, and you collect the answers into a profile that your chemists can review.
- Loop over your hit list.
- For each molecule, fire all four prompts in parallel.
- Aggregate the responses into a single profile object per molecule.
- Flag molecules where TxGemma's verdicts are most favorable across the four questions.
- Route the top candidates to Boltz-2 for a binding calculation and to your ADMET predictor for numeric confirmation.
This gives you a fast, cheap first-pass triage that includes a reasoning trail you can show to chemists. It is not a substitute for experimental data, but it can meaningfully reduce the number of molecules that need to go to wet lab.
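The first three pipeline steps (loop, parallel prompts, one profile per molecule) can be sketched with the standard library. Everything here is an assumption made for illustration: `ask_model` stands for whatever callable sends a prompt to your TxGemma deployment, `fake_model` is a stub so the sketch runs offline, and the question texts are abbreviated versions of the prompts in this guide. Flagging and routing to downstream tools are left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Abbreviated stand-ins for the four prompts discussed in this guide.
QUESTIONS: Dict[str, str] = {
    "herg": "Give a categorical hERG verdict with a structural rationale.",
    "bbb": "Review MW, logP, HBD, HBA, TPSA and likely P-gp efflux.",
    "off_target": "Name likely off-target families. No numeric scores.",
    "synthesis": "Propose a three-step retrosynthesis.",
}


def profile_molecule(smiles: str, ask_model: Callable[[str], str],
                     executor: ThreadPoolExecutor) -> dict:
    """Submit all four prompts at once and collect answers into one profile."""
    futures = {
        name: executor.submit(ask_model, f"SMILES: {smiles}\n{question}")
        for name, question in QUESTIONS.items()
    }
    answers = {name: future.result() for name, future in futures.items()}
    return {"smiles": smiles, "answers": answers}


def triage(hits: List[str], ask_model: Callable[[str], str],
           max_workers: int = 4) -> List[dict]:
    """Loop over the hit list and build one profile per molecule."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return [profile_molecule(s, ask_model, executor) for s in hits]


def fake_model(prompt: str) -> str:
    # Stub standing in for a real model call (assumption).
    return f"(stub answer to a {len(prompt)}-character prompt)"


profiles = triage(["CCO", "c1ccccc1"], fake_model)
for profile in profiles:
    print(profile["smiles"], sorted(profile["answers"]))
```

Threads suit this workload if the model call is network-bound. Passing the model call in as a parameter also lets the pipeline be tested with a stub before any real inference is wired up.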
Bottom line
Asking an LLM about drug binding is a skill. The good version of the question treats TxGemma as a chemistry-literate reasoning partner and asks for rationales and design suggestions. The bad version asks for numbers and gets nonsense. Once you internalize the shape of a good prompt, the model becomes a fast and useful triage layer on top of the physical and quantitative tools you already use.