Structure-Based Drug Discovery AI Playbook 2026

The complete AI playbook for structure-based drug discovery: target → pocket → generation → scoring → ranking.

SciRouter Team
April 11, 2026

Structure-based drug discovery in 2026 looks nothing like it did five years ago. Where teams used to wait on crystal structures, budget for expensive molecular docking runs, and enumerate fragments by hand, they now feed AI-predicted structures into pocket-aware generative models and pass the output to chemistry-literate reasoning LLMs. The result is a pipeline that runs in hours instead of months and produces better starting points.

This post is the playbook. We will walk the full chain from target sequence to ranked candidates, name the tools that matter at each step, and call out the failure modes that you need to guard against. At the end we will wire the whole thing together through DiffSBDD and the SciRouter drug discovery lab.

Note
This playbook is for the in silico portion of drug discovery. Every output is a hypothesis that still needs experimental validation, medicinal chemistry iteration, and the usual long road to clinical assets.

Stage 1: Target to structure

Everything starts with a protein structure. If you have a co-crystal structure from the PDB, use it — nothing beats real ground truth. If you do not, predict the structure from sequence with a modern structural model.

  • AlphaFold for high-quality single-chain structural predictions.
  • Boltz-2 for structural prediction plus protein-ligand complex prediction, which also gives you a pocket definition.
  • ESMFold for very fast single-sequence predictions when speed matters more than absolute accuracy.

Pick based on the question you are asking. For pocket-conditioned generation you need a model that either produces pocket annotations natively (Boltz-2) or lets you run a pocket-detection tool on top (AlphaFold output into fpocket, for example).
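Before handing a predicted structure to a pocket detector, it can help to check how confident the model is around the residues you care about. The sketch below assumes an AlphaFold-style PDB file, where the per-residue pLDDT is written into the B-factor column; other predictors may use a different convention or scale, so check their output format first. The function names and the cutoff of 70 are illustrative choices, not part of any tool's API.

```python
def residue_plddt(pdb_text: str) -> dict[tuple[str, int], float]:
    """Map (chain, residue number) -> pLDDT, read from the B-factor
    column of each C-alpha ATOM record (fixed-width PDB columns)."""
    scores = {}
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]              # column 22: chain ID
            resnum = int(line[22:26])     # columns 23-26: residue number
            scores[(chain, resnum)] = float(line[60:66])  # columns 61-66
    return scores


def low_confidence(scores, residues, cutoff=70.0):
    """Return the requested residues whose pLDDT is below the cutoff.
    Residues missing from the file are treated as low confidence."""
    return sorted(r for r in residues if scores.get(r, 0.0) < cutoff)
```

If many of the residues lining your intended pocket come back as low confidence, treat any pocket geometry derived from that region with caution.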

Stage 2: Structure to pocket

A predicted structure alone is not enough. You need to know which cavity you are designing against. The active site, an allosteric pocket, and a protein-protein interface each require a different pocket definition.

  • Run a pocket detection tool or use Boltz-2's built-in pocket output.
  • Inspect the top pockets visually. Filter by size, druggability, and biological relevance.
  • Decide which pocket to target before committing to generation. Generating against the wrong pocket produces useless candidates.
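The filtering step in the list above can be sketched as a small triage function. This is a minimal illustration, assuming you have already parsed your detector's output into records with a volume, a druggability score between 0 and 1, and the set of lining residues. The `Pocket` class, its field names, and the thresholds are hypothetical; tune them to the tool you actually use. Biological relevance is approximated here by requiring that known functional residues line the pocket.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pocket:
    name: str
    volume_A3: float        # cavity volume in cubic angstroms
    druggability: float     # detector score, assumed to lie in [0, 1]
    residues: frozenset     # residue numbers lining the pocket


def shortlist_pockets(pockets, min_volume=300.0, min_druggability=0.5,
                      required_residues=frozenset()):
    """Keep pockets that are large enough, druggable enough, and that
    contain every residue in required_residues; best score first."""
    keep = [
        p for p in pockets
        if p.volume_A3 >= min_volume
        and p.druggability >= min_druggability
        and required_residues <= p.residues
    ]
    return sorted(keep, key=lambda p: p.druggability, reverse=True)
```

A filter like this narrows the list; it does not replace looking at the surviving pockets in a molecular viewer.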

When pocket prediction fails

For disordered regions, cryptic pockets that form only when a ligand or binding partner is present, and proteins with novel folds, pocket prediction is less reliable. Expert review matters more in those cases, and it is worth running multiple pocket-detection tools and comparing their results.
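One simple way to compare two detectors is to measure how much their pockets overlap at the residue level. The sketch below uses the Jaccard index (intersection over union) on residue sets. It assumes each tool's output has been reduced to a mapping from pocket name to lining residue numbers; the function names and the 0.5 overlap threshold are illustrative.

```python
def jaccard(a, b):
    """Jaccard index of two residue collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consensus_pockets(tool_a, tool_b, min_overlap=0.5):
    """Pair pockets from two detectors whose residue sets overlap by at
    least min_overlap. Inputs map pocket name -> residue numbers."""
    matches = []
    for name_a, res_a in tool_a.items():
        for name_b, res_b in tool_b.items():
            score = jaccard(res_a, res_b)
            if score >= min_overlap:
                matches.append((name_a, name_b, round(score, 2)))
    return sorted(matches, key=lambda m: m[2], reverse=True)
```

Pockets that both tools agree on are safer targets. A pocket found by only one tool is not necessarily wrong, but it deserves a closer look before you generate against it.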

Stage 3: Pocket to binder

With a pocket in hand, the generative step produces candidate molecules. In the AI playbook, this step is usually a 3D diffusion model conditioned on the pocket.

  • DiffSBDD for pocket-aware generation with equivariant diffusion.
  • TargetDiff as an alternative with different architectural choices.
  • REINVENT4 as a reinforcement-learning alternative when you need explicit property optimization.

The pattern we recommend is to run a diffusion generator first for pocket-aware candidates, then optionally pipe the best candidates into an RL optimizer for property tuning. For more on this pattern, see Diffusion vs RL for drug design.

Stage 4: Binder to filtered shortlist

Raw generator output contains a lot of noise. The filtering step removes molecules that would never make it past developability checks.

  • Drug-likeness filters. QED, logP bounds, Lipinski, Veber.
  • Synthetic accessibility. A synthetic accessibility (SA) score or a retrosynthesis check. Molecules that cannot be made are not leads.
  • PAINS and reactive motifs. Remove known problem substructures.
  • ADMET panel. Predicted solubility, permeability, hERG, CYP, and toxicity from a calibrated predictor.

Do this filtering before you spend GPU time on the next stages. Every molecule that survives should be a real candidate.
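The rule-based part of this stage is cheap to express in code. The sketch below applies Lipinski's rule of five (allowing at most one violation) and Veber's criteria to descriptors that are assumed to be precomputed by a cheminformatics toolkit such as RDKit. The dictionary keys and the QED floor of 0.5 are illustrative assumptions, and PAINS, synthetic accessibility, and ADMET checks would be separate steps.

```python
def passes_lipinski(d, max_violations=1):
    """Rule of five: MW <= 500, logP <= 5, H-bond donors <= 5,
    H-bond acceptors <= 10, with at most max_violations broken."""
    violations = sum([
        d["mol_wt"] > 500,
        d["logp"] > 5,
        d["h_donors"] > 5,
        d["h_acceptors"] > 10,
    ])
    return violations <= max_violations


def passes_veber(d):
    """Veber criteria: <= 10 rotatable bonds and TPSA <= 140 A^2."""
    return d["rotatable_bonds"] <= 10 and d["tpsa"] <= 140


def filter_candidates(candidates, min_qed=0.5):
    """Keep candidates that pass both rule sets and the QED floor."""
    return [
        c for c in candidates
        if passes_lipinski(c) and passes_veber(c) and c["qed"] >= min_qed
    ]
```

Hard cutoffs like these are blunt. For target classes where known drugs routinely break them, loosen the thresholds rather than dropping the filter entirely.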

Stage 5: Shortlist to ranked candidates

After filtering you have a shortlist, but it is not ranked. Ranking combines structural binding prediction with chemistry-aware reasoning.

  • Redock with a traditional program. AutoDock Vina, Glide, or equivalent. This validates that the generator's pose is plausible.
  • Binding-affinity predictor. An ML model trained to predict binding score from a pose. Boltz-2 can also produce affinity estimates.
  • Chemistry LLM rationale. Call TxGemma for a short written rationale on each top candidate. This gives you a review surface that is richer than raw numbers.
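The numeric half of ranking can be sketched as a rank average, which avoids having to put docking scores and affinity predictions on a common scale. This is one simple aggregation among many, not the method any particular tool uses. It assumes a docking score where lower is better (as with Vina-style energies in kcal/mol) and a predicted affinity where higher is better; the keys `vina_kcal_mol` and `pred_pkd` are hypothetical names.

```python
def ranks(values, higher_is_better):
    """Rank positions starting at 1 for the best value. Ties are broken
    by input order rather than averaged."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def consensus_rank(candidates):
    """Average each candidate's docking rank and affinity rank.
    Returns (id, mean_rank) pairs, best (lowest mean rank) first."""
    dock = ranks([c["vina_kcal_mol"] for c in candidates],
                 higher_is_better=False)
    aff = ranks([c["pred_pkd"] for c in candidates],
                higher_is_better=True)
    scored = [(c["id"], (d + a) / 2)
              for c, d, a in zip(candidates, dock, aff)]
    return sorted(scored, key=lambda s: s[1])
```

The written rationale then gets attached to the top of this list, so the reviewer sees the numbers and the reasoning side by side.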

The rationale step is the quiet game-changer. Before LLMs, ranking was a list of scores and a chemist had to supply the context. Now the context is generated with the ranking, and chemists spend their time reviewing rather than summarizing.

Stage 6: Human review

The playbook ends where drug discovery always has: with a chemist reading the candidates and deciding what to do next. The AI pipeline gets you to that point faster and with more context, but it does not make the decision.

  • Read the top 10-20 candidates with their rationales.
  • Flag concerns: unusual chemistry, synthesis risk, missing functional groups, off-target risk.
  • Select the shortlist that goes to synthesis and assay.
  • Feed the wet-lab results back into the pipeline to bias the next round of generation.

Failure modes to guard against

Garbage pocket

If the pocket is wrong, everything downstream is wrong. Always inspect the pocket before generating. When in doubt, run multiple pocket detectors and compare.

Reward hacking in the optimizer

If you add a REINVENT stage, watch out for molecules that maximize the reward in ways a chemist would reject. Multi-objective rewards and chemist review catch most of these.
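To see why the form of a multi-objective reward matters, consider how the components are combined. The sketch below uses a weighted geometric mean, which drops to zero when any single component is zero, so the optimizer cannot make up for a failed objective by pushing another one to its maximum. This is a generic illustration and not a description of REINVENT's configuration; the function name and component names are hypothetical, and each component is assumed to be scaled to the range 0 to 1.

```python
import math


def geometric_mean_reward(components, weights=None):
    """Weighted geometric mean of component scores in [0, 1].
    Any component at or below zero makes the whole reward zero."""
    names = list(components)
    weights = weights or {n: 1.0 for n in names}
    if any(components[n] <= 0.0 for n in names):
        return 0.0
    total_weight = sum(weights[n] for n in names)
    log_sum = sum(weights[n] * math.log(components[n]) for n in names)
    return math.exp(log_sum / total_weight)
```

For example, a molecule with a perfect docking component but a drug-likeness component of zero scores 0.0 here, whereas an arithmetic mean of the same two components would give it 0.5.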

Over-reliance on any single score

Docking scores, QED, and LLM confidence are all imperfect. A candidate that looks good on one metric and bad on another is an interesting candidate, not a failure.

Skipping validation

The pipeline produces hypotheses. Assays produce answers. Nothing matters until you test the molecules.

Warning
No AI pipeline replaces the wet lab. The playbook gives you better starting points and more of them, faster. Turning those starting points into drugs still requires experimental validation, medicinal chemistry iteration, and clinical trials.

Putting it together on SciRouter

Every tool in the playbook is exposed as a managed endpoint on SciRouter. You can run the whole pipeline through a single API key — Boltz-2 for structure and pocket, DiffSBDD for generation, the chemistry panel for filtering, and TxGemma for rationale. The drug discovery lab wraps all of this into a UI if you do not want to write code. For a hands-on walkthrough, see From Protein Pocket to Lead Compound.

Bottom line

Structure-based drug discovery used to mean expensive crystal structures, slow docking runs, and months of manual fragment growing. The 2026 playbook replaces every step before experimental validation with AI tools that chain cleanly through a single gateway. The wet lab still matters. The iteration loops still matter. What changes is the speed and quality of the starting point — and in drug discovery, a better starting point is half the battle.


Frequently Asked Questions

What does 'structure-based' mean in this context?

Structure-based drug discovery starts with the 3D structure of the target protein — either experimentally determined or predicted — and uses that structure to guide molecule design. The opposite approach is ligand-based, which uses known active molecules to find new actives by similarity. Structure-based approaches are more principled but historically have required experimental structures. AI has changed that.

Do I need a crystal structure?

No, not anymore. Models like Boltz-2 and AlphaFold can predict protein structures with accuracy approaching that of experimental structures for many targets. You should still use a crystal structure when one is available, since it is the ground truth, but you are no longer blocked when it is not.

How accurate are AI-predicted pockets?

For well-folded proteins with structural analogs in the training data, the pocket geometry from Boltz-2 or AlphaFold is usually good enough to drive generative design. For disordered regions, novel folds, or pockets that form only in the presence of a ligand, predictions are less reliable and expert review matters more.

What is the role of the LLM in the playbook?

The LLM is the reasoning layer. It does not replace the structural or generative tools — it sits on top of them and helps interpret the results, write rationales for candidates, triage based on chemistry knowledge, and answer ADMET questions that a physical model does not cover. TxGemma is the specific LLM this playbook recommends.

How long does this pipeline take to run?

On SciRouter, a single end-to-end run — from target sequence to ranked candidate shortlist — typically takes on the order of minutes to a few hours, depending on the length of the sequence, the number of candidates generated, and GPU availability. The slow steps are structural prediction and generative sampling. The other steps are sub-second.

Is this a replacement for experimental drug discovery?

No. It is a replacement for the earliest virtual steps in drug discovery. Wet-lab assays, medicinal chemistry iteration, in vivo studies, toxicology, and clinical trials all remain irreplaceable. The playbook makes the in silico portion dramatically faster, higher quality, and more accessible.

Where do I start?

Start with one target and run the full pipeline end to end using SciRouter's hosted tools. Get a feel for the inputs, outputs, and where you need to make decisions. Then integrate into your own workflow. The drug discovery lab dashboard in SciRouter is a good starting interface if you do not want to code against the API directly.
