What Is Virtual Screening?
Drug discovery begins with a target – a protein implicated in disease – and a question: which small molecules bind to it? The brute-force answer is high-throughput screening (HTS): physically testing 500,000 compounds in robotic assays at a cost of $500,000 to $2,000,000 per campaign. The computational answer is virtual screening: using algorithms to predict which compounds are most likely to bind, then testing only the top candidates experimentally.
Virtual screening reduces the experimental burden by 100 to 1,000-fold. Instead of testing 500,000 compounds, you computationally evaluate them and synthesize only the top 50 to 500 for experimental validation. Hit rates from virtual screening (5 to 20% of tested compounds show activity) dramatically outperform random screening (0.01 to 0.1%), because the computational filters concentrate your experimental resources on the most promising chemical matter.
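The arithmetic behind those numbers is worth making concrete. A quick back-of-the-envelope comparison using the hit-rate ranges quoted above (illustrative figures, not a benchmark):

```python
# Compare random HTS against a virtual-screening funnel, using the
# hit-rate ranges quoted above (illustrative numbers only).

def expected_hits(n_tested, hit_rate):
    """Expected number of active compounds among n_tested."""
    return n_tested * hit_rate

# Random HTS: 500,000 compounds at a 0.05% hit rate
hts_hits = expected_hits(500_000, 0.0005)

# Virtual screening: test only the top 500 at a 10% hit rate
vs_hits = expected_hits(500, 0.10)

print(f"HTS: {hts_hits:.0f} hits from 500,000 experiments")
print(f"VS:  {vs_hits:.0f} hits from 500 experiments")
# Comparable hit counts, with 1,000-fold fewer experiments
```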
There are three fundamental approaches to virtual screening, each with different requirements and strengths.
High-Throughput Screening (HTS)
HTS is the experimental gold standard: physical testing of large compound libraries against the target in miniaturized assays (384-well or 1536-well plates). It is unbiased – it can discover scaffolds that no computational method would predict – but expensive, slow (3 to 6 months per campaign), and limited by the physical compound collection. Most pharmaceutical companies maintain libraries of 1 to 3 million compounds, a tiny fraction of drug-like chemical space (estimated at 10^60 molecules).
Structure-Based Virtual Screening (SBVS)
SBVS uses the three-dimensional structure of the target protein to computationally dock candidate molecules into the binding site. Each molecule is placed in the binding pocket in multiple orientations and conformations, and a scoring function estimates the binding affinity. Compounds with favorable docking scores are predicted to be binders. SBVS requires a protein structure (from X-ray crystallography, cryo-EM, or computational prediction) but can discover entirely novel scaffolds that have no structural similarity to known actives.
Ligand-Based Virtual Screening (LBVS)
LBVS uses the properties of known active molecules (ligands) to find similar compounds in large libraries. If you know that compound X binds the target with 100 nM affinity, you search for molecules with similar fingerprints, shape, or pharmacophore features. LBVS does not require a protein structure and is computationally cheaper than docking, but it is biased toward scaffolds similar to known actives and may miss novel chemotypes.
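In practice, "similar fingerprints" usually means Tanimoto similarity over bit-vector fingerprints. The metric itself is simple; here is a minimal pure-Python sketch with fingerprints represented as sets of on-bit indices (a real pipeline would compute Morgan or ECFP fingerprints with a cheminformatics toolkit such as RDKit, which this toy example only imitates):

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints,
    each represented as a set of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Toy fingerprints: a known active and two library compounds
active = {1, 4, 7, 9, 12}
close_analog = {1, 4, 7, 9, 15}  # shares 4 of 6 distinct bits
unrelated = {2, 3, 5}            # shares none

print(tanimoto(active, close_analog))  # 4/6 ~ 0.667 -> likely analog
print(tanimoto(active, unrelated))     # 0.0 -> discard
```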
Building Your Compound Library
The starting point for any virtual screening campaign is a compound library. The quality and diversity of your library directly determine the quality of your hits. Here are the major sources.
Public Databases
- ZINC20/ZINC22: Over 1.4 billion purchasable compounds with SMILES and vendor information. The "in-stock" subset (~10 million compounds) contains molecules available for immediate purchase from vendors like Enamine, ChemBridge, and MolPort. Ideal for hit discovery campaigns where you want to test purchased compounds rather than synthesizing new ones.
- Enamine REAL: Over 6 billion make-on-demand compounds that can be synthesized in 2 to 4 weeks using validated one-step or two-step reactions from available building blocks. The largest accessible chemical space database, but compounds must be custom-synthesized after virtual screening identifies them as hits.
- ChEMBL: Over 2.4 million compounds with measured bioactivity data. Useful for assembling focused libraries around specific target classes (for example, all compounds with measured activity against kinases).
- PubChem: Over 110 million compounds from academic screening campaigns. Broader coverage than ChEMBL but with more variable data quality.
Preparing a 1,000-Compound Library
For this tutorial, we will build a focused library of 1,000 compounds targeting EGFR (Epidermal Growth Factor Receptor), a validated oncology target with approved drugs (erlotinib, gefitinib, osimertinib). We start from a ChEMBL extract of known EGFR-active scaffolds and generate diverse analogs.
import os, requests, time
API_KEY = os.environ["SCIROUTER_API_KEY"]
BASE = "https://api.scirouter.ai/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
# Seed compounds: known EGFR inhibitor scaffolds
seed_compounds = [
    # Erlotinib (quinazoline with bis-methoxyethoxy side chains)
    "COCCOc1cc2ncnc(Nc3cccc(C#C)c3)c2cc1OCCOC",
    # Gefitinib-like (4-anilinoquinazoline)
    "COc1cc2ncnc(Nc3ccc(F)c(Cl)c3)c2cc1OC",
    # Lapatinib-like (4-anilinoquinazoline with furanylmethyl side chain)
    "CS(=O)(=O)CCNCc1ccc(-c2ccc3ncnc(Nc4ccc(OCc5cccc(F)c5)c(Cl)c4)c3c2)o1",
]
# Generate diverse analogs from each seed
library = []
for seed in seed_compounds:
    job = requests.post(f"{BASE}/chemistry/generate", headers=HEADERS, json={
        "model": "reinvent4",
        "num_molecules": 350,
        "objectives": {
            "similarity": {
                "weight": 0.5,
                "reference_smiles": seed,
                "min_similarity": 0.2,
                "max_similarity": 0.7,
            },
            "drug_likeness": {"weight": 1.0, "method": "qed"},
            "synthetic_accessibility": {"weight": 0.6, "max_sa_score": 5.5},
        },
    }).json()
    while True:
        result = requests.get(
            f"{BASE}/chemistry/generate/{job['job_id']}", headers=HEADERS
        ).json()
        if result["status"] in ("completed", "failed"):
            break
        time.sleep(10)
    if result["status"] == "completed":
        library.extend(result["molecules"])
# Deduplicate by SMILES
seen = set()
unique_library = []
for mol in library:
    if mol["smiles"] not in seen:
        seen.add(mol["smiles"])
        unique_library.append(mol)
# Trim to exactly 1,000
library = unique_library[:1000]
print(f"Library size: {len(library)} unique compounds from "
      f"{len(seed_compounds)} seed scaffolds")

Target Preparation with Pocket Detection
Structure-based virtual screening requires a 3D protein structure with a defined binding site. For EGFR, multiple crystal structures are available in the PDB (for example, PDB 1M17 with erlotinib bound). If no experimental structure is available, you can predict one using ESMFold or AlphaFold2 via the SciRouter API.
Even with a known structure, binding pocket detection is valuable for identifying the exact coordinates and residues that define the docking region. SciRouter's pocket detection endpoint analyzes the protein surface geometry to identify druggable cavities and their properties.
# EGFR kinase domain sequence (for structure prediction if needed)
egfr_sequence = (
    "FKKIKVLGSGAFGTVYKGLWIPEGEKVKIPVAIKELREATSPKANKEILDEAY"
    "VMASVDNPHVCRLLGICLTSTVQLITQLMPFGCLLDYVREHKDNIGSQYLLNW"
    "CVQIAKGMNYLEDRRLVHRDLAARNVLVKTPQHVKITDFGLAKLLGAEEKEYHAE"
    "GGKVPIKWMALESILHRIYTHQSDVWSYGVTVWELMTFGSKPYDGIPASEISSILE"
    "KGERLPQPPICTIDVYMIMVKCWMIDADSRPKFRELIIEFSKMARDPQRYLVIQGDERM"
)
# If no crystal structure, predict with ESMFold
fold_job = requests.post(f"{BASE}/proteins/fold", headers=HEADERS,
                         json={"sequence": egfr_sequence, "model": "esmfold"}).json()
# Poll for completion
while True:
    fold_result = requests.get(
        f"{BASE}/proteins/fold/{fold_job['job_id']}", headers=HEADERS
    ).json()
    if fold_result["status"] in ("completed", "failed"):
        break
    time.sleep(5)
pdb_string = fold_result["pdb"]
print(f"Structure predicted: {len(egfr_sequence)} residues")
print(f"Average pLDDT: {fold_result['metrics']['plddt_mean']:.1f}")
# Detect binding pockets
pockets = requests.post(f"{BASE}/proteins/pocket-detection",
                        headers=HEADERS, json={"pdb": pdb_string}).json()
print(f"\nDetected {len(pockets['pockets'])} pockets:")
for i, pocket in enumerate(pockets["pockets"]):
    print(f"  Pocket {i+1}: Volume={pocket['volume']:.0f} A^3, "
          f"Druggability={pocket['druggability_score']:.2f}, "
          f"Residues={len(pocket['residues'])}")
# Use the most druggable pocket for docking
best_pocket = max(pockets["pockets"], key=lambda p: p["druggability_score"])
print(f"\nSelected pocket: Volume={best_pocket['volume']:.0f} A^3, "
      f"Center=({best_pocket['center_x']:.1f}, "
      f"{best_pocket['center_y']:.1f}, "
      f"{best_pocket['center_z']:.1f})")

Running the Batch Screen
With the library prepared and the target structure ready, run the virtual screening pipeline. The strategy is a multi-stage funnel: first, filter by drug-like properties (fast, CPU-based); then dock the filtered set against the target (slower, GPU-based); finally, apply ADMET filters to the top docking hits.
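The funnel pattern itself is worth factoring out before walking through the stages: each stage is just a predicate applied to the survivors of the previous stage. A minimal sketch, with hypothetical property values and thresholds standing in for the real filters used below:

```python
def run_funnel(compounds, stages):
    """Apply a sequence of (name, predicate) stages, reporting attrition."""
    survivors = compounds
    for name, predicate in stages:
        survivors = [c for c in survivors if predicate(c)]
        print(f"{name}: {len(survivors)} remaining")
    return survivors

# Toy compounds as dicts of precomputed values (hypothetical numbers)
compounds = [
    {"mw": 450, "docking": 0.85, "herg": "low"},
    {"mw": 720, "docking": 0.90, "herg": "low"},   # fails property stage
    {"mw": 380, "docking": 0.40, "herg": "low"},   # fails docking stage
    {"mw": 410, "docking": 0.80, "herg": "high"},  # fails ADMET stage
]
stages = [
    ("Property filter", lambda c: c["mw"] <= 600),
    ("Docking filter",  lambda c: c["docking"] >= 0.7),
    ("ADMET filter",    lambda c: c["herg"] != "high"),
]
hits = run_funnel(compounds, stages)
# Only the first compound survives all three stages
```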
Stage 1: Property Pre-Filter
Remove compounds that violate basic drug-likeness criteria before investing computational resources in docking. This step is fast (sub-second per compound) and typically eliminates 10 to 30% of the library.
# Stage 1: Property pre-filter
filtered_library = []
for mol in library:
    props = requests.post(f"{BASE}/chemistry/properties",
                          headers=HEADERS, json={"smiles": mol["smiles"]}).json()
    # Lipinski-like filters (relaxed for kinase inhibitors)
    passes = True
    if props["molecular_weight"] > 600:
        passes = False
    if props["logp"] > 6.0 or props["logp"] < -1.0:
        passes = False
    if props["h_bond_donors"] > 5:
        passes = False
    if props["h_bond_acceptors"] > 10:
        passes = False
    if props["rotatable_bonds"] > 12:
        passes = False
    if passes:
        mol["properties"] = props
        filtered_library.append(mol)
print(f"Stage 1: {len(filtered_library)}/{len(library)} pass "
      f"property filters ({len(filtered_library)/len(library)*100:.0f}%)")

Stage 2: Molecular Docking with DiffDock
Dock the filtered compounds against the EGFR binding pocket using DiffDock. DiffDock is a diffusion-based molecular docking method that predicts binding poses and confidence scores without requiring a predefined binding box. For large screening campaigns, submit molecules in parallel using concurrent API calls.
from concurrent.futures import ThreadPoolExecutor, as_completed
def dock_molecule(mol):
    """Dock a single molecule against EGFR and return the result."""
    try:
        dock_result = requests.post(f"{BASE}/docking/diffdock",
                                    headers=HEADERS, json={
                                        "protein_pdb": pdb_string,
                                        "ligand_smiles": mol["smiles"],
                                        "num_poses": 5,
                                    }, timeout=120).json()
        # Poll for the async result
        job_id = dock_result.get("job_id")
        if job_id:
            for _ in range(30):
                status = requests.get(
                    f"{BASE}/docking/diffdock/{job_id}",
                    headers=HEADERS
                ).json()
                if status["status"] == "completed":
                    return {
                        "smiles": mol["smiles"],
                        "properties": mol["properties"],
                        "top_confidence": status["poses"][0]["confidence"],
                        "top_pose_score": status["poses"][0]["score"],
                        "num_poses": len(status["poses"]),
                    }
                if status["status"] == "failed":
                    return None
                time.sleep(5)
        return None
    except Exception:
        return None
# Run docking in parallel (10 concurrent workers)
docking_results = []
with ThreadPoolExecutor(max_workers=10) as executor:
    futures = {executor.submit(dock_molecule, mol): mol
               for mol in filtered_library}
    completed = 0
    for future in as_completed(futures):
        completed += 1
        result = future.result()
        if result:
            docking_results.append(result)
        if completed % 50 == 0:
            print(f"Docked {completed}/{len(filtered_library)} "
                  f"({len(docking_results)} successful)")
# Sort by docking confidence
docking_results.sort(key=lambda x: x["top_confidence"], reverse=True)
print(f"\nStage 2: {len(docking_results)} docking results")
print("Top 10 docking scores:")
for i, r in enumerate(docking_results[:10]):
    print(f"  {i+1}. Confidence: {r['top_confidence']:.3f} | "
          f"Score: {r['top_pose_score']:.2f} | "
          f"{r['smiles'][:50]}")

Stage 3: Consensus Scoring
To improve hit rates, run a second docking method (AutoDock Vina) on the top 100 DiffDock hits and select compounds that score well by both methods. Consensus scoring reduces false positives because different methods have different failure modes – a compound that happens to score well due to a scoring function artifact in one method is unlikely to score well in the other.
# Take top 100 from DiffDock for consensus scoring
top_100 = docking_results[:100]
def vina_dock(mol):
    """Dock with AutoDock Vina for consensus scoring."""
    try:
        result = requests.post(f"{BASE}/docking/autodock-vina",
                               headers=HEADERS, json={
                                   "protein_pdb": pdb_string,
                                   "ligand_smiles": mol["smiles"],
                                   "center_x": best_pocket["center_x"],
                                   "center_y": best_pocket["center_y"],
                                   "center_z": best_pocket["center_z"],
                                   "box_size": 25.0,
                                   "exhaustiveness": 16,
                               }, timeout=120).json()
        job_id = result.get("job_id")
        if job_id:
            for _ in range(20):
                status = requests.get(
                    f"{BASE}/docking/autodock-vina/{job_id}",
                    headers=HEADERS
                ).json()
                if status["status"] == "completed":
                    mol["vina_score"] = status["poses"][0]["affinity_kcal"]
                    return mol
                if status["status"] == "failed":
                    return None
                time.sleep(5)
        return None
    except Exception:
        return None
# Run Vina docking in parallel
consensus_results = []
with ThreadPoolExecutor(max_workers=10) as executor:
    futures = {executor.submit(vina_dock, mol): mol for mol in top_100}
    for future in as_completed(futures):
        result = future.result()
        if result and "vina_score" in result:
            consensus_results.append(result)
print(f"Consensus scoring: {len(consensus_results)} compounds docked "
      f"with both DiffDock and Vina")
# Rank by consensus: normalize both scores and average
if consensus_results:
    max_conf = max(r["top_confidence"] for r in consensus_results)
    min_vina = min(r["vina_score"] for r in consensus_results)
    max_vina = max(r["vina_score"] for r in consensus_results)
    for r in consensus_results:
        norm_diffdock = r["top_confidence"] / max_conf
        # Vina scores are negative (more negative = better)
        norm_vina = (r["vina_score"] - max_vina) / (min_vina - max_vina)
        r["consensus_score"] = (norm_diffdock + norm_vina) / 2
consensus_results.sort(key=lambda x: x["consensus_score"], reverse=True)
print("\nTop 20 by consensus score:")
for i, r in enumerate(consensus_results[:20]):
    print(f"  {i+1}. Consensus: {r['consensus_score']:.3f} | "
          f"DiffDock: {r['top_confidence']:.3f} | "
          f"Vina: {r['vina_score']:.1f} kcal/mol")
    print(f"     {r['smiles'][:60]}")

Post-Screening ADMET Filter
The final stage applies ADMET prediction to the top consensus hits. A compound with excellent docking scores but poor pharmacokinetics or safety liabilities is not a viable drug candidate. ADMET filtering after docking (rather than before) ensures that you do not prematurely eliminate compounds that might have acceptable ADMET profiles despite borderline property values.
# ADMET filter on top 50 consensus hits
top_50 = consensus_results[:50]
final_candidates = []
for mol in top_50:
    admet = requests.post(f"{BASE}/chemistry/admet",
                          headers=HEADERS, json={"smiles": mol["smiles"]}).json()
    synth = requests.post(f"{BASE}/chemistry/synthesis-check",
                          headers=HEADERS, json={"smiles": mol["smiles"]}).json()
    # ADMET hard filters
    passes = True
    if admet["herg_inhibition"] == "high":
        passes = False
    if admet["hepatotoxicity"] == "high":
        passes = False
    if synth["sa_score"] > 6.0:
        passes = False
    if passes:
        mol["admet"] = admet
        mol["sa_score"] = synth["sa_score"]
        final_candidates.append(mol)
print(f"Stage 4: {len(final_candidates)}/{len(top_50)} pass ADMET filters")
print("\n=== Final Candidates for Experimental Validation ===")
for i, c in enumerate(final_candidates[:15]):
    print(f"\n{i+1}. {c['smiles']}")
    print(f"   Consensus: {c['consensus_score']:.3f} | "
          f"Vina: {c['vina_score']:.1f} kcal/mol")
    print(f"   MW: {c['properties']['molecular_weight']:.0f} | "
          f"LogP: {c['properties']['logp']:.1f} | "
          f"SA: {c['sa_score']:.1f}")
    print(f"   hERG: {c['admet']['herg_inhibition']} | "
          f"Hepatotox: {c['admet']['hepatotoxicity']} | "
          f"Oral F: {c['admet']['oral_bioavailability']}")

Case Study: EGFR Kinase Virtual Screen
Let us walk through the expected results for an EGFR kinase virtual screening campaign using the pipeline described above.
Starting library: 1,000 compounds generated from three seed scaffolds (erlotinib-, gefitinib-, and lapatinib-like 4-anilinoquinazolines) with Tanimoto similarity 0.2 to 0.7 to the seeds. This produces a focused but diverse library spanning the known EGFR inhibitor chemical space plus novel variations.
Stage 1 (property filter): 847 of 1,000 compounds pass Lipinski-relaxed filters (15% attrition). The failures are mostly high-molecular-weight analogs from the lapatinib seed with excessive rotatable bonds or LogP values.
Stage 2 (DiffDock): 847 compounds are docked against the EGFR kinase domain (PDB 1M17 or predicted structure). The top 100 by confidence score include compounds from all three seed scaffolds, with the quinazoline derivatives showing the highest average confidence (they most closely resemble erlotinib, which crystallizes well in the EGFR pocket).
Stage 3 (consensus scoring): The top 100 DiffDock hits are re-docked with AutoDock Vina. Of these, 73 receive Vina scores better than -8.0 kcal/mol (a typical threshold for kinase inhibitors). The top 20 by consensus score all have DiffDock confidence above 0.7 and Vina affinity below -9.5 kcal/mol.
Stage 4 (ADMET filter): Of the top 50 consensus hits, 31 pass all ADMET hard filters (no high hERG risk, no high hepatotoxicity, SA score below 6.0). The most common failure mode is hERG liability (11 compounds) driven by high lipophilicity in the lapatinib-derived analogs. The final 31 candidates are ready for procurement or synthesis and experimental testing in an EGFR kinase assay.
Expected hit rate: Based on benchmark studies, 15 to 25% of the final candidates (5 to 8 compounds) would be expected to show measurable EGFR inhibition (IC50 below 10 micromolar) in a biochemical kinase assay. Of these, 1 to 3 would show sub-micromolar potency suitable for lead optimization.
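These numbers correspond to a large enrichment factor over random screening. The enrichment factor is simply the screen's hit rate divided by the baseline hit rate; plugging in the figures quoted earlier in this tutorial:

```python
def enrichment_factor(screen_hit_rate, baseline_hit_rate):
    """Fold-enrichment of a screen over random compound selection."""
    return screen_hit_rate / baseline_hit_rate

# Virtual screen: ~20% hit rate; random HTS baseline: ~0.05%
ef = enrichment_factor(0.20, 0.0005)
print(f"Enrichment factor: {ef:.0f}x")  # 400x
```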
Optimizing Your Screening Campaign
The pipeline described above is a starting point. Here are practical optimizations for different scenarios.
- Increase throughput with batch endpoints: SciRouter's property and ADMET endpoints accept batch requests of up to 1,000 SMILES per call. Use batch mode instead of individual requests to reduce API overhead. A batch property calculation for 1,000 compounds completes in under 30 seconds.
- Use pocket-focused docking for speed: If you know the binding pocket location (from a co-crystal structure or pocket detection), specify the pocket center and box size in the Vina endpoint. This reduces the search space and speeds up docking by 3 to 5-fold compared to blind docking.
- Iterative enrichment: After the first screen, analyze the SAR of the top hits. If a specific scaffold or substitution pattern is over-represented in the hits, generate 500 focused analogs of that scaffold and run a second screening round. This iterative enrichment narrows the chemical space toward the most productive region.
- Fragment-based screening: For novel targets without known actives, start with a fragment library (MW 150 to 300, 1,000 to 5,000 fragments). Fragments have simpler binding modes and higher hit rates (typically 3 to 5% in experimental screens). Dock fragments, identify the top binders, and then grow them into drug-like molecules using the molecule generator.
- Selectivity counter-screens: After identifying hits for your target, dock the same compounds against known off-targets (related family members, hERG channel structure). Select compounds that dock well to the target but poorly to off-targets. This builds selectivity into the screening campaign from the start.
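The batch-endpoint tip above reduces to a simple chunking pattern. The sketch below keeps the HTTP transport as an injected callable so the batching logic stands alone; the batch payload shape (`{"smiles": [...]}`) is an assumption modeled on the single-compound calls used earlier, so check the API reference for the actual contract:

```python
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def batch_properties(smiles_list, post_json, chunk_size=1000):
    """Fetch properties for many SMILES in batches of up to 1,000 per call.

    `post_json` is a callable taking the request payload and returning the
    parsed response list, e.g.
        lambda payload: requests.post(f"{BASE}/chemistry/properties",
                                      headers=HEADERS, json=payload).json()
    """
    results = []
    for chunk in chunked(smiles_list, chunk_size):
        results.extend(post_json({"smiles": chunk}))
    return results

# Example with a stand-in transport: 2,500 SMILES -> 3 calls, not 2,500
calls = []
fake_post = lambda payload: (calls.append(len(payload["smiles"])) or
                             [{"smiles": s} for s in payload["smiles"]])
out = batch_properties([f"C{'C' * i}" for i in range(2500)], fake_post)
print(calls)     # [1000, 1000, 500]
print(len(out))  # 2500
```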
Putting It All Together
Virtual screening is the computational foundation of modern drug discovery. The multi-stage funnel described in this tutorial – property filter, molecular docking with DiffDock, consensus scoring with AutoDock Vina, and ADMET post-filtering – transforms a library of 1,000 diverse compounds into a shortlist of 15 to 30 high-confidence candidates for experimental validation.
SciRouter's API provides every component of this pipeline through a unified interface: DiffDock and AutoDock Vina for molecular docking, pocket detection for binding site identification, and ADMET-AI for safety and pharmacokinetics prediction. The entire campaign – from library generation to final candidate selection – can be run programmatically in a single Python script, completing in 10 to 30 minutes depending on library size and API tier.
The traditional alternative – setting up DiffDock on a local GPU, configuring AutoDock Vina, running RDKit for property calculations, and integrating ADMET models – takes days of infrastructure setup before you can run your first screen. With SciRouter, the first API call can go out in under a minute. The infrastructure is the API; the science is your focus.