What Is pLDDT?
When a protein structure prediction model like ESMFold or AlphaFold2 outputs a 3D structure, it also tells you how confident it is about each part of that structure. This confidence metric is called pLDDT – the predicted Local Distance Difference Test.
pLDDT is a per-residue score ranging from 0 to 100. It estimates how accurately the model has placed each amino acid relative to its local neighbors in three-dimensional space. A score of 95 means the model is highly confident that the residue's local environment is correct; a score of 30 means it is essentially guessing. Understanding pLDDT is not optional – it is the first thing to check when deciding whether you can trust a predicted structure.
The name comes from the LDDT-Cα metric, which was originally developed to evaluate how well a predicted structure matches an experimentally determined reference. Structure prediction models adapted this into a self-assessment: the model predicts its own LDDT score for each residue, hence the "p" for predicted. This clever approach means you get reliability estimates without needing a reference structure.
How pLDDT Is Calculated
The underlying LDDT metric works by examining local distance relationships, so it does not require superimposing the two structures. For each residue, it takes all atoms within a 15-angstrom radius in the reference structure and checks whether each pairwise distance is reproduced in the predicted structure. The score is the fraction of distances preserved within each of four tolerance thresholds (0.5, 1, 2, and 4 angstroms), averaged over the four thresholds.
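The reference-based calculation can be sketched in a few lines of Python. This is a simplified illustration on Cα coordinates only, not the official LDDT implementation: the function name `lddt_ca` and its parameters are hypothetical, and real tools also handle all-atom distances, chain handling, and stereochemistry checks that are omitted here.

```python
import math

def lddt_ca(pred, ref, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Per-residue LDDT on C-alpha coordinates, scaled to 0-100.

    pred, ref: equal-length lists of (x, y, z) C-alpha coordinates.
    Simplified sketch: one atom per residue, no stereochemistry checks.
    """
    if len(pred) != len(ref):
        raise ValueError("pred and ref must have the same number of residues")
    n = len(ref)
    scores = []
    for i in range(n):
        preserved = 0
        considered = 0
        for j in range(n):
            if i == j:
                continue
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= radius:
                continue  # only neighbours inside the radius in the reference count
            diff = abs(d_ref - math.dist(pred[i], pred[j]))
            # Each neighbour is tested once per tolerance threshold.
            considered += len(thresholds)
            preserved += sum(1 for t in thresholds if diff < t)
        scores.append(100.0 * preserved / considered if considered else 0.0)
    return scores

# Toy example: a straight 5-residue chain with 3.8 angstrom spacing,
# and a "prediction" whose last residue is displaced by 3 angstroms.
ref_chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
pred_chain = ref_chain[:4] + [(3.8 * 4 + 3.0, 0.0, 0.0)]
print(lddt_ca(pred_chain, ref_chain))
```

Because only internal distances are compared, translating or rotating the whole prediction leaves the scores unchanged.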
In the predicted version (pLDDT), the model estimates this score without a reference. During training, the model learns to predict its own accuracy by comparing its outputs to known experimental structures. The result is a well-calibrated confidence estimate – when a model says pLDDT = 90, the actual LDDT-Cα against experimental data is typically close to 0.90 (LDDT is conventionally reported on a 0–1 scale, pLDDT on 0–100). This calibration is what makes pLDDT genuinely useful rather than just a vague quality indicator.
The pLDDT Color Scale
The structural biology community has adopted a standard color scheme for visualizing pLDDT scores. When you view a predicted structure in PyMOL, ChimeraX, Mol*, or the AlphaFold database, the coloring follows this convention:
- Dark blue (pLDDT > 90) – Very high confidence. The backbone conformation is accurate and side-chain rotamers are likely correct. These regions are suitable for detailed structural analysis, docking studies, and mutation effect predictions.
- Cyan / light blue (70 < pLDDT ≤ 90) – High confidence. The backbone trace is reliable, but side-chain positions may have some uncertainty. These regions are trustworthy for most applications.
- Yellow (50 < pLDDT ≤ 70) – Low confidence. The general fold may be approximately correct, but specific atom positions are uncertain. Treat these regions with caution – they often correspond to flexible loops or regions with limited evolutionary information.
- Orange (pLDDT ≤ 50) – Very low confidence. The predicted structure in these regions should not be interpreted as meaningful. These typically correspond to intrinsically disordered regions (IDRs), unstructured termini, or regions where the model lacks sufficient training data.
The pLDDT scores themselves are stored in the B-factor column of PDB files generated by prediction models. Visualization software can apply the coloring automatically by selecting "color by B-factor" with the appropriate spectrum.
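If you only have a PDB file, the scores can be read straight from that column. The following is a minimal sketch that relies on the fixed-column PDB ATOM record layout (atom name in columns 13–16, chain ID in column 22, residue number in columns 23–26, B-factor in columns 61–66). The function name `plddt_from_pdb` and the three example records are made up for illustration.

```python
def plddt_from_pdb(pdb_text):
    """Return {(chain_id, residue_number): pLDDT} read from C-alpha ATOM records."""
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        if line[12:16].strip() != "CA":
            continue  # one value per residue: take it from the C-alpha atom
        chain_id = line[21]
        res_num = int(line[22:26])
        scores[(chain_id, res_num)] = float(line[60:66])
    return scores

# Illustrative records only (coordinates and scores are invented).
example_pdb = """
ATOM      1  N   MET A   1      10.000  10.000  10.000  1.00 85.20           N
ATOM      2  CA  MET A   1      11.000  10.000  10.000  1.00 85.20           C
ATOM      3  CA  LYS A   2      14.800  10.000  10.000  1.00 42.75           C
"""

for (chain, res_num), score in plddt_from_pdb(example_pdb).items():
    print(f"Chain {chain} residue {res_num}: pLDDT {score:.2f}")
```

Prediction models typically write the same pLDDT value on every atom of a residue, which is why reading only the Cα record is enough here. For mmCIF files or more robust parsing, a structure library is a better choice than fixed-column slicing.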
What Each pLDDT Range Means in Practice
Very High Confidence (> 90): Trust the Details
Regions with pLDDT above 90 are predicted with near-experimental accuracy. Multiple benchmarks have shown that these regions typically have Cα RMSD below 1 angstrom compared to experimental structures. You can confidently use these regions for:
- Molecular docking and virtual screening
- Active site analysis and substrate binding predictions
- Mutation effect analysis (how point mutations alter structure)
- Homology-based functional annotation
High Confidence (70–90): Trust the Backbone
The backbone fold is correct, but individual side-chain positions have meaningful uncertainty. This is common for surface-exposed residues where multiple rotamer states are energetically similar. These regions are suitable for:
- Overall fold classification and topology analysis
- Identifying secondary structure elements (α-helices, β-sheets)
- Protein-protein interface identification (at the domain level)
- Homology modeling template selection
Low Confidence (50–70): Interpret with Caution
These regions often correspond to flexible loops connecting secondary structure elements, or to regions where the model has limited evolutionary signal. The general backbone path may be approximately correct, but specific coordinates are unreliable. Common causes include limited homologous sequences in training data, genuine conformational flexibility, or crystal contacts that stabilize a loop in experimental structures but are absent in isolated prediction.
Very Low Confidence (< 50): Likely Disordered
Regions below pLDDT 50 most often indicate intrinsically disordered regions – stretches of protein that do not adopt a stable three-dimensional structure. In those cases this is not a failure of the prediction model; the model is correctly recognizing that these regions are natively unstructured. Approximately 30–40% of the human proteome contains disordered regions, so encountering very-low-confidence regions is common and expected.
Using pLDDT in Research
Identifying Disordered Regions
One of the most impactful applications of pLDDT is rapid disorder prediction. Rather than running a separate disorder predictor, you can fold your protein with ESMFold and extract regions where pLDDT drops below 50. This is particularly useful when studying multi-domain proteins where you need to distinguish structured domains from disordered linkers.
Validating Predictions Before Downstream Analysis
Before using a predicted structure for docking, molecular dynamics, or any quantitative analysis, check the pLDDT in the regions that matter for your question. If you want to dock a ligand into a binding pocket, every residue lining that pocket should have pLDDT above 70 (ideally above 90). If key binding residues fall in low-confidence regions, the docking results will be unreliable regardless of how sophisticated the docking algorithm is.
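This check is easy to automate. Here is a small sketch, assuming you already have a per-residue pLDDT list and a set of 1-indexed residue numbers for the site you care about. The helper name `low_confidence_residues` and the example numbers are hypothetical.

```python
def low_confidence_residues(plddt_scores, residues, min_plddt=70.0):
    """Return (residue, score) pairs for residues at or below min_plddt.

    residues are 1-indexed positions, e.g. the residues lining a binding pocket.
    """
    flagged = []
    for res in residues:
        if not 1 <= res <= len(plddt_scores):
            raise IndexError(f"residue {res} is outside the sequence")
        score = plddt_scores[res - 1]
        if score <= min_plddt:  # "above 70" is required, so 70 itself is flagged
            flagged.append((res, score))
    return flagged

# Invented scores for a 6-residue toy protein and a 4-residue "pocket".
toy_scores = [95.0, 88.0, 72.0, 65.0, 40.0, 91.0]
pocket = [1, 3, 4, 6]

flagged = low_confidence_residues(toy_scores, pocket)
if flagged:
    print("Pocket residues too uncertain for docking:", flagged)
else:
    print("All pocket residues pass the confidence threshold.")
```

Raising `min_plddt` to 90 gives the stricter version of the check for analyses that depend on side-chain positions.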
Comparing Models
When you have predictions from multiple tools – say ESMFold for speed and AlphaFold2 for accuracy – pLDDT provides a common starting point for comparison. If both models assign high confidence to a region, you can be especially confident. If they disagree, the region warrants closer inspection. Because each model's scores are calibrated differently, compare confidence bands rather than raw values. For a detailed comparison of these models, see our guide to ESMFold.
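One simple way to do this is to label each residue by whether the two models land on the same side of a confidence cutoff. The sketch below assumes two equal-length pLDDT lists for the same sequence. The function name `compare_confidence` and the cutoff of 70 are illustrative choices, not an established protocol.

```python
def compare_confidence(scores_a, scores_b, cutoff=70.0):
    """Label each residue by whether two models agree on its confidence band."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both predictions must cover the same sequence")
    labels = []
    for a, b in zip(scores_a, scores_b):
        if a > cutoff and b > cutoff:
            labels.append("both_confident")
        elif a <= cutoff and b <= cutoff:
            labels.append("both_uncertain")
        else:
            labels.append("disagree")
    return labels

# Invented scores for a 4-residue toy example.
model_a = [92.0, 85.0, 60.0, 40.0]
model_b = [88.0, 65.0, 75.0, 35.0]

labels = compare_confidence(model_a, model_b)
to_inspect = [i + 1 for i, label in enumerate(labels) if label == "disagree"]
print("Residues to inspect more closely:", to_inspect)
```

Comparing bands instead of raw differences sidesteps the fact that one model may score systematically lower than another for the same region.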
Extracting pLDDT Scores from the SciRouter API
When you fold a protein using SciRouter's API, the response includes per-residue pLDDT scores alongside the predicted structure. Here is a complete example that folds a protein and analyzes its confidence profile:
import requests
API_KEY = "sk-sci-your-api-key"
BASE = "https://api.scirouter.ai/v1"
# Fold a short protein sequence
sequence = "MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGE"
response = requests.post(
    f"{BASE}/proteins/fold",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "sequence": sequence,
        "model": "esmfold",
    },
)
response.raise_for_status()  # fail early on an HTTP error instead of parsing an error body
result = response.json()
plddt_scores = result["plddt_scores"]
avg_plddt = result["average_plddt"]
print(f"Average pLDDT: {avg_plddt:.1f}")
print(f"Residues predicted: {len(plddt_scores)}")
# Classify regions by confidence (same bands as the color scale)
very_high = sum(1 for s in plddt_scores if s > 90)
high = sum(1 for s in plddt_scores if 70 < s <= 90)
low = sum(1 for s in plddt_scores if 50 < s <= 70)
very_low = sum(1 for s in plddt_scores if s <= 50)
print(f"Very high confidence (>90): {very_high} residues")
print(f"High confidence (70-90): {high} residues")
print(f"Low confidence (50-70): {low} residues")
print(f"Very low confidence (<=50): {very_low} residues")
Finding Disordered Regions Programmatically
You can use the pLDDT array to identify contiguous stretches of disorder automatically:
def find_disordered_regions(plddt_scores, threshold=50, min_length=5):
    """Find contiguous regions where pLDDT is below threshold."""
    regions = []
    start = None
    for i, score in enumerate(plddt_scores):
        if score < threshold:
            if start is None:
                start = i
        else:
            if start is not None and (i - start) >= min_length:
                regions.append((start + 1, i))  # 1-indexed, inclusive
            start = None
    # Handle region at the end of the sequence
    if start is not None and (len(plddt_scores) - start) >= min_length:
        regions.append((start + 1, len(plddt_scores)))
    return regions
# Using the pLDDT scores from the fold result
disordered = find_disordered_regions(plddt_scores)
for start, end in disordered:
    avg = sum(plddt_scores[start-1:end]) / (end - start + 1)
    print(f"Disordered region: residues {start}-{end} "
          f"(avg pLDDT: {avg:.1f})")
pLDDT Across Different Models
Different prediction models produce different pLDDT distributions for the same protein. This is important to understand when comparing results:
- AlphaFold2 – Tends to produce the highest pLDDT scores because it uses multiple sequence alignments (MSAs) that provide rich evolutionary context. Average pLDDT across the human proteome is approximately 80.
- ESMFold – Operates on single sequences without MSAs, resulting in slightly lower pLDDT scores (typically 5–15 points lower than AlphaFold2 for equivalent regions). The trade-off is dramatically faster inference – seconds rather than minutes.
- Boltz-2 – Designed for protein complexes and multi-chain predictions. pLDDT scores at interfaces between chains tend to be lower than for isolated chains, reflecting genuine uncertainty about binding geometry.
- OmegaFold – Similar single-sequence approach to ESMFold with comparable pLDDT distributions. Useful as an independent validation when ESMFold results are borderline.
For a comprehensive comparison of these tools, see our ESMFold vs AlphaFold comparison.
Common Mistakes When Interpreting pLDDT
Even experienced researchers misinterpret pLDDT scores. Here are the most common pitfalls:
- Treating low pLDDT as a model failure – Low scores in disordered regions are correct behavior. The model is telling you something real about the protein's biology.
- Comparing pLDDT across different models without calibration – A pLDDT of 75 from ESMFold is not equivalent to 75 from AlphaFold2. Always compare within the same model or account for systematic differences.
- Ignoring domain boundaries – pLDDT measures local confidence. Two domains can each have pLDDT above 90 while their relative orientation is completely wrong. Check the PAE matrix for inter-domain confidence.
- Using low-confidence regions for docking – If your target binding site includes residues below pLDDT 70, docking results are unreliable. Either use an experimental structure for that region or acknowledge the limitation.
- Averaging pLDDT as a single quality score – A protein with average pLDDT of 75 could have a well-folded core at 95 and disordered tails at 30. The average hides critical regional variation. Always examine the per-residue profile.
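The last pitfall is easy to see with numbers. The following sketch uses an invented 130-residue profile (a 90-residue core at 95 and 40 tail residues at 30) to show how a respectable-looking mean can hide a protein that is really split between two extremes. The helper name `band_counts` is hypothetical.

```python
def band_counts(scores):
    """Count residues in each of the four standard pLDDT confidence bands."""
    bands = {"very_high": 0, "high": 0, "low": 0, "very_low": 0}
    for s in scores:
        if s > 90:
            bands["very_high"] += 1
        elif s > 70:
            bands["high"] += 1
        elif s > 50:
            bands["low"] += 1
        else:
            bands["very_low"] += 1
    return bands

# Invented profile: well-folded core plus disordered tails.
profile = [95.0] * 90 + [30.0] * 40

mean_plddt = sum(profile) / len(profile)
print(f"Mean pLDDT: {mean_plddt:.1f}")
print("Residues per band:", band_counts(profile))
```

The mean falls in the "high confidence" band even though not a single residue in this profile actually scores between 70 and 90.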
Next Steps
pLDDT is your most important guide when working with predicted protein structures. By understanding what the scores mean and applying them systematically, you can separate trustworthy structural insights from unreliable noise.
To put this into practice, fold your protein of interest with ESMFold and examine the per-residue pLDDT profile. Use the code examples above to identify disordered regions and validate that the regions relevant to your research question fall within the confidence range you need.
Ready to start folding? Sign up for a free SciRouter API key and predict your first protein structure in seconds.