Why Predict Protein Stability?
Every protein engineering project eventually hits the same bottleneck: you have a list of candidate mutations and need to know which ones will make your protein more stable, which will be neutral, and which will cause it to misfold. Traditionally, this meant weeks of cloning, expression, and thermal shift assays to test each variant.
Computational stability prediction flips this workflow. Instead of testing mutations blind, you score them in silico first, rank by predicted effect, and only take the most promising candidates to the lab. A typical directed evolution campaign might start with 1,000 candidate mutations and use stability prediction to narrow to 20–50 variants for experimental validation – a 20x reduction in wet lab effort.
In this guide, we walk through the fundamentals of protein stability prediction, introduce ThermoMPNN as a practical tool, and demonstrate hands-on scoring of 10 real mutations using the SciRouter API.
DDG: The Language of Protein Stability
Protein stability is quantified by the Gibbs free energy of folding (DG). A protein folds spontaneously when DG is negative – meaning the folded state is thermodynamically favored over the unfolded state. When we introduce a mutation, the change in folding free energy is called DDG (delta delta G):
- DDG < 0 kcal/mol: Stabilizing mutation – the mutant folds more tightly than wild type
- DDG = 0 kcal/mol: Neutral mutation – no effect on stability
- DDG > 0 kcal/mol: Destabilizing mutation – the mutant is less stable than wild type
Most random mutations are destabilizing. In a typical protein, roughly 70% of single-point mutations have a positive DDG (destabilizing), 20% are neutral, and only 10% are stabilizing. This is why computational pre-screening is so valuable – it helps you find the rare stabilizing mutations without testing all possibilities experimentally.
ThermoMPNN: Structure-Based Stability Prediction
ThermoMPNN is a deep learning model for predicting mutation stability effects. It builds on the ProteinMPNN architecture – the same model used for protein sequence design – and fine-tunes it on experimental DDG data from databases like ProTherm and Megascale.
The model takes a protein structure (PDB format) and a mutation specification as input, then predicts the DDG in kcal/mol. Because it operates on 3D structure rather than sequence alone, it captures the geometric context of each mutation – whether the residue is buried or exposed, what contacts it makes, and how the local backbone geometry constrains substitutions.
How ThermoMPNN Works
- Input: Protein 3D structure (PDB) + mutation(s) specified as position and substitution
- Encoding: Structure is encoded as a graph with residues as nodes and spatial contacts as edges
- Message passing: Graph neural network propagates information between neighboring residues
- Prediction: A regression head outputs the predicted DDG value for the mutation
- Speed: Under 1 second per mutation on CPU, enabling rapid screening of large mutation libraries
Hands-On: Scoring 10 Mutations on T4 Lysozyme
T4 lysozyme (PDB: 2LZM) is one of the most extensively studied proteins for stability mutations, with hundreds of experimentally measured DDG values available in the literature. This makes it an ideal test case for validating computational predictions against known results.
We will score 10 well-characterized mutations and compare our predictions to the published experimental values.
Step 1: Get the Structure
First, we need a 3D structure. We can either download the crystal structure from the PDB or predict it with ESMFold. For this example, we will use ESMFold to demonstrate the full computational workflow:
import scirouter
client = scirouter.SciRouter()
# T4 Lysozyme sequence
t4l_seq = (
"MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAKSELD"
"KAIGRNCNGVITKDEAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAV"
"RRAALINMVFQMGETGVAGFTNSLRMLQQKRWDEAAVNLAKSRWYND"
"QTPNRAKRVITTFRTGTWDAYKNL"
)
# Predict structure
fold_job = client.proteins.fold(sequence=t4l_seq, model="esmfold")
fold_result = client.proteins.fold_result(fold_job.job_id, poll=True)
print(f"Mean pLDDT: {fold_result.mean_plddt:.1f}")
pdb_string = fold_result.pdbStep 2: Score Individual Mutations
Now we score 10 well-known T4 lysozyme mutations. These include both stabilizing mutations (like the engineered cavity-filling mutations) and destabilizing mutations (like the alanine-scanning mutations that remove side chain contacts):
# Define mutations to score
# Format: (wild_type_aa, position, mutant_aa, expected_ddg_kcal_mol)
mutations = [
("L", 99, "A", 5.1), # Destabilizing: removes buried leucine
("A", 98, "V", -1.2), # Stabilizing: fills cavity
("T", 157, "I", -1.4), # Stabilizing: hydrophobic packing
("S", 117, "A", 0.3), # Mildly destabilizing
("A", 42, "G", 2.8), # Destabilizing: glycine flexibility
("V", 149, "I", -0.6), # Mildly stabilizing
("I", 3, "A", 3.9), # Destabilizing: core disruption
("L", 133, "A", 4.2), # Destabilizing: removes buried contact
("M", 102, "L", -0.3), # Near-neutral
("A", 82, "P", 1.7), # Destabilizing: proline strain
]
print(f"{'Mutation':<12} {'Predicted DDG':>14} {'Experimental':>13} {'Match':>6}")
print("-" * 50)
for wt, pos, mut, exp_ddg in mutations:
result = client.design.stability(
pdb=pdb_string,
mutations=[f"{wt}{pos}{mut}"],
)
pred_ddg = result.ddg
direction_match = (pred_ddg > 0 and exp_ddg > 0) or (pred_ddg < 0 and exp_ddg < 0)
print(
f"{wt}{pos}{mut:<8} {pred_ddg:>+10.2f} {exp_ddg:>+9.2f} "
f"{'Yes' if direction_match else 'No':>5}"
)Step 3: Interpret the Results
When interpreting ThermoMPNN stability predictions, focus on these practical guidelines:
- DDG below -1.0: Strongly predicted as stabilizing. High confidence, good candidate for experimental validation.
- DDG between -1.0 and 0: Mildly stabilizing or neutral. Could go either way experimentally. Test if the mutation is otherwise beneficial (e.g., improves solubility).
- DDG between 0 and +1.0: Mildly destabilizing. May be tolerable if the protein is already very stable.
- DDG above +1.0: Likely destabilizing. Avoid unless there is a strong functional reason to make this mutation.
- DDG above +3.0: Severely destabilizing. Will likely cause misfolding or aggregation.
Scanning an Entire Position
A common workflow in protein engineering is saturation mutagenesis scanning – testing all 20 amino acid substitutions at a single position. ThermoMPNN makes this trivial computationally:
amino_acids = "ACDEFGHIKLMNPQRSTVWY"
position = 99
wild_type = "L"
results = []
for aa in amino_acids:
if aa == wild_type:
continue
result = client.design.stability(
pdb=pdb_string,
mutations=[f"{wild_type}{position}{aa}"],
)
results.append({"mutation": f"{wild_type}{position}{aa}", "ddg": result.ddg})
# Sort by DDG (most stabilizing first)
results.sort(key=lambda x: x["ddg"])
print(f"Saturation scan at position {position} (wild type: {wild_type})")
print(f"{'Mutation':<10} {'DDG (kcal/mol)':>15}")
print("-" * 28)
for r in results:
marker = " <-- stabilizing" if r["ddg"] < -0.5 else ""
print(f"{r['mutation']:<10} {r['ddg']:>+12.2f}{marker}")Combining Stability with Sequence Design
ThermoMPNN stability predictions pair naturally with ProteinMPNN sequence design. A typical workflow is:
- Design: Use ProteinMPNN to generate candidate sequences for a target backbone structure
- Score: Run ThermoMPNN on each designed sequence to predict stability relative to wild type
- Filter: Keep only designs with predicted DDG below a threshold (e.g., -0.5 kcal/mol for each mutation)
- Validate: Predict structures of top candidates with ESMFold to verify they fold correctly
# Step 1: Design sequences with ProteinMPNN
design_result = client.design.proteinmpnn(
pdb=pdb_string,
num_designs=50,
temperature=0.1,
)
# Step 2: Score each design for stability
scored_designs = []
for design in design_result.designs:
# Identify mutations relative to wild type
mutations = []
for i, (wt, mut) in enumerate(zip(t4l_seq, design.sequence)):
if wt != mut:
mutations.append(f"{wt}{i+1}{mut}")
if not mutations:
continue
stability = client.design.stability(
pdb=pdb_string,
mutations=mutations,
)
scored_designs.append({
"sequence": design.sequence,
"mutations": mutations,
"total_ddg": stability.ddg,
"recovery": design.sequence_recovery,
})
# Step 3: Filter and rank
stable_designs = [d for d in scored_designs if d["total_ddg"] < 0]
stable_designs.sort(key=lambda x: x["total_ddg"])
print(f"Stabilizing designs: {len(stable_designs)} / {len(scored_designs)}")
for d in stable_designs[:5]:
print(f" DDG: {d['total_ddg']:+.2f} kcal/mol, "
f"{len(d['mutations'])} mutations, "
f"recovery: {d['recovery']:.1%}")Combining Stability with Solubility
A mutation that improves thermostability but causes the protein to aggregate in solution is not useful in practice. This is a common failure mode – hydrophobic core-packing mutations often improve DDG but can expose hydrophobic surface area that promotes aggregation.
The solution is to combine stability prediction with solubility prediction. Run both ThermoMPNN and SoluProt on your candidate mutations and only advance variants that score well on both metrics:
# Score mutations for both stability and solubility
candidates = [
("A", 98, "V"),
("T", 157, "I"),
("V", 149, "I"),
("M", 102, "L"),
("S", 117, "V"),
]
print(f"{'Mutation':<12} {'DDG':>8} {'Solubility':>11} {'Verdict':>10}")
print("-" * 45)
for wt, pos, mut in candidates:
# Stability
stab = client.design.stability(
pdb=pdb_string,
mutations=[f"{wt}{pos}{mut}"],
)
# Solubility (on mutant sequence)
mutant_seq = t4l_seq[:pos-1] + mut + t4l_seq[pos:]
sol = client.design.solubility(sequence=mutant_seq)
verdict = "PASS" if stab.ddg < 0 and sol.score > 0.5 else "REVIEW"
print(f"{wt}{pos}{mut:<8} {stab.ddg:>+7.2f} {sol.score:>10.2f} {verdict:>10}")Common Pitfalls in Stability Prediction
1. Ignoring Structural Context
Stability predictions are only as good as the input structure. If your structure has low confidence in the region around the mutation site (pLDDT below 70), the DDG prediction will be unreliable. Always check the local confidence score before trusting a stability prediction.
2. Assuming Additivity for Multi-Mutants
Summing individual DDG values for multi-mutant designs works as a rough approximation, but can fail when mutations are structurally coupled. Two mutations that each stabilize by -1.0 kcal/mol individually might only stabilize by -1.2 kcal/mol when combined (negative epistasis) or by -2.5 kcal/mol (positive epistasis). Experimental validation of multi-mutant combinations is essential.
3. Over-Optimizing for Stability Alone
Maximally stable proteins are not always the most useful. Over-stabilization can reduce conformational flexibility needed for catalysis, reduce solubility, or impair protein-protein interactions. Target a modest stability improvement (DDG of -2 to -5 kcal/mol total) rather than engineering a thermophilic brick.
Next Steps
Stability prediction is one component of a complete protein engineering toolkit. To build a full optimization pipeline, combine it with:
- ESMFold for structure prediction of designed variants
- ProteinMPNN for automated sequence design
- Solubility prediction to ensure your stabilized proteins will express
Sign up for a free SciRouter API key and start scoring mutations today. 500 free credits per month – enough to screen hundreds of candidate mutations.