
Zero-Shot BRCA1 Variant Prediction with Evo 2: A Tutorial

A complete tutorial for scoring BRCA1 variants zero-shot with Evo 2. It covers loading the reference, defining variants, and interpreting scores.

SciRouter Team
April 11, 2026

BRCA1 is a natural sandbox for DNA variant effect prediction. It has thousands of annotated variants in ClinVar, a saturation genome editing study (Findlay et al., 2018) that measured the functional effect of nearly every possible single-nucleotide change in the exons encoding the RING and BRCT domains, and clear clinical relevance. This tutorial walks through how to use Evo 2 to score BRCA1 variants zero-shot: no training, no fine-tuning, just a few API calls.

By the end you will have a Python script that accepts a list of BRCA1 variants and returns Evo 2 likelihood-ratio scores, plus a simple way to compare those scores against ClinVar labels. The whole workflow runs through the SciRouter DNA Lab API with no local GPU required.

Warning
This tutorial is for research and educational use only. Do not use these scores for clinical diagnosis. Clinical variant interpretation requires ACMG-compliant review with orthogonal evidence.

Background: what we are measuring

Evo 2 is an autoregressive DNA language model trained on roughly 9 trillion base pairs of sequence. Because it was trained with next-token prediction, it assigns a probability to each nucleotide given the sequence that precedes it, and therefore a likelihood to any whole sequence. Swap one nucleotide for another, and the ratio of the alternate-sequence likelihood to the reference-sequence likelihood is a zero-shot score for how unusual the variant looks.

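Throughout this tutorial the log-likelihood ratio is log P(alt) - log P(ref), so a negative value means the model finds the alternate allele less likely than the reference. Large negative log-likelihood ratios flag variants that would violate patterns the model has learned from evolution. These tend to overlap with variants annotated as functionally disruptive in curated databases. That is the entire trick: no labels, no fine-tuning, just next-token likelihoods.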
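To make the score concrete, here is a minimal sketch of how a log-likelihood ratio falls out of per-position probabilities. The probabilities are made-up numbers for illustration only, not Evo 2 output. The sign convention (alternate minus reference) is an assumption, chosen to be consistent with "large negative means unusual".

```python
import math

def sequence_log_likelihood(token_probs):
    """Sum of log-probabilities the model assigned to each observed base."""
    return sum(math.log(p) for p in token_probs)

def log_likelihood_ratio(ref_probs, alt_probs):
    """log P(alt sequence) - log P(ref sequence); negative = alt is less likely."""
    return sequence_log_likelihood(alt_probs) - sequence_log_likelihood(ref_probs)

# Illustrative probabilities only (NOT real model output).
# The variant sits at the second position; in an autoregressive model it can
# also shift the probabilities of the bases that follow it.
ref_probs = [0.9, 0.8, 0.7]
alt_probs = [0.9, 0.1, 0.6]

llr = log_likelihood_ratio(ref_probs, alt_probs)
print(f"log-likelihood ratio: {llr:.3f}")
```

Because the variant changes the context for every downstream base, the ratio is computed over whole sequences rather than over the substituted position alone.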

Step 1 — grab the BRCA1 reference sequence

Evo 2 needs a reference sequence to score variants against. For BRCA1, you can fetch the canonical transcript from NCBI or Ensembl. A local FASTA file is easiest:

python
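from pathlib import Path

def load_fasta(path: str) -> str:
    """Read a single-record FASTA file and return its sequence in uppercase."""
    lines = Path(path).read_text().splitlines()
    # Skip header lines (starting with ">") and join the sequence lines.
    return "".join(line.strip() for line in lines if not line.startswith(">")).upper()

brca1_ref = load_fasta("BRCA1_NM_007294.fasta")
print(f"loaded {len(brca1_ref)} bases")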

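The exact file you use depends on which transcript you care about. For most clinical reporting, NM_007294 (the canonical BRCA1 transcript) is the usual choice. Keep in mind that a transcript FASTA contains only spliced exonic sequence, so if you need intronic or splice-site context, load the corresponding genomic region instead.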

Step 2 — define your variant list

Represent each variant as a position (0-indexed into the reference you loaded) plus an alternate allele. A helper dataclass keeps things tidy:

python
from dataclasses import dataclass

@dataclass
class Variant:
    position: int    # 0-indexed into the reference sequence
    ref: str         # reference base
    alt: str         # alternate base
    label: str = ""  # optional: ClinVar category for later comparison

variants = [
    Variant(position=1234, ref="C", alt="T", label="pathogenic"),
    Variant(position=2500, ref="A", alt="G", label="benign"),
    # ... plus however many you want to score
]
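Before sending anything to the API, it is worth checking that each variant agrees with the reference you loaded. The sketch below is one possible check, not part of the SciRouter API: `validate_variants` is a hypothetical helper, and the six-base reference is a toy stand-in for the real sequence.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    position: int    # 0-indexed into the reference sequence
    ref: str         # reference base
    alt: str         # alternate base
    label: str = ""

def validate_variants(ref_seq, variants):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    for v in variants:
        if not 0 <= v.position < len(ref_seq):
            problems.append(f"pos {v.position}: outside reference (length {len(ref_seq)})")
        elif ref_seq[v.position].upper() != v.ref.upper():
            problems.append(
                f"pos {v.position}: expected ref {v.ref}, reference has {ref_seq[v.position]}"
            )
        elif len(v.alt) != 1 or v.alt.upper() not in "ACGT":
            problems.append(f"pos {v.position}: alt {v.alt!r} is not a single A/C/G/T base")
        elif v.alt.upper() == v.ref.upper():
            problems.append(f"pos {v.position}: alt equals ref")
    return problems

# Toy reference for illustration only.
toy_ref = "ACGTAC"
toy_variants = [
    Variant(position=2, ref="G", alt="A"),   # fine
    Variant(position=3, ref="A", alt="C"),   # ref mismatch: reference has T
    Variant(position=10, ref="A", alt="C"),  # out of bounds
]
problems = validate_variants(toy_ref, toy_variants)
for p in problems:
    print(p)
```

A reference-base mismatch is the most common symptom of an off-by-one or wrong-strand coordinate, so failing early here is cheaper than debugging odd scores later.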

Step 3 — call the Evo 2 scoring endpoint

The DNA Lab endpoint accepts a reference sequence and a list of variants, and returns one score per variant:

python
import requests

API_URL = "https://scirouter-gateway-production.up.railway.app/v1/dna/evo2/variant-score"
API_KEY = "sk-sci-your-api-key-here"

def score_variants(ref_seq: str, variants: list[Variant], window: int = 2048):
    """Score a batch of variants with Evo 2 zero-shot."""
    payload = {
        "reference": ref_seq,
        "context_window": window,
        "variants": [
            {"position": v.position, "ref": v.ref, "alt": v.alt}
            for v in variants
        ],
    }
    headers = {"Authorization": f"Bearer {API_KEY}"}
    r = requests.post(API_URL, json=payload, headers=headers, timeout=300)
    r.raise_for_status()
    return r.json()["scores"]

scores = score_variants(brca1_ref, variants)
for v, s in zip(variants, scores):
    print(
        f"pos {v.position} {v.ref}>{v.alt} "
        f"log-ratio {s['log_likelihood_ratio']:.2f} "
        f"({v.label})"
    )

Each entry in the response contains the log-likelihood ratio, the context window used, and the raw per-token likelihoods if you asked for them. For most workflows, the log ratio is what you care about.
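One detail worth guarding against: `zip` silently truncates if the response holds fewer scores than the variants you sent. The sketch below pairs scores with variants defensively and sorts the most disfavored first. `pair_and_rank` is a hypothetical helper, the response is mocked, and it assumes scores come back in request order (as the loop above also does).

```python
def pair_and_rank(variants, scores):
    """Pair each variant with its log ratio and sort most negative first."""
    if len(variants) != len(scores):
        raise ValueError(
            f"sent {len(variants)} variants but received {len(scores)} scores"
        )
    paired = [(v, s["log_likelihood_ratio"]) for v, s in zip(variants, scores)]
    return sorted(paired, key=lambda item: item[1])

# Mocked data for illustration only; real scores come from the endpoint.
mock_variants = ["1234C>T", "2500A>G", "3100G>A"]
mock_scores = [
    {"log_likelihood_ratio": -4.2},
    {"log_likelihood_ratio": -0.3},
    {"log_likelihood_ratio": -1.7},
]

ranked = pair_and_rank(mock_variants, mock_scores)
for name, llr in ranked:
    print(f"{name}\t{llr:.2f}")
```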

Step 4 — interpret the scores

Three bands are a reasonable starting point for BRCA1:

  • Log ratio < -3: the variant is strongly disfavored by the model. High prior probability of functional impact.
  • -3 ≤ log ratio ≤ -1: uncertain region. The variant is somewhat unusual but not dramatically so. Prioritize for follow-up.
  • Log ratio > -1: the variant looks essentially neutral to the model. Lower prior probability of functional impact.

Note
These thresholds are heuristic and gene-dependent. Calibrate them against a labeled reference set for your specific use case before relying on them for prioritization.
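The three bands translate directly into a small helper. This is a sketch using the heuristic cutoffs above as defaults; the function name and band labels are illustrative, and the thresholds are parameters so you can swap in calibrated values.

```python
def band(log_ratio, strong=-3.0, neutral=-1.0):
    """Map a log-likelihood ratio to one of the three heuristic bands."""
    if log_ratio < strong:
        return "strongly disfavored"
    if log_ratio <= neutral:
        return "uncertain"
    return "near neutral"

# Illustrative values only.
for value in (-4.5, -3.0, -2.0, -1.0, -0.2):
    print(f"{value:>5}: {band(value)}")
```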

Step 5 — compare to ClinVar

ClinVar provides a pathogenicity label for many known BRCA1 variants. If you join your variant list with a ClinVar download, you can check how well the Evo 2 scores separate pathogenic from benign calls.

python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# One row per variant: coordinates, ClinVar label, and Evo 2 score.
df = pd.DataFrame([
    {
        "position": v.position,
        "ref": v.ref,
        "alt": v.alt,
        "clinvar": v.label,
        "evo2_log_ratio": s["log_likelihood_ratio"],
    }
    for v, s in zip(variants, scores)
])

# Quick separation check: score distribution per ClinVar category.
sns.boxplot(data=df, x="clinvar", y="evo2_log_ratio")
plt.axhline(-3, color="red", linestyle="--", label="-3 threshold")
plt.legend()
plt.savefig("brca1_evo2_vs_clinvar.png", dpi=150)

You should see pathogenic variants clustering toward more negative log ratios, benign variants near zero, and VUS (variants of uncertain significance) spread across the range. That spread on VUS is where the method is genuinely useful: it gives you a prioritization signal that lets you focus follow-up work on the most unusual VUS first.
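A box plot shows the separation visually; a single number makes it easier to compare runs. One common choice is AUROC, sketched here in plain Python as the probability that a randomly chosen pathogenic variant scores lower than a randomly chosen benign one. The scores below are invented for illustration and are not benchmark results.

```python
def auroc(pathogenic_scores, benign_scores):
    """Fraction of (pathogenic, benign) pairs where the pathogenic score is lower.

    Ties count as half. 1.0 = perfect separation, 0.5 = no better than chance.
    """
    if not pathogenic_scores or not benign_scores:
        raise ValueError("need at least one score in each class")
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p < b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(benign_scores))

# Invented log ratios for illustration only.
pathogenic = [-5.1, -3.4, -0.8]
benign = [-1.2, -0.4, -0.1]

print(f"AUROC: {auroc(pathogenic, benign):.3f}")
```

The pairwise loop is quadratic, which is fine for a few thousand variants; for larger sets a rank-based implementation from a statistics library is the usual choice.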

Workflow extensions

Scoring all single-nucleotide variants in an exon

For a systematic scan, generate every possible single-nucleotide substitution across an exon and score them all. Evo 2's batch-friendly API makes this tractable for reasonably sized regions.
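Generating the full substitution set is a short loop. The sketch below yields plain `(position, ref, alt)` tuples for a half-open range of 0-indexed positions; `all_snvs` is an illustrative helper, and the five-base sequence stands in for a real exon.

```python
def all_snvs(ref_seq, start, end):
    """Yield (position, ref, alt) for every single-base substitution in [start, end)."""
    for pos in range(start, end):
        ref = ref_seq[pos].upper()
        if ref not in "ACGT":
            continue  # skip ambiguous bases such as N
        for alt in "ACGT":
            if alt != ref:
                yield (pos, ref, alt)

# Toy sequence for illustration only.
toy_exon = "ACGTN"
snvs = list(all_snvs(toy_exon, 0, len(toy_exon)))
print(f"{len(snvs)} substitutions")
print(snvs[:3])
```

Each unambiguous position contributes exactly three substitutions, so an exon of n bases produces 3n variants to score.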

Combining with protein-level scores

For missense variants, pair Evo 2 with ESM-2 scores on the translated protein. Variants that are unusual at both the DNA and protein levels are especially strong candidates for functional follow-up.
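One simple way to combine the two signals is to average ranks, which sidesteps the fact that the two models score on different scales. This is only a sketch: it assumes you already have one protein-level score per variant, that both scores are oriented so more negative means more unusual, and all numbers are invented.

```python
def ranks(scores):
    """Rank scores from 0 (most negative) upward; ties keep input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0] * len(scores)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result

def combined_rank(dna_scores, protein_scores):
    """Average the DNA-level and protein-level ranks for each variant."""
    if len(dna_scores) != len(protein_scores):
        raise ValueError("score lists must be the same length")
    return [
        (d + p) / 2
        for d, p in zip(ranks(dna_scores), ranks(protein_scores))
    ]

# Invented scores for three missense variants (illustration only).
dna = [-4.0, -0.5, -2.0]
protein = [-6.0, -1.0, -0.2]

combined = combined_rank(dna, protein)
print(combined)  # lower combined rank = unusual at both levels
```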

Scoring splice variants

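Evo 2 can pick up splice-disrupting variants implicitly, because the likelihood ratio is sensitive to splice-site patterns, provided the reference you score against is genomic sequence that includes the flanking introns. For stronger splice-specific scores, combine with a dedicated splice model.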

Common pitfalls

  • Wrong coordinate system. Make sure your variant positions are 0-indexed and match the reference sequence you loaded. ClinVar and HGVS coordinates are 1-based, so convert them before scoring. A one-base shift will silently produce wrong scores for every variant.
  • Short context windows. Too short a context can miss regulatory effects. Aim for at least a few hundred bases on each side of the variant.
  • Interpreting scores as diagnoses. Evo 2 scores are priors, not labels. Use them to rank candidates.
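To see what "a few hundred bases on each side" means in coordinates, here is a sketch of cutting a centered window and applying the substitution inside it. The scoring request shown earlier passes a `context_window` value to the endpoint, so you may never need to do this yourself; the helper names are illustrative and the ten-base sequence is a toy.

```python
def context_window(ref_seq, position, flank=500):
    """Return (window, offset): the slice around `position` and the variant's index in it."""
    start = max(0, position - flank)
    end = min(len(ref_seq), position + flank + 1)
    return ref_seq[start:end], position - start

def apply_snv(window, offset, ref, alt):
    """Substitute `alt` at `offset`, checking the reference base first."""
    if window[offset] != ref:
        raise ValueError(f"expected {ref} at offset {offset}, found {window[offset]}")
    return window[:offset] + alt + window[offset + 1:]

# Toy reference for illustration only.
toy_ref = "ACGTACGTAC"
window, offset = context_window(toy_ref, position=4, flank=3)
alt_window = apply_snv(window, offset, ref="A", alt="C")
print(window, offset, alt_window)
```

Near the ends of the reference the window is clipped rather than padded, so the variant is no longer centered; the returned offset keeps track of where it actually sits.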

Running the whole workflow

The DNA Lab workspace wraps this scoring endpoint in a browser UI if you want to explore before scripting. The Evo 2 tool page documents the exact request and response schemas.

Bottom line

Computational BRCA1 variant scoring used to mean training a dedicated model on labeled data. With Evo 2, zero-shot scoring is a handful of API calls. The likelihood-ratio output is intuitive, the API is simple, and the separation against ClinVar labels is strong enough to be useful for prioritization. For a BRCA1-related research project, it is an inexpensive first-pass method.


Frequently Asked Questions

Why BRCA1 for this tutorial?

BRCA1 is the canonical cancer-risk gene with thousands of annotated variants in ClinVar, a saturation genome editing study by Findlay et al. (2018), and strong clinical relevance. That combination makes it a strong benchmark for a zero-shot method: plenty of ground truth to compare against.

What exactly does Evo 2 return for each variant?

A log-likelihood ratio comparing the alternate allele to the reference allele, given the surrounding genomic context. Large negative values mean the model finds the alternate allele much less likely than the reference, which suggests the variant is unusual relative to evolution and likely to affect function. Values near zero suggest the variant is close to neutral from the model's perspective.

How accurate are the zero-shot scores on BRCA1?

In published and community benchmarks, Evo 2 zero-shot scores correlate well with the functional readouts from the Findlay et al. saturation genome editing study and with ClinVar pathogenicity labels. They are not perfect (missense variants in the BRCT domain are the hardest), but they are far from random.

Can I use this to diagnose a patient?

No. This is a research workflow. Clinical variant interpretation requires orthogonal evidence, professional review, and adherence to ACMG guidelines. Use Evo 2 scores to prioritize variants for further analysis, not to issue diagnoses.

How long does it take to score a single variant?

On the hosted endpoint, a single BRCA1 variant takes a few seconds once the model is warm. Batches of hundreds of variants run in well under a minute.

Do I need the full BRCA1 sequence as context?

A window of a few hundred to a few thousand base pairs centered on the variant is usually enough. The endpoint accepts longer context if you want to capture more regulatory information, at the cost of compute.
