
DNA Foundation Models in 2026: Evo, Caduceus, Nucleotide Transformer

The complete landscape of DNA foundation models in 2026. Evo 2, Caduceus, Nucleotide Transformer, HyenaDNA, DNABERT.

SciRouter Team
April 11, 2026
14 min read

DNA foundation models have gone from a handful of academic projects to a real field in a very short time. In 2026 there are at least half a dozen serious models competing for the title of “best DNA language model,” and the right choice depends as much on your compute budget and task as on raw capability. This guide maps out the landscape, compares the major contenders, and gives you a practical framework for picking one.

Evo 2 is the most prominent of the bunch and is available through the SciRouter DNA Lab, but it is only one piece of a broader picture that includes Caduceus, Nucleotide Transformer, HyenaDNA, DNABERT, and several specialized derivatives.

Note
A single “best” model does not exist for every task. The useful framing is: what am I trying to do, how long is my context, and what compute can I afford? Pick accordingly.

The five models you need to know

1. Evo 2 — the scale champion

Evo 2 was released by the Arc Institute in early 2025 as the successor to the original Evo. It was trained on roughly 9 trillion base pairs spanning all domains of life, handles a context window of up to one million tokens, and ships at parameter counts up to 40 billion. Architecturally it is an autoregressive model built on StripedHyena 2, a hybrid of convolutional and attention operators, and it is suited for variant effect prediction, regulatory annotation, and generative sequence design.

Strengths: zero-shot variant scoring, long-range regulatory effects, generative design. Weaknesses: compute cost. Evo 2 at full context needs serious GPU resources, which is why hosted APIs exist.

2. Caduceus — the efficient long-context model

Caduceus is a DNA foundation model built on Mamba-style state-space layers instead of attention. The key idea is that state-space models scale linearly with sequence length while attention scales quadratically, which makes very long genomic contexts tractable without enormous compute budgets.

Strengths: long context at low compute cost, reverse-complement equivariance baked in, efficient inference. Weaknesses: smaller ecosystem than the transformer-based models, fewer pretrained variants.

3. Nucleotide Transformer — the multi-species baseline

Nucleotide Transformer (from InstaDeep) was one of the first transformer-based DNA models trained on a broad multi-species corpus. It helped establish the pattern for DNA language model pretraining and fine-tuning, and it is still a common starting point for task-specific fine-tuning.

Strengths: strong baseline, well-documented, good fine-tuning recipes. Weaknesses: shorter context window than Evo 2 or Caduceus, smaller parameter count.

4. HyenaDNA — the long-context pioneer

HyenaDNA was among the first DNA models to demonstrate that context could be pushed to hundreds of thousands and ultimately up to one million base pairs, at single-nucleotide resolution, by using Hyena operators in place of attention. It paved the way for Caduceus and the long-context generation of models more broadly.

Strengths: very long context at moderate cost, simple to run. Weaknesses: zero-shot performance trails the newer models on most benchmarks.

5. DNABERT — the historical baseline

DNABERT is a BERT-style masked language model trained on k-mer tokens of human DNA. It was one of the first DNA transformers and defined the early baseline for the field. It is no longer state-of-the-art but remains a useful cheap comparison and a reasonable starting point for small-scale tasks.

Strengths: small, cheap, well-understood. Weaknesses: short context, k-mer tokenization, surpassed on most benchmarks.
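For intuition, k-mer tokenization just slides a window of length k over the sequence. A minimal sketch (the real DNABERT tokenizer also maps k-mers to a fixed vocabulary and adds special tokens):

```python
def kmer_tokenize(seq: str, k: int = 6) -> list[str]:
    """Split a DNA sequence into overlapping k-mer tokens, DNABERT-style.

    Note that a single-nucleotide variant changes k overlapping tokens
    at once, which is one reason k-mer vocabularies complicate variant
    scoring compared with single-nucleotide tokens.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

print(kmer_tokenize("ACGTAC", k=3))  # ['ACG', 'CGT', 'GTA', 'TAC']
```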

How they differ architecturally

Three architectural axes separate these models from each other:

  • Attention vs state-space. Nucleotide Transformer and DNABERT use full attention; Evo 2 uses StripedHyena 2, a hybrid that mixes attention with convolutional operators; Caduceus and HyenaDNA use attention-free alternatives. Efficient alternatives win on long contexts; attention still wins on raw expressivity at shorter contexts.
  • Autoregressive vs masked. Evo 2 and HyenaDNA are autoregressive; DNABERT, Nucleotide Transformer, and Caduceus are masked. Autoregressive models are better for generation and for likelihood-ratio variant scoring. Masked models are better for feature extraction.
  • Tokenization. DNABERT uses k-mer tokens. Modern models increasingly use single-nucleotide tokens, which are cleaner for variant scoring and do not require aligning token boundaries with variant positions.
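The likelihood-ratio scoring pattern mentioned above can be sketched with a toy scorer. Here a trivial dinucleotide Markov model stands in for a real autoregressive DNA LM such as Evo 2 (with a real model you would sum its per-token log-likelihoods); only the log P(alt) − log P(ref) arithmetic is the point:

```python
import math

# Toy transition probabilities standing in for a real autoregressive DNA LM.
TRANS = {
    ("A", "C"): 0.4, ("C", "G"): 0.5, ("G", "T"): 0.3, ("T", "A"): 0.2,
}

def log_likelihood(seq: str) -> float:
    """Sum log P(next base | previous base) under the toy model."""
    total = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        total += math.log(TRANS.get((prev, nxt), 0.05))  # 0.05 = smoothing floor
    return total

def variant_llr(ref_seq: str, pos: int, alt_base: str) -> float:
    """Log-likelihood ratio: log P(alt) - log P(ref).

    Negative scores mean the model finds the variant sequence less
    likely, i.e. the variant looks more disruptive.
    """
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return log_likelihood(alt_seq) - log_likelihood(ref_seq)

print(variant_llr("ACGTA", pos=2, alt_base="A"))  # negative: disfavored variant
```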

How they differ by training corpus

Training corpus drives zero-shot capability as much as architecture does. Evo 2's trillions of base pairs give it exposure to a huge slice of evolution, which translates into stronger variant effect prediction in regions the model has never seen before. Smaller corpora produce models that are still useful but narrower in their zero-shot capabilities.

Use-case map

Variant effect prediction

Evo 2 is the default. AlphaGenome, DeepMind's supervised sequence-to-function model, is competitive on regulatory variants. For coding variants, pair with a protein-level model like ESM-2. DNABERT and Nucleotide Transformer are weaker options here.

Regulatory element annotation

AlphaGenome shines because it was trained with functional supervision. Evo 2's per-position likelihoods provide a good second opinion. Caduceus's long context makes it useful for enhancer-gene linking.

Sequence generation and design

Evo 2 is the clear leader, because autoregressive generation is a natural capability of its training objective. HyenaDNA can generate but is weaker at complex conditional generation.
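Autoregressive generation itself is simple to illustrate: sample one base, append it to the context, repeat. The sketch below uses a made-up next-base distribution in place of a real model (Evo 2 computes these conditionals from the full preceding context); the sampling loop is the part that carries over:

```python
import random

# Hypothetical next-base distribution standing in for a real DNA LM.
def next_base_probs(context: str) -> dict[str, float]:
    # Toy rule: mildly prefer alternating purines and pyrimidines.
    if context and context[-1] in "AG":
        return {"A": 0.1, "C": 0.4, "G": 0.1, "T": 0.4}
    return {"A": 0.4, "C": 0.1, "G": 0.4, "T": 0.1}

def generate(prompt: str, n_tokens: int, seed: int = 0) -> str:
    """Sample one base at a time, feeding each choice back as context."""
    rng = random.Random(seed)
    seq = prompt
    for _ in range(n_tokens):
        probs = next_base_probs(seq)
        bases, weights = zip(*probs.items())
        seq += rng.choices(bases, weights=weights, k=1)[0]
    return seq

print(generate("ACGT", n_tokens=12))
```

Conditional design (e.g. "generate a plausible promoter for this gene") works the same way, with the conditioning sequence supplied as the prompt.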

Population-scale scoring

DNABERT or Nucleotide Transformer for cheap baseline scoring, Caduceus for longer-range effects at moderate cost, Evo 2 for high-quality scores when compute allows.

Single-species deep analysis

Fine-tuned Nucleotide Transformer or HyenaDNA remain competitive when you have enough labeled data to specialize the model. Evo 2 can also be fine-tuned but is usually overkill for species-specific tasks.

Compute and accessibility

  • DNABERT, HyenaDNA, Nucleotide Transformer: run on a single GPU. Sometimes even on a strong CPU for small batches.
  • Caduceus: moderate GPU requirements thanks to state-space efficiency.
  • AlphaGenome: serious GPU requirements but tractable for most research groups.
  • Evo 2: multi-GPU setup for full capability. Hosted APIs are the practical path for most users.

A practical decision guide

If compute is abundant…

Default to Evo 2 for almost anything. It is the richest model and produces the strongest zero-shot outputs across tasks.

If compute is limited…

Use a hosted Evo 2 endpoint for headline analyses and fall back to Caduceus or HyenaDNA for anything you need to run yourself.

If your context is short…

Nucleotide Transformer is still a solid choice. The long-context advantages of Evo 2 and Caduceus disappear when your variants live in a small window.

If you need generation…

Use Evo 2. No other DNA model has the same generation capabilities today.

The role of SciRouter

SciRouter hosts Evo 2 behind the DNA Lab endpoint so you can call it from a laptop without provisioning GPUs. See the Evo 2 tool page for the request schema. Other DNA models in the list are not currently hosted, but the gateway pattern means they can be added over time without changing your client code.
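The general shape of a gateway call looks like the sketch below. The endpoint URL and every field name here are placeholders for illustration, not the documented schema — check the Evo 2 tool page for the real request format before using this:

```python
import json
import urllib.request

# Hypothetical endpoint and payload shape -- consult the Evo 2 tool
# page for the actual schema and authentication details.
ENDPOINT = "https://api.scirouter.example/v1/dna-lab/evo2"

payload = {
    "task": "score_variant",
    "sequence": "ACGT" * 16,   # reference window around the variant
    "position": 32,            # 0-based offset of the variant
    "alt": "T",
}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <YOUR_API_KEY>",
    },
)
# response = urllib.request.urlopen(request)  # uncomment with real credentials
print(json.dumps(payload))
```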

Bottom line

The DNA foundation model landscape in 2026 is finally mature enough to have genuine options. Evo 2 leads on capability, Caduceus on efficiency, AlphaGenome on regulatory annotation, Nucleotide Transformer and HyenaDNA on accessible baselines, and DNABERT remains the cheap fallback. Match the model to your task and your compute budget, and you will spend less time fighting the tools and more time doing the biology.


Frequently Asked Questions

How many DNA foundation models are worth tracking in 2026?

About five to seven, depending on how strictly you define foundation model. Evo 2, Caduceus, Nucleotide Transformer, HyenaDNA, DNABERT, AlphaGenome, and the newer species-specific models like MetaProteome are the ones that come up in most conversations.

What is the single best DNA model to learn first?

Evo 2 is the richest starting point because it combines scale, long context, and autoregressive capabilities. If you only learn one model in detail, learn how to use Evo 2 for variant scoring and sequence generation.

Are DNA foundation models better than classical tools?

For zero-shot variant effect prediction and genome-wide functional annotation, yes, foundation models have surpassed most classical sequence-based tools. For specific well-studied tasks like splice site prediction, dedicated classical models can still compete.

What is Caduceus and why is it different?

Caduceus is a DNA model built on Mamba-style state-space layers rather than attention. It was one of the first to show that efficient sequence models could handle genome-scale contexts without the cost of full attention. It remains a strong choice when compute is tight.

What happened to DNABERT?

DNABERT is still used as a baseline and for small-scale tasks where compute is limited, but it has been thoroughly surpassed on most benchmarks by the newer generation. Think of it as the BERT of DNA models — historically important, still useful, but no longer the frontier.

Which model is easiest to run myself?

HyenaDNA and DNABERT are the lightest. Evo 2 at its full size is impractical on anything smaller than multi-GPU A100 setups. For Evo 2 specifically, a hosted API is the easiest path.
