DNA foundation models have gone from a handful of academic projects to a real field in a very short time. In 2026 there are at least half a dozen serious models competing for the title of “best DNA language model,” and the right choice depends as much on your compute budget and task as on raw capability. This guide maps out the landscape, compares the major contenders, and gives you a practical framework for picking one.
Evo 2 is the most prominent of the bunch and is available through the SciRouter DNA Lab, but it is only one piece of a broader picture that includes Caduceus, Nucleotide Transformer, HyenaDNA, DNABERT, several specialized derivatives, and adjacent supervised systems such as DeepMind's AlphaGenome.
The five models you need to know
1. Evo 2 — the scale champion
Evo 2 was released by the Arc Institute in early 2025 as a successor to the original Evo model. It was trained on roughly 9 trillion base pairs spanning all domains of life, with a context window near one million tokens and parameter counts up to 40 billion. Architecturally it is an autoregressive model built on StripedHyena 2, a hybrid that interleaves Hyena convolution operators with attention, and it is suited for variant effect prediction, regulatory annotation, and generative sequence design.
Strengths: zero-shot variant scoring, long-range regulatory effects, generative design. Weaknesses: compute cost. Evo 2 at full context needs serious GPU resources, which is why hosted APIs exist.
2. Caduceus — the efficient long-context model
Caduceus is a DNA foundation model built on Mamba-style state-space layers instead of attention. The key idea is that state-space models scale linearly with sequence length while attention scales quadratically, which makes very long genomic contexts tractable without enormous compute budgets.
Strengths: long context at low compute cost, reverse-complement equivariance baked in, efficient inference. Weaknesses: smaller ecosystem than the transformer-based models, fewer pretrained variants.
3. Nucleotide Transformer — the multi-species baseline
Nucleotide Transformer (from InstaDeep) was one of the first transformer-based DNA models trained on a broad multi-species corpus. It helped establish the pattern for DNA language model pretraining and fine-tuning, and it is still a common starting point for task-specific fine-tuning.
Strengths: strong baseline, well-documented, good fine-tuning recipes. Weaknesses: shorter context window than Evo 2 or Caduceus, smaller parameter count.
4. HyenaDNA — the long-context pioneer
HyenaDNA was among the first DNA models to demonstrate that you could push context to hundreds of thousands or millions of base pairs using Hyena operators in place of attention. It paved the way for Caduceus and, more broadly, for the current generation of long-context models.
Strengths: very long context at moderate cost, simple to run. Weaknesses: zero-shot performance trails the newer models on most benchmarks.
5. DNABERT — the historical baseline
DNABERT is a BERT-style masked language model trained on k-mer tokens of human DNA. It was one of the first DNA transformers and defined the early baseline for the field. It is no longer state-of-the-art but remains a useful cheap comparison and a reasonable starting point for small-scale tasks.
Strengths: small, cheap, well-understood. Weaknesses: short context, k-mer tokenization, surpassed on most benchmarks.
How they differ architecturally
Three architectural axes separate these models from each other:
- Attention vs state-space. Nucleotide Transformer and DNABERT are attention-based transformers; Caduceus and HyenaDNA replace attention with efficient alternatives (state-space layers and Hyena operators, respectively); Evo 2 sits in between with a hybrid architecture that interleaves Hyena convolutions with attention. Efficient alternatives win on long contexts; attention still wins on raw expressivity at shorter contexts.
- Autoregressive vs masked. Evo 2 and HyenaDNA are autoregressive; DNABERT, Nucleotide Transformer, and Caduceus are masked. Autoregressive models are better for generation and for likelihood-ratio variant scoring. Masked models are better for feature extraction.
- Tokenization. DNABERT uses k-mer tokens. Modern models increasingly use single-nucleotide tokens, which are cleaner for variant scoring and do not require alignment of token boundaries with variant positions.
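The tokenization point is easiest to see with a small sketch (plain Python, no model required): a single-nucleotide variant changes exactly one single-nucleotide token but up to k overlapping k-mer tokens. The `kmerize` helper below is illustrative, not taken from any model's codebase.

```python
def kmerize(seq, k=6):
    """Overlapping k-mer tokens, as used by DNABERT-style models."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

ref = "ACGTACGTACGT"
alt = ref[:6] + "A" + ref[7:]  # single substitution at position 6 (G -> A)

# Single-nucleotide tokenization: exactly one token differs.
snv_diff = sum(r != a for r, a in zip(ref, alt))

# k-mer tokenization: every k-mer overlapping the variant changes.
kmer_diff = sum(r != a for r, a in zip(kmerize(ref), kmerize(alt)))

print(snv_diff)   # 1
print(kmer_diff)  # 6
```

This is why single-nucleotide tokenization makes variant scoring cleaner: the model's per-position outputs line up directly with genomic coordinates.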
How they differ by training corpus
Training corpus drives zero-shot capability as much as architecture does. Evo 2's trillions of base pairs give it exposure to a huge slice of evolution, which translates into stronger variant effect prediction in regions the model has never seen before. Smaller corpora produce models that are still useful but narrower in their zero-shot capabilities.
Use-case map
Variant effect prediction
Evo 2 is the default. DeepMind's AlphaGenome is competitive on regulatory variants. For coding variants, pair the DNA model with a protein-level model like ESM-2. DNABERT and Nucleotide Transformer are weaker options here.
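The zero-shot recipe behind these comparisons is a likelihood ratio: score a variant as the model's log-likelihood of the alternate sequence minus that of the reference. A minimal sketch, where `log_likelihood` is a toy stand-in (a GC-content scorer); in a real workflow the log-likelihood would come from a DNA language model such as Evo 2:

```python
import math

def log_likelihood(seq):
    """Toy stand-in scorer. In practice this would be the summed
    per-token log-probs from an autoregressive DNA model."""
    gc = sum(base in "GC" for base in seq) / len(seq)
    return sum(math.log(gc if base in "GC" else 1 - gc) for base in seq)

def variant_score(ref_seq, pos, alt_base):
    """Delta log-likelihood: negative scores mean the variant makes
    the sequence less 'natural' under the model."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return log_likelihood(alt_seq) - log_likelihood(ref_seq)

ref = "ACGTGGCCGGTA"
score = variant_score(ref, 4, "A")  # G -> A at position 4
```

The same delta-likelihood shape works with any autoregressive model; only the `log_likelihood` implementation changes.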
Regulatory element annotation
AlphaGenome shines because it was trained with functional supervision. Evo 2's per-position likelihoods provide a good second opinion. Caduceus's long context makes it useful for enhancer-gene linking.
Sequence generation and design
Evo 2 is the clear leader, because autoregressive generation is a natural capability of its training objective. HyenaDNA can generate but is weaker at complex conditional generation.
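Autoregressive generation itself is simple to sketch: sample one base at a time from the model's next-token distribution and append as you go. The `TRANSITIONS` table below is a toy stand-in (a fixed Markov-style distribution, not any real model's output head), but the loop has the same shape a real decoder uses:

```python
import random

# Toy next-base distribution conditioned only on the previous base.
# A real model would condition on the full context window.
TRANSITIONS = {
    "A": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "C": {"A": 0.3, "C": 0.1, "G": 0.5, "T": 0.1},
    "G": {"A": 0.2, "C": 0.4, "G": 0.1, "T": 0.3},
    "T": {"A": 0.4, "C": 0.2, "G": 0.3, "T": 0.1},
}

def generate(prompt, n_tokens, seed=0):
    """Sample n_tokens bases autoregressively after the prompt."""
    rng = random.Random(seed)
    seq = prompt
    for _ in range(n_tokens):
        probs = TRANSITIONS[seq[-1]]
        bases, weights = zip(*probs.items())
        seq += rng.choices(bases, weights=weights)[0]
    return seq

out = generate("ACG", 20)
```

Conditional design on top of this is just a matter of what goes in the prompt, which is why a model trained on the autoregressive objective gets generation essentially for free.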
Population-scale scoring
DNABERT or Nucleotide Transformer for cheap baseline scoring, Caduceus for longer-range effects at moderate cost, Evo 2 for high-quality scores when compute allows.
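The tiered strategy above can be sketched as a triage loop: score everything with the cheap model, then spend the expensive model's compute only on the most promising candidates. Everything here is illustrative; `cheap_score` and `expensive_score` stand in for whatever models you actually deploy.

```python
def triage(variants, cheap_score, expensive_score, budget=100):
    """Two-tier scoring: rank all variants with a cheap scorer,
    then re-score only the top `budget` with the expensive one."""
    ranked = sorted(variants, key=cheap_score, reverse=True)
    return {v: expensive_score(v) for v in ranked[:budget]}

# Toy stand-ins: a real pipeline might rank with DNABERT, then
# send the shortlist to Evo 2.
variants = list(range(1000))
results = triage(variants,
                 cheap_score=lambda v: v % 97,
                 expensive_score=lambda v: -v,
                 budget=10)
```

The design choice being made explicit here: the expensive model's cost scales with `budget`, not with cohort size.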
Single-species deep analysis
Fine-tuned Nucleotide Transformer or HyenaDNA remain competitive when you have enough labeled data to specialize the model. Evo 2 can also be fine-tuned but is usually overkill for species-specific tasks.
Compute and accessibility
- DNABERT, HyenaDNA, Nucleotide Transformer: run on a single GPU. Sometimes even on a strong CPU for small batches.
- Caduceus: moderate GPU requirements thanks to state-space efficiency.
- AlphaGenome: serious GPU requirements but tractable for most research groups.
- Evo 2: multi-GPU setup for full capability. Hosted APIs are the practical path for most users.
A practical decision guide
If compute is abundant…
Default to Evo 2 for almost anything. It is the richest model and produces the strongest zero-shot outputs across tasks.
If compute is limited…
Use a hosted Evo 2 endpoint for headline analyses and fall back to Caduceus or HyenaDNA for anything you need to run yourself.
If your context is short…
Nucleotide Transformer is still a solid choice. The long-context advantages of Evo 2 and Caduceus disappear when your variants live in a small window.
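In the short-context regime, the only preprocessing you usually need is to slice a fixed-length window centered on the variant before handing it to the model. A minimal sketch (the window size and boundary-clamping policy are choices, not any model's requirement):

```python
def context_window(chrom_seq, pos, window=512):
    """Return `window` bases roughly centered on `pos`,
    clamped so the slice stays inside the sequence."""
    half = window // 2
    start = min(max(0, pos - half), max(0, len(chrom_seq) - window))
    return chrom_seq[start:start + window]

seq = "ACGT" * 1000           # 4,000 bp toy chromosome
win = context_window(seq, 2000)
```

If the variant sits near a chromosome end, the window shifts rather than shrinks, which keeps input lengths uniform for batching.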
If you need generation…
Use Evo 2. No other DNA model has the same generation capabilities today.
The role of SciRouter
SciRouter hosts Evo 2 behind the DNA Lab endpoint so you can call it from a laptop without provisioning GPUs. See the Evo 2 tool page for the request schema. Other DNA models in the list are not currently hosted, but the gateway pattern means they can be added over time without changing your client code.
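As a shape-of-the-call illustration only (every field name below is hypothetical; the real request schema is on the Evo 2 tool page), the gateway pattern means a client can treat the model as just another payload field, so swapping in a future hosted model would not change client code:

```python
import json

def build_request(model, sequence, task="score"):
    """Assemble a hypothetical gateway payload. Field names are
    illustrative only; see the Evo 2 tool page for the real schema."""
    return json.dumps({"model": model, "task": task, "sequence": sequence})

body = build_request("evo2", "ACGTACGTACGT")
# A future hosted model would, in principle, change only the name:
other = build_request("caduceus", "ACGTACGTACGT")
# No HTTP request is made in this sketch.
```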
Bottom line
The DNA foundation model landscape in 2026 is finally mature enough to have genuine options. Evo 2 leads on capability, Caduceus on efficiency, AlphaGenome on regulatory annotation, Nucleotide Transformer and HyenaDNA on accessible baselines, and DNABERT remains the cheap fallback. Match the model to your task and your compute budget, and you will spend less time fighting the tools and more time doing the biology.