
Foundation Models for Climate Science 2026: The Landscape

The complete landscape of foundation models for climate science in 2026. Aurora, GraphCast, Pangu-Weather, ClimaX, and more.

SciRouter Team
April 11, 2026
13 min read

Foundation models are now unambiguously a thing in climate science. In the space of a few years, the field has gone from “data-driven weather models are interesting” to “several independent groups have released foundation models covering forecasting, air quality, geospatial imagery, and more.” This post is a survey of the 2026 landscape and where it looks like things are heading.

We will cover the models that show up in most serious conversations — Aurora, GraphCast, Pangu-Weather, ClimaX, Prithvi, and FourCastNet — and look at what each one is good for, where they overlap, and what the open research problems still are. We will close with how you can actually use these models today through SciRouter's climate lab.

Note
This is a landscape survey, not a benchmark post. For a head-to-head benchmark of the three leading weather forecasters, see Aurora vs GraphCast vs Pangu-Weather.

The atmospheric forecasters

The most mature category. These are models that take an atmospheric state and forecast it forward. They are trained on reanalysis data, primarily ERA5, and they beat or match operational numerical weather prediction at medium range while running orders of magnitude faster.

Aurora (Microsoft)

The most explicitly foundation-model framed of the group. A 3D Swin Transformer backbone is pretrained on ERA5 and then fine-tuned into heads for forecasting, air quality, and cyclones. Aurora's defining feature is coverage — one backbone, many downstream tasks. Available through SciRouter.

GraphCast (Google DeepMind)

A graph neural network over an icosahedral grid. GraphCast is known for its strong deterministic medium-range forecasting accuracy and for the careful ablations in the original paper. It is less explicitly framed as a foundation model but acts as one in practice — the architecture generalizes to many forecasting variables.

Pangu-Weather (Huawei)

A 3D Earth-specific Transformer with hierarchical lead-time stacking. Pangu was among the first data-driven models to clearly beat operational NWP on standard benchmarks. It remains a strong choice for forecasting at specific short and medium lead times.

FourCastNet (NVIDIA)

One of the earliest global data-driven forecasting models. FourCastNet uses a Fourier neural operator architecture and was one of the first demonstrations that a trained neural network could run a global forecast on a single GPU in seconds.

The multi-task climate models

ClimaX

ClimaX, from a Microsoft Research and academic collaboration, is explicitly designed as a climate foundation model. It pretrains on a mixture of reanalysis and simulation data and fine-tunes to several tasks including forecasting, climate downscaling, and climate projection. It is an earlier example of the foundation-model pattern applied specifically to climate as opposed to weather.

Prithvi (NASA and IBM)

Prithvi is a geospatial foundation model trained primarily on satellite imagery. It is useful for tasks like land-use classification, flood mapping, burn-scar detection, and crop monitoring. It is not a weather forecaster — its input is imagery, not atmospheric state — but it is part of the same foundation-model wave in climate and earth observation.

The axes that matter

When you look at this list, three axes show up repeatedly.

  • Input modality. Atmospheric state (reanalysis) versus imagery (satellite) versus mixed. The models cluster cleanly by this axis.
  • Task coverage. Single-task (forecasting only) versus multi-head (forecasting plus other atmospheric tasks). Aurora and ClimaX are on the multi-head side.
  • Compute cost. Pretraining is expensive; inference is dramatically cheaper than traditional NWP. This is the axis that enables the application explosion.

Open problems

Physical consistency

Data-driven models can sometimes produce output that is statistically good but physically inconsistent — conservation laws slightly violated, energy balance slightly off, hydrostatic balance not quite right. For applications that propagate forecasts many steps forward, these small inconsistencies can accumulate. Research on physics-informed training losses is active.

Ensemble calibration

Single deterministic forecasts are not enough for risk assessment — you need probabilistic output. Data-driven models are catching up to NWP ensembles in this area, but calibration is a live problem.

Out-of-distribution generalization

The climate is changing in ways that are, by definition, out of distribution relative to historical reanalysis. How well data-driven models extrapolate to a warmer world is a real open question.

Integration with physics-based models

The most likely future is hybrid. Physics-based models provide consistency, long time scales, and mechanism. Data-driven models provide speed, flexibility, and inference on new kinds of inputs. Building systems that combine both is an ongoing research agenda.

Warning
None of these foundation models is a complete replacement for operational climate and weather infrastructure. They are additive tools that make new applications possible, not a shortcut past the need for physics-based modeling.

How to actually use them today

The easiest path in 2026 is a hosted endpoint. SciRouter exposes Aurora through the climate lab, so you can call forecasts, air quality prediction, and cyclone tracking from ordinary HTTP clients without provisioning a single GPU. For other models, the access picture is more varied — some are open-weight with inference scripts, some are research releases that require reimplementation, and some are only reachable through specific partners.
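To make "call forecasts from ordinary HTTP clients" concrete, here is a minimal sketch of assembling such a request. The endpoint URL, payload fields, and auth header are illustrative assumptions, not SciRouter's actual API; consult the climate lab docs for the real schema.

```python
import json

def build_forecast_request(variable: str, lead_time_hours: int,
                           lat: float, lon: float) -> dict:
    """Assemble the JSON body for a hypothetical hosted forecast call."""
    # Field names here are assumptions for illustration only.
    return {
        "model": "aurora",
        "task": "forecast",
        "variable": variable,          # e.g. "2m_temperature"
        "lead_time_hours": lead_time_hours,
        "location": {"lat": lat, "lon": lon},
    }

payload = build_forecast_request("2m_temperature", 72, 52.5, 13.4)
body = json.dumps(payload)
# With a real endpoint, any HTTP client works, e.g.:
#   requests.post("https://api.example.com/v1/forecast",
#                 data=body,
#                 headers={"Authorization": "Bearer <your-key>"})
print(body)
```

The point is that the client side is ordinary JSON-over-HTTP; no GPU, no model weights, no scientific Python stack required.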

If you are building an application, start with whatever is most accessible. You can swap in a better model later if the interface is well defined. The interface is more valuable than the specific weights.
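One way to keep the interface well defined is to code against a small protocol rather than a specific model. The sketch below uses a trivial persistence baseline as a stand-in; a hosted Aurora client or a local GraphCast wrapper would implement the same method and drop in without touching application code.

```python
from typing import List, Protocol

class ForecastModel(Protocol):
    """The stable interface your application depends on."""
    def forecast(self, state: List[float], steps: int) -> List[List[float]]:
        """Roll an atmospheric state vector `steps` steps forward."""
        ...

class PersistenceModel:
    """Trivial baseline: every future step looks like today."""
    def forecast(self, state: List[float], steps: int) -> List[List[float]]:
        return [list(state) for _ in range(steps)]

def run_pipeline(model: ForecastModel, state: List[float]) -> List[List[float]]:
    # Application logic only sees the protocol, so swapping in a
    # better model later is a one-line change at the call site.
    return model.forecast(state, steps=3)

trajectory = run_pipeline(PersistenceModel(), [288.0, 1013.2])
print(len(trajectory))  # three forecast steps
```

Start with the baseline, ship the pipeline, then replace `PersistenceModel` with whichever real model is most accessible to you.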

Where the field is heading

  • More fine-tuning heads. Expect regional forecasting, precipitation, nowcasting, and specialized variables to show up as heads on existing backbones.
  • Better ensembles. Data-driven models will catch up to NWP on probabilistic forecasting, closing one of the main remaining gaps.
  • Hybrid architectures. Models that embed physics-based constraints into the training loss or architecture will become more common.
  • Agent-native workflows. As MCP and similar protocols mature, agents will compose climate foundation models into larger reasoning loops.

Bottom line

The climate-science foundation-model landscape in 2026 is more crowded and more useful than it was even a year ago. Aurora, GraphCast, Pangu-Weather, ClimaX, Prithvi, and FourCastNet are all doing interesting things, and the patterns they share — pretraining on reanalysis or imagery, fine-tuning to task-specific heads, inference in seconds instead of hours — suggest that climate AI is following the same trajectory that natural language AI did a few years earlier.

Open the SciRouter Climate Lab →

Frequently Asked Questions

What is a foundation model in climate science?

The same idea as in NLP. A large neural network is pretrained on a broad corpus — atmospheric reanalysis, satellite imagery, or both — and then fine-tuned into task-specific heads. The foundation-model framing is attractive for climate because there are many related tasks (forecasting, downscaling, air quality, extremes attribution) that all share the same underlying physics.

Which foundation models matter in 2026?

Aurora from Microsoft, GraphCast from Google DeepMind, Pangu-Weather from Huawei, ClimaX from a Microsoft/academic collaboration, Prithvi from NASA and IBM, and FourCastNet from NVIDIA are the commonly cited names. Not all of them are framed as foundation models, but all of them behave like foundation models in practice.

Do these models replace traditional climate models?

No. Traditional Earth system models simulate physics, chemistry, and biology from first principles and are irreplaceable for questions that require physical consistency, long-time scale projections, and attribution to underlying mechanisms. Foundation models are a new layer on top that makes fast inference, scenario exploration, and data-driven downscaling possible in ways that were previously limited by compute.

What are the biggest open problems?

Physical consistency, ensemble calibration, out-of-distribution generalization, downscaling to local resolution, and the integration of data-driven and physics-based approaches. Each of these is an active research area in 2026.

Can I use these models without a GPU?

Yes, through hosted APIs. SciRouter's climate lab exposes Aurora inference without any local GPU requirement. For GraphCast and the others, you typically run the authors' released inference code yourself, usually on a local GPU, though hosted options are growing.

Is there an agentic angle?

Yes. Once foundation models are fast and available through APIs, agents can chain them into workflows. An agent could pull a current atmospheric state, run a forecast, query an air quality head, and produce a natural-language summary, all in seconds. MCP servers like SciRouter's make this kind of chaining straightforward.
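The chaining described above can be sketched as a few composed calls. Every function here is a stub standing in for what would, in practice, be an MCP tool call or HTTP request to a hosted model endpoint; the field names and values are invented for illustration.

```python
def get_current_state() -> dict:
    # Stand-in for pulling a current atmospheric state from a data source.
    return {"t2m_k": 291.2, "pm25_ugm3": 14.0}

def run_forecast(state: dict, hours: int) -> dict:
    # Stand-in for a forecasting head on a hosted foundation model.
    return {"t2m_k": state["t2m_k"] + 0.5, "lead_hours": hours}

def run_air_quality(state: dict) -> dict:
    # Stand-in for an air-quality head on the same backbone.
    return {"pm25_ugm3": state["pm25_ugm3"] * 1.1}

def summarize(forecast: dict, air_quality: dict) -> str:
    # The agent's final step: a natural-language summary.
    return (f"+{forecast['lead_hours']}h: "
            f"{forecast['t2m_k'] - 273.15:.1f} degC, "
            f"PM2.5 ~{air_quality['pm25_ugm3']:.0f} ug/m3")

state = get_current_state()
summary = summarize(run_forecast(state, 24), run_air_quality(state))
print(summary)
```

Because each step is just a tool call returning structured data, an agent framework can plan, execute, and summarize the whole chain in one loop.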

Where should I start?

Start with a hosted endpoint that lets you play with the model before committing to infrastructure. The SciRouter climate lab is one option for Aurora. From there, read the original papers for the architectures you want to understand in depth, and only set up local inference when you need to.

Try this yourself

500 free credits. No credit card required.