Aurora vs GraphCast vs Pangu-Weather: 2026 AI Weather Benchmark

Head-to-head benchmark of the three leading AI weather models in 2026. Accuracy, speed, and best use cases.

SciRouter Team
April 11, 2026

Three data-driven weather models have become the reference points for AI meteorology: Microsoft's Aurora, Google DeepMind's GraphCast, and Huawei's Pangu-Weather. All three are trained on ERA5, all three forecast the global atmosphere in seconds on a single GPU, and all three are competitive with, and in many cases better than, physics-based operational forecasts at medium range. But they are not identical. This post walks through how they differ and what each one is best at.

Note
This is a high-level comparison. For a deeper walkthrough of Aurora's architecture and training, see Aurora Explained.

The three contenders

Aurora

Microsoft's Aurora is framed explicitly as an atmospheric foundation model. A 3D Swin Transformer backbone is pretrained on ERA5 and other atmospheric data, then fine-tuned into multiple heads: medium-range weather forecasting, global air quality prediction, and tropical cyclone tracking. The foundation-model framing is what sets it apart — one backbone, many applications.

GraphCast

Google DeepMind's GraphCast uses a graph neural network over a multi-scale icosahedral mesh of the globe. Mesh points are nodes, and edges connect neighboring nodes at multiple resolutions. The model steps the atmospheric state forward in time autoregressively: it predicts the next state (6 hours ahead in the published configuration) and feeds that prediction back in as input for the following step. GraphCast is notable for its strong deterministic forecasting performance versus the ECMWF Integrated Forecasting System (IFS) and for the careful ablations in the original paper.
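To make the "iterate forward and predict the next state" idea concrete, here is a minimal sketch of an autoregressive rollout. It is not GraphCast's code: `toy_step` is a hypothetical stand-in for the learned network, the two-element list stands in for a full gridded state, and the 6-hour step and two-state input are assumptions taken from GraphCast's published setup.

```python
from typing import Callable, List

State = List[float]  # toy stand-in for a full gridded atmospheric state


def rollout(step: Callable[[State, State], State],
            prev: State, curr: State, n_steps: int) -> List[State]:
    """Roll a one-step model forward, feeding each prediction back in."""
    trajectory = []
    for _ in range(n_steps):
        nxt = step(prev, curr)
        trajectory.append(nxt)
        # The newest prediction becomes the current state for the next step.
        prev, curr = curr, nxt
    return trajectory


def toy_step(prev: State, curr: State) -> State:
    # Hypothetical dynamics: continue the recent trend at half strength.
    return [c + 0.5 * (c - p) for p, c in zip(prev, curr)]


STEP_HOURS = 6  # assumed step size
forecast = rollout(toy_step, prev=[0.0, 10.0], curr=[1.0, 10.0], n_steps=4)
lead_times = [STEP_HOURS * (i + 1) for i in range(len(forecast))]
```

Because every step consumes the previous step's output, small errors can compound over the rollout, which is one reason accuracy degrades at longer lead times.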

Pangu-Weather

Huawei's Pangu-Weather uses a 3D Earth-specific Transformer with a hierarchical temporal aggregation strategy: separate models are trained for different lead times (1 hour, 3 hours, 6 hours, 24 hours) and composed to reach any horizon. Using the longest available step first keeps the number of iterations, and therefore the accumulated error, low. Pangu was one of the first data-driven models to clearly beat operational NWP on standard benchmarks at specific lead times, and it kicked off the wave of papers that followed.
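The composition of lead times can be sketched as a greedy decomposition: use the longest-step model as often as possible, then fall back to shorter ones. The function name `decompose_lead_time` is hypothetical and this is an illustration of the idea, not Pangu-Weather's actual inference code.

```python
from typing import List, Sequence


def decompose_lead_time(hours: int,
                        model_hours: Sequence[int] = (24, 6, 3, 1)) -> List[int]:
    """Greedily split a lead time into calls to fixed-step models, largest first."""
    if hours < 0:
        raise ValueError("lead time must be non-negative")
    plan: List[int] = []
    remaining = hours
    for step in sorted(model_hours, reverse=True):
        count, remaining = divmod(remaining, step)
        plan.extend([step] * count)
    if remaining:
        raise ValueError("lead time not reachable with the given models")
    return plan


plan_56h = decompose_lead_time(56)  # two 24 h steps, one 6 h step, two 1 h steps
```

A 56-hour forecast then takes five model calls instead of 56 one-hour calls. For the step set (24, 6, 3, 1), each step divides the next larger one, so the greedy plan is also the shortest possible plan.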

Head-to-head on forecasting accuracy

All three models are strong on the headline deterministic forecasting metrics — root mean squared error on 500 hPa geopotential and 2 m temperature being the most commonly reported. The ranking depends on lead time and variable.

  • Short lead times (1-3 days). Pangu-Weather and GraphCast are typically strongest. Aurora is competitive.
  • Medium lead times (3-7 days). GraphCast often edges out the others on deterministic benchmarks, though results vary by variable. Aurora catches up on several fields.
  • Long lead times (7-10 days). All three models show degradation, and ensemble-based NWP still has meaningful advantages for probabilistic forecasting.

The practical takeaway: if you are building a pure global medium-range forecasting app, any of the three will work. Look at the variables and lead times your application cares about and pick accordingly.
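For readers who want to compute the headline metric themselves, here is a minimal sketch of RMSE on a regular latitude-longitude grid. It assumes the common convention of weighting each row by the cosine of its latitude, so that the many small grid cells near the poles do not dominate the score. The function name and the toy grid are hypothetical.

```python
import math
from typing import Sequence


def lat_weighted_rmse(forecast: Sequence[Sequence[float]],
                      truth: Sequence[Sequence[float]],
                      lats_deg: Sequence[float]) -> float:
    """RMSE over a lat-lon grid, weighting each latitude row by cos(latitude)."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    mean_w = sum(weights) / len(weights)  # normalize so weights average to 1
    total = 0.0
    count = 0
    for w, f_row, t_row in zip(weights, forecast, truth):
        for f, t in zip(f_row, t_row):
            total += (w / mean_w) * (f - t) ** 2
            count += 1
    return math.sqrt(total / count)


# Toy 3x2 grid: the forecast is off by 3 units at the equator only.
lats = [-60.0, 0.0, 60.0]
truth = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
forecast = [[0.0, 0.0], [3.0, 3.0], [0.0, 0.0]]
score = lat_weighted_rmse(forecast, truth, lats)
```

In this toy case the weighted score is higher than the unweighted RMSE, because the error sits at the equator where each grid cell covers the most area.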

Head-to-head on speed

The speed story is what sold the weather community on data-driven models in the first place. A physics-based operational NWP run takes hours on a supercomputer. A GraphCast, Pangu, or Aurora forecast takes seconds on a single GPU. That is a speedup of roughly three orders of magnitude, and it changes what you can do with weather data.

  • You can run an ensemble of hundreds of forecasts interactively and get probabilistic output.
  • You can condition forecasts on real-time observations as they come in.
  • You can put a weather model inside an agent loop and have it respond in seconds.

None of those were practical before data-driven models. All three contenders unlock them. On raw inference speed the differences between the three are small compared to the gap between any of them and physics-based NWP.
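The ensemble use case is the easiest one to sketch. The idea is to run a fast model many times from slightly perturbed initial conditions and read probabilities off the spread of results. Everything below is an assumption for illustration: `toy_model` is a hypothetical one-number stand-in for a full forecast, and simple Gaussian noise is a simplification of the perturbation schemes that real ensemble systems use.

```python
import random
from typing import Callable, List


def run_ensemble(model: Callable[[float], float], initial: float,
                 n_members: int, noise_std: float, seed: int = 0) -> List[float]:
    """Run the model from many slightly perturbed initial conditions."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    return [model(initial + rng.gauss(0.0, noise_std)) for _ in range(n_members)]


def exceedance_probability(members: List[float], threshold: float) -> float:
    """Fraction of ensemble members above a threshold."""
    return sum(m > threshold for m in members) / len(members)


def toy_model(temp_c: float) -> float:
    # Hypothetical stand-in for a full forecast: amplifies initial differences.
    return 20.0 + 1.5 * (temp_c - 20.0)


members = run_ensemble(toy_model, initial=20.0, n_members=200, noise_std=1.0)
p_warm = exceedance_probability(members, threshold=21.0)
```

With a model that answers in seconds, a loop like this over hundreds of members finishes in minutes, whereas the same loop over multi-hour NWP runs would not be interactive.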

Head-to-head on coverage

This is where the models separate. Forecasting is only one task an atmospheric model can do, and each model has taken a different approach to breadth.

GraphCast

Primarily a forecasting model. DeepMind has demonstrated strong results on deterministic forecasts and there is active research on ensemble and downstream uses. The core product is medium-range weather.

Pangu-Weather

Also primarily a forecasting model, with the hierarchical lead-time structure that allows composition to many horizons. Pangu's main contribution was demonstrating that a transformer-based data-driven model could beat operational NWP on a large benchmark set.

Aurora

The broadest coverage of the three. Aurora's foundation-model framing means the same backbone has been used for forecasting, air quality, and cyclone tracking, with more heads likely over time. If your application is wider than “predict the next 5 days of wind and temperature,” Aurora is the most natural starting point.

Which one to pick

The honest answer is: the one you can most easily access for your specific task.

  • If you need medium-range deterministic forecasts and can run the weights locally, GraphCast is a strong default.
  • If you need short-horizon forecasts at specific lead times, Pangu-Weather's hierarchical structure is a natural fit.
  • If you need a single model for multiple atmospheric tasks or you want to avoid GPU management, Aurora through SciRouter's climate lab is the fastest on-ramp.

Warning
None of these models produces the full probabilistic forecast that an ensemble NWP system produces. For safety-critical decisions, combine data-driven forecasts with operational NWP and expert review. The speed of data-driven models complements NWP; it does not replace it.

What 2026 looks like

The next twelve months are going to be interesting. Ensemble variants of all three models are in active development, which should narrow the probabilistic-forecasting gap with NWP. Fine-tuned heads for regional forecasting, nowcasting, and specialized variables like precipitation are also under development. The foundation-model framing that Aurora popularized is likely to be adopted by the other research groups.

In the meantime, you can already do serious work with Aurora, GraphCast, or Pangu-Weather depending on your application. The choice is more about access and task fit than about which is objectively best.

Bottom line

Aurora, GraphCast, and Pangu-Weather are all meaningfully better than what was possible three years ago. Aurora wins on coverage and ease of access through SciRouter. GraphCast wins on deterministic medium-range forecasting accuracy. Pangu-Weather wins on hierarchical short-horizon use cases. None of them replaces operational NWP, and all of them open up applications that NWP's speed could not support.

Frequently Asked Questions

Which AI weather model is most accurate?

It depends on the task and the lead time. GraphCast tends to be the strongest on standard medium-range deterministic forecasting benchmarks against the ECMWF operational system. Aurora is competitive on forecasting and extends into air quality and cyclone tracking. Pangu-Weather is strongest at short lead times and at specific vertical levels. None is universally best.

Which model is fastest?

All three are dramatically faster than physics-based NWP. Pangu-Weather and GraphCast run a 10-day global forecast in seconds on a single GPU. Aurora is in the same ballpark for forecasting and slightly heavier on the fine-tuned heads because they do more work per inference. Exact numbers depend on the hardware you run on.

Are they all trained on ERA5?

Yes. ERA5 is the common pretraining corpus for all three models. It is essentially the reference atmospheric record for data-driven weather. Differences between the models come from architecture choices, training objectives, and what additional data (if any) was added on top of ERA5.

Which model has the broadest application coverage?

Aurora has the broadest published coverage today. It has been demonstrated on medium-range weather, air quality, and tropical cyclone tracking, and the foundation-model framing means more heads are expected. GraphCast and Pangu-Weather are primarily forecasting models, though research extensions exist for both.

Which one should I use for a hobby weather project?

The one you can actually run. If you are new, start with whichever is most accessible through a hosted API. SciRouter exposes Aurora through the climate lab so you can call it from a browser without GPU setup. GraphCast and Pangu-Weather are available in various forms through their respective research releases if you want to run them locally.

Are these models going to replace NWP?

Not in the near term. Data-driven models extend what is possible, but operational meteorology still runs on physics-based models because they provide uncertainty, physical consistency, and long track records for safety-critical decisions. The likely future is hybrid — data-driven models for speed and interactivity, physics-based models for operational guarantees.

Does SciRouter host all three?

SciRouter focuses on Aurora today as part of the climate lab. We expose Aurora inference, air quality, and cyclone prediction through the gateway and the MCP server. Additional weather models may be added as hosted endpoints over time.
