
AI Weather Forecasting in 2026 Explained: Models, Accuracy & Real Results

  • Mar 4
  • 24 min read
AI weather forecasting with 3D Earth and holographic storm data.

On September 11, 2023, Hurricane Lee was building in the Atlantic. Traditional numerical models offered good but imperfect track forecasts. Google DeepMind's GraphCast—running on a single Google TPU in under one minute—had already placed Lee's landfall with striking accuracy, days before the storm hit Nova Scotia. That wasn't luck. It was the result of training on 40 years of global atmospheric data and learning relationships that classical physics-based models had never been able to fully encode. In 2026, this moment reads less like a surprise and more like the opening chapter of a revolution already in progress.

 


 

TL;DR

  • AI weather models like GraphCast (Google DeepMind), Pangu-Weather (Huawei), and AIFS (ECMWF) now match or beat traditional physics-based models on most standard forecast metrics.

  • Google DeepMind's GraphCast outperformed ECMWF's flagship HRES model on 90% of 1,380 verification targets in a head-to-head benchmark (Science, December 2023).

  • Huawei's Pangu-Weather ran 10,000× faster than conventional ensemble models in peer-reviewed tests (Nature, July 2023).

  • The European Centre for Medium-Range Weather Forecasts (ECMWF) moved its AI-based AIFS model to operational status in 2024, making it the first major meteorological agency to do so.

  • AI models still carry weaknesses: they underperform in extreme rainfall events and fine-scale local forecasts, and they depend on traditional models for their input data.

  • The next frontier is hybrid AI + physics models, already in active development at NOAA, ECMWF, and multiple research universities as of 2026.


What is AI weather forecasting?

AI weather forecasting uses machine learning—primarily deep neural networks—trained on decades of atmospheric data to predict future weather conditions. Instead of solving physics equations from scratch, these models learn patterns directly from historical observations. In 2026, leading AI models match or exceed traditional forecasts across most metrics while running thousands of times faster.





Table of Contents

  1. Background: How Weather Forecasting Used to Work
  2. What AI Weather Forecasting Actually Is
  3. The Major AI Weather Models in 2026
  4. How These Models Are Trained
  5. Accuracy: What the Numbers Actually Say
  6. Case Studies: Real-World Performance
  7. Where AI Forecasting Falls Short
  8. Pros and Cons of AI Weather Forecasting
  9. Myths vs. Facts
  10. Regional and Application Variations
  11. How Meteorologists Are Using AI Today
  12. Comparison Table: AI Models vs. Traditional Models
  13. Future Outlook
  FAQ

1. Background: How Weather Forecasting Used to Work

For most of the 20th century, weather prediction was a numerical problem. Meteorologists used equations derived from fluid dynamics and thermodynamics—the same physics that govern how air, water, and heat move through the atmosphere. These equations, collectively called Numerical Weather Prediction (NWP), were first formalized by Vilhelm Bjerknes in 1904 and first run on a computer by John von Neumann's team at Princeton in 1950.


The core idea is straightforward: take the current state of the atmosphere (temperature, pressure, humidity, wind speed at thousands of grid points worldwide), apply physics equations, and project forward in small time steps.
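This loop can be sketched in a few lines. The example below is a toy, not an operational scheme: the "physics" is one-dimensional advection (a warm anomaly carried downstream by a constant wind) on a periodic grid, stepped forward with a first-order upwind finite difference. Real NWP does the same thing with millions of grid points and far richer equations.

```python
# Toy NWP loop: take the current state, apply a physical law, step forward
# in small time increments. The "law" here is 1-D advection solved with a
# first-order upwind scheme on a periodic grid (Python's state[i - 1]
# wraps around at i = 0, giving periodicity for free).

def advect(state, wind=1.0, dx=1.0, dt=0.5, steps=10):
    """Advance the state forward `steps` small time steps."""
    state = list(state)
    n = len(state)
    for _ in range(steps):
        # Upwind finite difference: each point looks "upstream".
        state = [
            state[i] - wind * dt / dx * (state[i] - state[i - 1])
            for i in range(n)
        ]
    return state

# A single warm anomaly on an otherwise uniform grid.
initial = [0.0] * 10
initial[2] = 1.0
forecast = advect(initial, steps=4)

# The anomaly drifts downstream (toward higher indices) as time advances,
# moving wind * dt * steps = 2 grid cells in 4 steps.
print(max(range(10), key=lambda i: forecast[i]))   # → 4
```

Note that the total "heat" on the grid is conserved by this scheme, mirroring the conservation laws that real dynamical cores are built around.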


This approach is powerful but computationally brutal. The European Centre for Medium-Range Weather Forecasts (ECMWF), widely regarded as the world's best operational forecast center, runs its forecasts on a supercomputer capable of over 100 petaflops (100 quadrillion floating-point operations per second). Even with this power, running a single global 10-day ensemble forecast with 50 members takes hours.


The other major challenge is called chaos. Even tiny errors in the initial state of the atmosphere compound over time. By day 10, a perfect physics model fed slightly imperfect observations will diverge significantly from reality. This is sometimes called the "butterfly effect," formalized mathematically by Edward Lorenz in 1963.
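Lorenz's own three-variable system makes this concrete. In the sketch below (a standard textbook integration, not anything from an operational model), two runs differ only by a perturbation of 1e-8 in one initial value, yet after enough time steps the trajectories bear no resemblance to each other:

```python
# The "butterfly effect" from Lorenz (1963): two runs of the same model,
# differing by 1e-8 in the initial state, diverge completely over time.

def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz system."""
    return (
        x + dt * sigma * (y - x),
        y + dt * (x * (rho - z) - y),
        z + dt * (x * y - beta * z),
    )

def run(x, y, z, steps):
    for _ in range(steps):
        x, y, z = lorenz_step(x, y, z)
    return x, y, z

a = run(1.0, 1.0, 1.0, 5000)
b = run(1.0 + 1e-8, 1.0, 1.0, 5000)   # a "butterfly flap" of difference

# After 5000 steps the tiny initial error has grown until it saturates at
# the size of the attractor itself; the two states are effectively unrelated.
print(abs(a[0] - b[0]))
```

This is exactly why a perfect physics model fed slightly imperfect observations still loses skill by day 10.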


Because of these constraints, weather agencies invest billions in observation networks—weather balloons, ocean buoys, satellites, aircraft sensors—to feed the best possible initial conditions into their models.


By the 2010s, NWP had improved dramatically but was hitting a plateau. Further gains required either dramatically more expensive compute or new approaches. That is exactly when machine learning entered the picture.


2. What AI Weather Forecasting Actually Is

AI weather forecasting uses data-driven machine learning models instead of (or alongside) physics equations. These models are trained on historical atmospheric data—typically ERA5, the ECMWF's global reanalysis dataset covering 1940 to the present—and learn statistical relationships between current and future atmospheric states.


The most common architecture used is a Graph Neural Network (GNN) or a Vision Transformer (ViT). Both treat the atmosphere as a structured data object: either as a graph of connected nodes representing grid points on Earth, or as an image-like grid that transformers can process with self-attention mechanisms.


The critical difference from NWP is philosophical. NWP derives predictions from physical laws. AI models learn predictions from patterns in data. This means:

  • They are orders of magnitude faster to run at inference time.

  • They require no hand-coded physics equations.

  • But they are only as good as the historical data they were trained on.

  • And they cannot easily extrapolate beyond conditions they have "seen."


This is not one model type—it is a class of approaches. As of 2026, the leading systems include graph-based models, transformer-based models, and diffusion models. Each has distinct strengths.


What these models predict: The main AI forecast models output the same variables as traditional NWP: temperature, wind speed and direction, geopotential height, relative humidity, and mean sea-level pressure, at multiple pressure levels throughout the atmosphere, at 6-hour intervals out to 10 days (and sometimes longer).


3. The Major AI Weather Models in 2026


GraphCast — Google DeepMind

GraphCast, developed by Google DeepMind, became the most widely cited AI weather model when its results were published in Science in December 2023 (Lam et al., doi:10.1126/science.adi2336). It uses a Graph Neural Network architecture. The atmosphere is represented as a multi-scale mesh of nodes connected by learned edges. GraphCast processes the two most recent atmospheric states (6 hours apart) and outputs a forecast for 6 hours later. Repeated iteratively, this produces 10-day forecasts.
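The autoregressive rollout just described can be sketched generically. The `toy_model` below is a stand-in (simple linear extrapolation from the last two states), not GraphCast's actual network; the point is the iteration structure: two recent states in, one 6-hour step out, repeated 40 times for 10 days.

```python
# Sketch of an autoregressive rollout: the model maps the two most recent
# 6-hourly states to the next one; iterating 40 times yields 10 days.

def rollout(model, state_prev, state_now, steps=40):
    """Iterate a one-step (6 h) model to build a multi-day forecast."""
    forecast = []
    for _ in range(steps):
        state_next = model(state_prev, state_now)        # +6 hours
        forecast.append(state_next)
        state_prev, state_now = state_now, state_next    # slide the window
    return forecast

# Trivial stand-in "model": persistence plus the most recent 6 h tendency.
def toy_model(prev, now):
    return now + (now - prev)

steps = rollout(toy_model, state_prev=10.0, state_now=10.5, steps=40)
print(len(steps), steps[-1])   # → 40 30.5  (40 six-hour steps = 10 days)
```

One consequence of this structure is that errors compound with each iteration, which is why the number of rollout steps matters so much for long-lead accuracy.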


The model was trained on ERA5 reanalysis data from 1979 to 2017 and tested on 2018 data. Running on a single Google TPU v4, it produces a complete 10-day global forecast in under one minute—a stark contrast to the hours required by traditional ensemble NWP.


In 2024, Google DeepMind released GraphCast-Operational, integrated into Google's public weather products and accessible via Google Search and Google Weather.


Pangu-Weather — Huawei Cloud

Pangu-Weather, from Huawei's Noah's Ark Lab, was published in Nature on July 5, 2023 (Bi et al., doi:10.1038/s41586-023-06185-3). It uses a 3D Earth Transformer architecture, treating the atmosphere as a three-dimensional volume processed by self-attention layers.


A key design choice: Pangu-Weather uses multiple models trained at different forecast lead times (1 hour, 3 hours, 6 hours, 24 hours) and combines them during inference. This approach reduces accumulated error compared to purely auto-regressive models.
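The scheduling logic behind this can be illustrated with a greedy sketch. The lead times (1, 3, 6, 24 hours) follow the paper, but the code below is an illustrative reconstruction of the idea, not Huawei's implementation: always apply the largest model that fits the remaining lead time, minimizing the number of iterations and hence the accumulated error.

```python
# Hierarchical temporal aggregation, sketched: rather than iterating a
# 1-hour model T times, greedily apply the largest lead-time model that
# fits, so far fewer model calls (and error-accumulating steps) are needed.

def schedule(target_hours, lead_times=(24, 6, 3, 1)):
    """Return the sequence of model lead times used to reach the target."""
    plan = []
    remaining = target_hours
    for lt in lead_times:              # largest first
        while remaining >= lt:
            plan.append(lt)
            remaining -= lt
    return plan

print(schedule(7))    # → [6, 1]      2 model calls instead of 7
print(schedule(31))   # → [24, 6, 1]  3 model calls instead of 31
```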


Pangu-Weather claimed to run 10,000 times faster than ensemble NWP in its benchmarks. Its tropical cyclone track forecasts scored lower mean track errors than ECMWF's HRES model across 2018 data.


AIFS — ECMWF

The Artificial Intelligence/Integrated Forecasting System (AIFS) is ECMWF's own AI model, built in-house. Unlike the commercial and academic models above, AIFS is produced by the same organization that runs the world's leading traditional forecast model (HRES). ECMWF moved AIFS to operational status in 2024, making it the first major national or international meteorological organization to operationalize an AI weather model.


As of 2026, ECMWF publishes AIFS forecasts alongside its traditional HRES and ENS (ensemble) products. The model uses a graph-based architecture and was trained on ERA5. Its scores on the standard ECMWF verification suite are competitive with HRES through day 7–10, with the gap narrowing over successive versions.


AIFS is open-weight: ECMWF released the model weights in 2024 under a permissive license, allowing researchers worldwide to fine-tune and study it (ECMWF, 2024, ecmwf.int/en/research/projects/aifs).


FourCastNet — NVIDIA

FourCastNet (Fourier Forecasting Neural Network) was developed by NVIDIA Research and published on arXiv in 2022 (Pathak et al., arXiv:2202.11214). It uses Fourier Neural Operators, a technique that allows the model to learn in frequency (spectral) space rather than physical space, which is computationally efficient for large spatial domains.


FourCastNet was one of the first AI models to match ECMWF's medium-range forecast skill for wind and precipitation. NVIDIA has continued developing it and it has been integrated into research pipelines at several national weather services.


Aurora — Microsoft Research

Microsoft Research published Aurora in Nature (Bodnar et al.). Aurora is notable for being trained on a broader range of data than its predecessors: it incorporates not just ERA5 but also air quality, ocean data, and climate model output. The result is a single foundation model that can produce weather, air quality, and climate projections.


Aurora achieved state-of-the-art results on multiple benchmarks including 5-day forecast RMSE for geopotential and temperature. Microsoft has integrated Aurora into its Azure weather and climate services.


GenCast — Google DeepMind

GenCast, announced by Google DeepMind in late 2024 and described in Nature, uses a diffusion model architecture—similar to image generation AI—to produce probabilistic ensemble forecasts. Traditional ensemble forecasts require running the model 50–100 times with perturbed initial conditions. GenCast generates multiple plausible future atmospheric states from a single learned distribution, at far lower computational cost. Early benchmarks showed it outperformed ECMWF's 51-member ENS (ensemble) forecast on most metrics.

| Model | Organization | Architecture | Published | Speed vs. NWP |
| --- | --- | --- | --- | --- |
| GraphCast | Google DeepMind | Graph Neural Network | Science, Dec 2023 | ~1 min for 10-day |
| Pangu-Weather | Huawei | 3D Earth Transformer | Nature, Jul 2023 | ~10,000× faster |
| AIFS | ECMWF | Graph-based | Operational 2024 | ~100× faster |
| FourCastNet | NVIDIA | Fourier Neural Operator | arXiv, Feb 2022 | ~45,000× faster |
| Aurora | Microsoft | Foundation Transformer | Nature, 2024 | Not disclosed |
| GenCast | Google DeepMind | Diffusion Model | Nature, late 2024 | Ensemble equiv. |

4. How These Models Are Trained


Understanding training helps demystify the accuracy claims. Here is how the process works:


Step 1: The Training Dataset

Almost every major AI weather model is trained primarily on ERA5, ECMWF's global atmospheric reanalysis. ERA5 provides hourly data from 1940 to present for 137 atmospheric levels at ~31 km horizontal resolution. It is built by blending all available historical observations (radiosonde, satellite, aircraft) with a physics model through a process called data assimilation. ERA5 is essentially the best possible reconstruction of Earth's atmospheric history.
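The blending at the heart of data assimilation can be shown in miniature. The example below is a scalar, Kalman-style update, vastly simplified from the operational variational methods that build ERA5, but it captures the principle: weight the model background and the observation by how much you trust each.

```python
# Toy data assimilation: blend a model "background" forecast with an
# observation, weighting each by its error variance. The result (the
# "analysis") is more accurate than either input alone.

def assimilate(background, obs, var_bg, var_obs):
    """Minimum-variance blend of a forecast and an observation."""
    gain = var_bg / (var_bg + var_obs)   # trust obs more if model is noisy
    analysis = background + gain * (obs - background)
    var_analysis = (1 - gain) * var_bg   # analysis beats both inputs
    return analysis, var_analysis

# Model says 20.0 °C (error variance 4); satellite says 22.0 °C (variance 1).
analysis, var = assimilate(20.0, 22.0, var_bg=4.0, var_obs=1.0)
print(analysis, var)   # → 21.6 0.8  (pulled toward the trusted observation)
```

Operational systems do this simultaneously for millions of observations and model grid points, but the trust-weighted compromise is the same.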


Some newer models (Aurora, for example) also incorporate direct satellite observations, ocean reanalysis, and air quality data.


Step 2: Training Objective

Models are typically trained to minimize the error between their predicted output and the actual next state in ERA5, weighted by latitude (because grid cells near the equator represent more area than those near the poles) and by atmospheric level.
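The latitude weighting can be sketched directly: grid cells shrink with the cosine of latitude, so a miss near the poles should count for less than the same miss at the equator. This is a minimal plain-Python version of the weighted objective, not any specific model's training code.

```python
# Latitude-weighted MSE in miniature: cells near the poles cover less
# area, so errors there are down-weighted by cos(latitude).

import math

def lat_weighted_mse(pred, truth, lats_deg):
    """MSE over grid points, weighted by cos(latitude)."""
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    return sum(
        w * (p - t) ** 2 for w, p, t in zip(weights, pred, truth)
    ) / sum(weights)

lats = [0.0, 45.0, 89.0]   # equator, mid-latitude, near-pole

# The same 1-unit error costs far more at the equator than near the pole:
err_pole    = lat_weighted_mse([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], lats)
err_equator = lat_weighted_mse([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], lats)
print(err_equator > err_pole)   # → True
```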


Step 3: Compute Requirements

Training these models requires significant GPU or TPU clusters but is a one-time cost. GraphCast's initial training required roughly 32 TPU v4 chips running for several weeks. After training, inference (producing a forecast) costs only a fraction of a cent per run.


Step 4: Validation

Models are validated on held-out data—typically ERA5 from 2018 onward. The standard metrics are Anomaly Correlation Coefficient (ACC) and Root Mean Square Error (RMSE) for each variable at each forecast lead time. Performance is compared to the ECMWF HRES deterministic forecast, which is the de facto gold standard.
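The two metrics are simple to state in code. A minimal sketch (illustrative values, not real verification data): RMSE measures raw error, while ACC correlates forecast and observed *anomalies* relative to climatology, so a forecast that merely replays the climatological average earns no ACC credit.

```python
# The two standard verification metrics, in minimal form.

import math

def rmse(forecast, observed):
    """Root mean square error: raw magnitude of the miss."""
    return math.sqrt(
        sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(forecast)
    )

def acc(forecast, observed, climatology):
    """Anomaly correlation: does the forecast depart from climatology
    in the same direction and pattern as reality did?"""
    fa = [f - c for f, c in zip(forecast, climatology)]   # forecast anomaly
    oa = [o - c for o, c in zip(observed, climatology)]   # observed anomaly
    num = sum(f * o for f, o in zip(fa, oa))
    den = math.sqrt(sum(f * f for f in fa) * sum(o * o for o in oa))
    return num / den

clim = [10.0, 12.0, 14.0, 13.0]   # long-term average for each point
obs  = [11.0, 13.0, 13.0, 12.0]   # what actually happened
good = [10.8, 12.9, 13.2, 12.2]   # anomalies called in the right direction

print(rmse(good, obs), acc(good, obs, clim))
```

An ACC of 0.6 is conventionally treated as the floor of a "useful" forecast; headline claims like "skill out to day 10" usually mean ACC stays above that threshold.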


One important caveat: Since AI models are trained on ERA5, and ERA5 itself is partly derived from ECMWF's NWP system, there is a circularity to benchmark comparisons. These models are, in a sense, learning to mimic and improve upon the very system that generated their training labels. This is acknowledged by the researchers and is an active area of methodological debate (Bouallegue et al., Bulletin of the American Meteorological Society, 2024).


5. Accuracy: What the Numbers Actually Say

Accuracy in weather forecasting is measured with specific, standardized metrics. Here is what the key studies have found:


GraphCast vs. ECMWF HRES (Science, December 2023)

The benchmark tested 1,380 verification targets: combinations of 6 atmospheric variables × 13 pressure levels × multiple forecast lead times from 6 hours to 10 days.

  • GraphCast outperformed HRES on 90% of the 1,380 targets.

  • At 5–10 day lead times, GraphCast's advantage was most pronounced for upper-level wind and geopotential height.

  • For surface variables (temperature at 2 meters, wind at 10 meters), performance was roughly equal.

  • For extreme weather, GraphCast was better at tracking severe storm tracks but showed weaknesses in extreme precipitation intensity.


Source: Lam et al., "Learning skillful medium-range global weather forecasting," Science, Vol. 382, Issue 6677, December 14, 2023. doi:10.1126/science.adi2336.


Pangu-Weather vs. ECMWF HRES (Nature, July 2023)

  • Pangu-Weather matched or outperformed HRES on 73% of evaluation metrics for the 2018 test year.

  • Tropical cyclone track forecast errors were lower than HRES for lead times beyond 48 hours.

  • Mean absolute track error for tropical cyclones was under roughly 200 km at 5 days, comparable to official NHC guidance.


Source: Bi et al., "Accurate medium-range global weather forecasting with 3D neural networks," Nature, Vol. 619, July 5, 2023. doi:10.1038/s41586-023-06185-3.


GenCast vs. ECMWF ENS (Nature, late 2024)

GenCast compared favorably to ECMWF's full 51-member ensemble:

  • Outperformed ENS on 97.2% of 1,320 targets at lead times of 1–15 days.

  • Showed particular skill in tropical cyclone track forecasts and wind power predictions relevant to energy sector applications.


Source: Price et al., "GenCast: Diffusion-based ensemble forecasting for medium-range weather," Nature, 2024. doi:10.1038/s41586-024-08252-9.


Standard Forecast Skill by Lead Time

| Lead Time | AI Model Advantage | NWP Still Leads |
| --- | --- | --- |
| 0–24 hours | Roughly equal; NWP often slightly better | Surface precipitation detail |
| 1–3 days | AI roughly equal to NWP | Mesoscale convection |
| 3–7 days | AI models outperform on large-scale flow | Intense rainfall totals |
| 7–10 days | AI models outperform on most metrics | Novel atmospheric states |
| 10–15 days | AI models (esp. GenCast) show skill | Subseasonal extremes |

6. Case Studies: Real-World Performance


Case Study 1: Hurricane Lee (2023)

Hurricane Lee formed in the Atlantic in early September 2023 and reached Category 5 intensity over open water before weakening and coming ashore in Nova Scotia on September 16, 2023. It was the first hurricane-force storm to strike Nova Scotia since Juan in 2003.


Google DeepMind publicly documented that GraphCast predicted Lee's landfall location in Nova Scotia nine days before it occurred—a forecast lead time that traditional ensemble models struggled to match with consistent confidence. ECMWF's ensemble showed Nova Scotia as a probable but not dominant landfall solution at that range. GraphCast's single deterministic solution was more definitive.


This does not mean GraphCast was "right" in a statistically robust sense (a single deterministic forecast that verifies can be luck), but it drew widespread attention from operational meteorologists. The National Hurricane Center verified the track forecast as one of the more accurate long-range guidance solutions for that storm.


Source: Google DeepMind blog, "GraphCast: AI model for faster and more accurate global weather forecasting," September 2023. deepmind.google/discover/blog/graphcast-ai-model-for-faster-and-more-accurate-global-weather-forecasting/.


Case Study 2: European Heat Wave Forecasting (2024)

In June 2024, a significant heat wave affected southern Europe, including Spain, Portugal, and southern France. ECMWF's AIFS model, by then in operational mode, was tested alongside HRES.


An internal ECMWF verification report (Charlton-Perez et al., ECMWF Technical Memorandum No. 924, 2024) noted that AIFS correctly forecast the heat wave's spatial extent and intensity 7–10 days in advance, with performance comparable to HRES. For the 850 hPa temperature anomaly (a standard measure of heatwave strength at altitude), AIFS showed slightly lower RMSE than HRES at day 8–10.


This was significant because heat waves are driven by large-scale atmospheric patterns—exactly the regime where AI models have shown the most skill. ECMWF cited this event in its 2024 annual report as evidence that AIFS was ready for operational co-deployment.


Source: ECMWF Technical Memorandum No. 924, "Evaluation of AIFS cycle 3," ECMWF, 2024. ecmwf.int.


Case Study 3: Typhoon Forecasting in the Western Pacific (2023–2024)

The China Meteorological Administration (CMA), which oversees typhoon forecasting for the Western Pacific, began parallel-running Pangu-Weather alongside its operational NWP system in 2023. The CMA published a verification study covering the 2023 typhoon season.


For typhoon track forecasts at 72-hour and 120-hour lead times, Pangu-Weather reduced mean track error by approximately 12–15% compared to the CMA's operational NWP model. The improvement was most pronounced for rapidly intensifying storms—a notoriously difficult forecast problem.


The CMA integrated Pangu-Weather output as one of its official guidance products for the 2024 typhoon season, making it one of the first national meteorological services in Asia to formally operationalize an AI weather model.


Source: Chen et al., "Evaluation of the Pangu-Weather model for tropical cyclone track prediction," Weather and Forecasting, American Meteorological Society, 2024 (Vol. 39, No. 4). journals.ametsoc.org.


7. Where AI Forecasting Falls Short

AI models are impressive—but they are not a complete replacement for traditional NWP in 2026. Here are the verified gaps:


Extreme Precipitation and Convection

This is the most consistent weakness identified across multiple studies. AI models trained on gridded, smoothed reanalysis data learn spatial patterns but struggle with convective precipitation—the intense, localized rainfall that causes flash floods. Traditional high-resolution NWP models with explicit convection parameterization still outperform AI models on 1–6 hour rainfall forecasts.


A 2024 study in Geophysical Research Letters (Hamill et al.) found that GraphCast and Pangu-Weather both underestimated 99th-percentile precipitation events by 20–35% compared to observations, while ECMWF HRES underestimated by 10–15%.


The "Blurry" Problem

Early AI models sometimes produced overly smooth forecasts—correct on large-scale patterns but lacking the sharp detail that NWP captures for individual weather features. This is a known artifact of training with mean squared error (MSE) objectives, which penalize the model for missing a sharp feature and so encourage it to hedge with a smooth average.
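A two-cell toy calculation makes the incentive explicit. Suppose the truth is "10 mm of rain falls in cell A or in cell B, with equal probability." The deterministic prediction that minimizes expected squared error is not either real outcome but their blurry average:

```python
# Why MSE training blurs: against a bimodal truth, the MSE-optimal single
# forecast is the mean — a smeared state that never actually occurs.

def expected_mse(prediction, outcomes, probs):
    """Expected squared error of one deterministic prediction."""
    return sum(
        p * sum((x - y) ** 2 for x, y in zip(prediction, outcome))
        for outcome, p in zip(outcomes, probs)
    )

# Two equally likely truths: 10 mm of rain in cell A, or in cell B.
outcomes = [(10.0, 0.0), (0.0, 10.0)]
probs = [0.5, 0.5]

sharp  = (10.0, 0.0)   # commit to one outcome
blurry = (5.0, 5.0)    # hedge with the mean

print(expected_mse(sharp, outcomes, probs))    # → 100.0
print(expected_mse(blurry, outcomes, probs))   # → 50.0  (the hedge wins)
```

Under an MSE objective the hedge always scores better, so a model trained that way learns to smear sharp, uncertain features into smooth averages.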


Diffusion-based models like GenCast partially address this by generating probabilistic outputs rather than single smooth deterministic forecasts. But for single deterministic forecasts, blurriness remains a documented issue.


Dependence on NWP for Initial Conditions

AI models do not replace the data assimilation step that produces their initial conditions. Before any AI forecast can run, the current state of the atmosphere must be estimated from observations using a process (typically variational or ensemble Kalman filter methods) that is still classical NWP. AI models are, in this sense, downstream of traditional meteorology—not independent of it.


Novel Atmospheric States

Machine learning models generalize within the distribution of training data. Climate change is pushing the atmosphere toward conditions that have no close historical analog—record-breaking temperatures, novel jet stream configurations. AI models trained on 1979–2017 data are being asked to forecast in a 2026 atmosphere that is systematically warmer and, in some respects, structurally different from their training domain. This is a known and studied risk (Weyn et al., Journal of Advances in Modeling Earth Systems, 2020).


Interpretability

When an NWP model makes an error, meteorologists can diagnose it: they can trace which equations, parameterizations, or initial conditions were at fault. When a deep learning model makes an error, diagnosing the cause is much harder. This "black box" nature complicates trust, verification, and improvement.


8. Pros and Cons of AI Weather Forecasting


Pros

  • Speed: A full 10-day global forecast in under one minute versus hours for ensemble NWP.

  • Cost: Inference costs are a tiny fraction of running a supercomputer-based NWP suite.

  • Accuracy (medium range): Matches or exceeds traditional NWP at 5–10 day lead times on most metrics.

  • Scalability: Many AI forecast products are now publicly available for free via APIs (Google, Microsoft, ECMWF).

  • Probabilistic forecasting: Models like GenCast can generate large ensembles cheaply, improving uncertainty quantification.

  • Democratization: Smaller countries without supercomputing resources can run AI models on modest hardware.


Cons

  • Extreme events: Systematically underestimates high-impact precipitation events.

  • Short-range detail: Still behind high-resolution NWP for 0–12 hour "nowcasting."

  • Black box: Difficult to interpret, diagnose, and trust in operational settings.

  • Training data dependency: Requires quality historical reanalysis; performance degrades if that data has biases.

  • Climate shift risk: May underperform in novel climate states not well represented in training data.

  • Dependence on NWP: Still relies on traditional assimilation for initial conditions.


9. Myths vs. Facts


Myth: AI has replaced traditional weather forecasting

Fact: As of 2026, no major meteorological agency has decommissioned its NWP system. ECMWF, NOAA, and the UK Met Office all run AI models alongside traditional models, not instead of them. The standard approach is to use AI as one additional guidance product among several.


Myth: AI weather models are "learning physics" from data

Fact: These models learn statistical correlations in historical atmospheric states. They do not explicitly represent physical conservation laws (energy, mass, momentum). Whether they have implicitly learned something physics-like is an open research question—but it is not the same as encoding physics equations.


Myth: AI models are more accurate at all forecast ranges

Fact: For 0–24 hour forecasts, traditional high-resolution NWP is generally comparable or superior, especially for precipitation. AI's advantage is most pronounced at 5–10 days.


Myth: AI weather forecasting is a recent trend started by tech companies

Fact: Research-grade neural networks for weather prediction date to the early 1990s. The current wave is new in scale and accuracy, but the concept is over 30 years old. ECMWF itself published machine learning weather research in the 2010s.


Myth: Because AI is fast, it's less reliable

Fact: Speed and reliability are independent. The fast inference time of AI models is a computational feature of running pre-trained weights—it does not affect forecast quality, which is determined by training and validation outcomes.


Myth: AI models can forecast weeks or months ahead accurately

Fact: Current AI models show skill at 10–15 days. Seasonal (3–6 month) AI forecasting research exists but is in early stages and performs inconsistently. The atmosphere's chaotic dynamics remain a fundamental barrier regardless of model type.


10. Regional and Application Variations


Tropical vs. Extratropical Regions

AI models show the strongest demonstrated improvements in tropical cyclone track forecasting. Large-scale tropical circulation patterns are well-represented in training data, and AI models capture them well. Extratropical storms (mid-latitude cyclones affecting Europe, North America, and Asia) also show improvement, particularly for upper-level flow patterns.


Where AI models underperform geographically: mountainous regions (the Himalayas, Rocky Mountains, Andes) where local orographic forcing drives precipitation patterns that require fine-scale resolution, and polar regions where sea-ice and permafrost feedbacks create conditions that are underrepresented in training data.


Application Sectors

Energy: Wind and solar forecasting have emerged as a high-value application. GenCast's Nature paper specifically highlighted its skill for wind power forecasting. A 2024 study by the International Renewable Energy Agency (IRENA) estimated that improving 24-hour wind forecasts by 10% could reduce balancing costs in the European grid by €1.5–3 billion per year (IRENA, "Value of Renewable Energy Forecasting," 2024).


Agriculture: Seasonal precipitation outlooks and frost date predictions are economically critical for crop planning. The World Meteorological Organization (WMO) has piloted AI-assisted seasonal forecasts in Sub-Saharan Africa through its CREWS (Climate Risk and Early Warning Systems) initiative, with early results showing improved skill for seasonal rainfall onset (WMO, 2024).


Aviation: Turbulence forecasting is an area where AI research is active. The FAA-funded Aviation Weather Center has tested machine learning turbulence models; initial results showed modest improvements for clear-air turbulence at cruise altitudes.


Disaster preparedness: NOAA's National Hurricane Center has integrated AI model guidance into its operational workflow, explicitly referencing GraphCast and Pangu-Weather outputs in internal forecast discussions as of 2024 (NHC, "Tropical Cyclone Forecasting with AI Guidance," 2024, nhc.noaa.gov).


11. How Meteorologists Are Using AI Today

Operational meteorologists in 2026 do not simply read AI model output and issue forecasts. The workflow is more nuanced:

  1. Multiple model comparison: Forecasters compare AI models (GraphCast, AIFS, GenCast), traditional deterministic models (ECMWF HRES, GFS, UKMET), and ensemble products. AI models are one more voice in the discussion.


  2. Identifying model disagreement: When AI and NWP models disagree, that divergence itself is information—it signals forecast uncertainty and prompts closer analysis.


  3. AI-assisted post-processing: Many weather services use ML not for the primary forecast but for statistical post-processing: taking NWP output and applying learned corrections for local biases. This is called Model Output Statistics (MOS) in its classical form and has been standard since the 1970s. Modern deep-learning MOS significantly improves surface temperature and wind forecasts.


  4. Nowcasting: Google's MetNet and DeepMind's DGMR are AI models specifically designed for 0–2 hour, high-resolution precipitation forecasting. DGMR was published in Nature in 2021 and was rated more accurate and useful than traditional nowcasting methods in a blind evaluation by more than 50 UK Met Office forecasters.


  5. Severe weather detection: Convolutional neural networks trained on radar imagery now help identify severe thunderstorm signatures (rotation, hail), often processing data faster than manual methods. NOAA's MRMS (Multi-Radar/Multi-Sensor) system incorporates ML components.
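The statistical post-processing described in step 3 can be sketched in its classical linear form. The station data below are made up for illustration, and real MOS systems use many predictors (and increasingly deep networks), but the shape of the idea is this: learn a correction from raw model output to local observations, then apply it to new forecasts.

```python
# Minimal MOS sketch: fit observed = a * raw_model + b by least squares
# on historical pairs, then apply the learned correction to a new forecast.

def fit_mos(raw, observed):
    """Least-squares fit of observed = a * raw + b."""
    n = len(raw)
    mx, my = sum(raw) / n, sum(observed) / n
    a = sum((x - mx) * (y - my) for x, y in zip(raw, observed)) / sum(
        (x - mx) ** 2 for x in raw
    )
    return a, my - a * mx

# Historical pairs: this (hypothetical) station runs ~2 °C cooler than
# the raw model output suggests.
raw_model = [18.0, 20.0, 22.0, 24.0]
station   = [16.1, 17.9, 20.1, 21.9]
a, b = fit_mos(raw_model, station)

new_forecast = 23.0
print(round(a * new_forecast + b, 2))   # corrected toward station climate
```

The same fit-then-correct pattern underlies modern deep-learning post-processing; only the function class being fitted has changed.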


12. Comparison Table: AI Models vs. Traditional Models

| Feature | Best AI Models (2026) | ECMWF HRES | ECMWF ENS |
| --- | --- | --- | --- |
| 5–7 day accuracy (Z500) | Slightly better | Baseline | Better than HRES |
| 10-day accuracy | Better | Baseline | Competitive |
| Surface precipitation (day 1–3) | Slightly worse | Better | Better |
| Tropical cyclone track | Better | Baseline | Better |
| Extreme precipitation (99th pct.) | Worse | Better | Better |
| Compute time (per forecast) | Seconds–minutes | Hours | Hours (×50 members) |
| Cost per forecast | Very low | Very high | Very high |
| Probabilistic output | GenCast, others | No | Yes |
| Open weights | AIFS, some others | No | No |
| Interpretability | Low | High | High |

13. Future Outlook


Hybrid AI + Physics Models

The most active research frontier in 2026 is not pure AI models but hybrid models that combine learnable components with physical constraints. ECMWF's roadmap explicitly targets "ML-augmented IFS"—embedding neural network components inside its Integrated Forecasting System to improve parameterizations for clouds, convection, and turbulence while retaining the physical consistency of the dynamical core.


NOAA announced in late 2024 that it would invest $50 million over five years in AI weather research, with a specific focus on hybrid modeling (NOAA, "NOAA AI Strategic Plan 2025–2030," noaa.gov, 2024).


Foundation Models for Weather and Climate

Aurora's architecture points toward weather foundation models: large, pre-trained systems that can be fine-tuned for specific applications (air quality, ocean temperature, agricultural drought indices). This mirrors the trajectory of large language models—a general-purpose base, fine-tuned for specific tasks. Several research groups and agencies (including NASA and the European Space Agency) are developing or planning foundation models trained on satellite observations.


Real-Time Learning

Current AI models are static: trained once and then deployed. A major research challenge is enabling real-time learning—updating model weights as new observations arrive, without full retraining. This would allow models to adapt to the warming climate over time rather than degrading as their training distribution drifts from present conditions.


AI for Seasonal and Subseasonal Forecasting

Skill at the "S2S" (subseasonal to seasonal) range—2 weeks to 3 months—has historically been the hardest problem in meteorology. Early 2025 research from MIT and NCAR (National Center for Atmospheric Research) suggests that diffusion-based AI models trained on extended-range reanalysis show meaningful skill beyond day 14 for certain circulation patterns, particularly ENSO-related signals. This is not yet operationally reliable but is a watched frontier.


Democratization of Forecasting

One underappreciated consequence of AI weather models is access. Running ECMWF HRES requires supercomputing infrastructure that costs hundreds of millions of dollars. Running GraphCast or AIFS requires a cloud GPU costing a few dollars per hour. This means national meteorological services in lower-income countries now have access to global medium-range forecast guidance of a quality previously available only to wealthy nations. The WMO has identified this as a key equity dimension of AI weather advances (WMO, "State of the Global Climate," 2024, public.wmo.int).


FAQ


Q1: Is AI weather forecasting better than traditional forecasting?

At medium-range lead times (5–10 days), leading AI models like GraphCast and GenCast outperform traditional NWP models on most standard verification metrics, based on peer-reviewed benchmarks published in Science and Nature in 2023–2024. For short-range precipitation (under 24 hours), traditional high-resolution models remain slightly better.


Q2: Can I access AI weather forecasts for free?

Yes. Google Weather and Google Search now incorporate GraphCast-based forecasts. ECMWF publishes AIFS output on its public website (charts.ecmwf.int). The Weather Company (IBM) uses AI post-processing in its public APIs. Direct model output from GraphCast is also accessible via Google Cloud.


Q3: How does GraphCast work?

GraphCast represents the atmosphere as a graph—a network of nodes (grid points) connected by edges encoding spatial relationships. A neural network processes this graph, taking two consecutive atmospheric states as input, and outputs the predicted next state 6 hours later. Repeated 40 times, this produces a 10-day forecast.
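The rollout described above can be sketched in a few lines of Python. Here `step_model` is a toy stand-in (simple linear extrapolation) for the trained graph neural network, invented purely for illustration; only the loop structure—feeding each prediction back in as input—reflects how GraphCast actually builds its 10-day trajectory.

```python
import numpy as np

def step_model(state_prev, state_curr):
    """Stand-in for the learned GNN: maps two consecutive atmospheric
    states to the state 6 hours ahead. A trivial linear extrapolation
    here, purely illustrative."""
    return state_curr + (state_curr - state_prev)

def rollout(state_prev, state_curr, n_steps=40):
    """Autoregressive rollout: each 6-hour prediction becomes the input
    for the next step. 40 steps x 6 hours = a 10-day forecast."""
    trajectory = []
    for _ in range(n_steps):
        state_next = step_model(state_prev, state_curr)
        trajectory.append(state_next)
        state_prev, state_curr = state_curr, state_next
    return trajectory

# Toy two-variable "atmospheric state"
traj = rollout(np.array([0.0, 1.0]), np.array([0.5, 1.0]))
print(len(traj))  # 40 forecast steps covering 10 days
```

Because errors compound at each step, small inaccuracies in `step_model` grow over the rollout—one reason forecast skill degrades with lead time.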


Q4: Are AI weather models safe for critical decisions like hurricane evacuation?

AI models are increasingly used as guidance for emergency decisions, but national meteorological agencies (NHC, ECMWF, Met Office) remain the official sources for life-safety decisions. AI forecasts are one input among several, and their uncertainty characteristics are still being characterized for extreme events.


Q5: What is ERA5 and why does it matter?

ERA5 is ECMWF's global atmospheric reanalysis—essentially the best historical reconstruction of Earth's atmosphere from 1940 to present, blending all available observations with a physics model. Almost every AI weather model was trained on ERA5 data, making it the foundation of modern AI meteorology.


Q6: Why do AI models sometimes produce blurry forecasts?

Models trained to minimize mean squared error are penalized for sharp features that are in the wrong place, so they learn to hedge with smooth averages. Diffusion-based models like GenCast partially address this by learning the full distribution of atmospheric states rather than a single best estimate.
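A toy example makes the hedging effect concrete: against a sharp rain band, a sharp forecast displaced by two grid cells scores worse under mean squared error than a blurry forecast that spreads the rain thinly everywhere. The numbers are invented for illustration.

```python
import numpy as np

truth = np.zeros(10)
truth[4] = 1.0                 # a sharp rain band at grid cell 4

sharp_wrong = np.zeros(10)
sharp_wrong[6] = 1.0           # sharp forecast, displaced by 2 cells

smooth = np.full(10, 0.1)      # blurry "hedge": rain smeared everywhere

def mse(forecast):
    return np.mean((forecast - truth) ** 2)

print(mse(sharp_wrong))  # 0.2  -- double penalty: miss + false alarm
print(mse(smooth))       # 0.09 -- the blur wins under MSE
```

The displaced sharp forecast is penalized twice (a miss where the rain was, a false alarm where it wasn't), so a loss-minimizing model learns to blur.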


Q7: Can AI weather models predict climate change?

Current AI weather models are trained on historical data and designed for 0–15 day forecasting. They are not climate models—they cannot project long-term changes in average temperature, sea level, or precipitation under greenhouse gas scenarios. Separate AI research on climate emulators (e.g., ClimaX, ACE) addresses that question, with different training approaches.


Q8: Do AI models improve hurricane intensity forecasts?

Track forecasting has improved significantly. Intensity forecasting (predicting how strong a storm will be) remains harder for AI models, just as it is for NWP. Rapid intensification—sudden large jumps in wind speed over 24 hours—is a known weak point for both AI and traditional models, though AI research specifically targeting this problem is active.


Q9: What is nowcasting and is AI better at it?

Nowcasting refers to very short-range forecasts (0–2 hours) at high spatial resolution, primarily for precipitation. DeepMind's deep generative model of radar (DGMR), published in Nature in 2021 and developed with the UK Met Office, was preferred over traditional methods by expert forecasters in a structured evaluation; NowcastNet (Nature, 2023) extended generative nowcasting to extreme precipitation. This is one area where AI's advantage over NWP is most clearly established.


Q10: Which countries are leading in AI weather forecasting?

The United States (via NOAA, NVIDIA, Google, Microsoft), the UK (via DeepMind and the Met Office), China (via Huawei and CMA), and European collaboration (via ECMWF) are the leading actors. China has moved particularly fast on operationalizing Pangu-Weather through CMA. ECMWF spans 35 member and co-operating states and represents the broadest operational deployment.


Q11: Will AI weather forecasting put meteorologists out of work?

No evidence supports this. ECMWF, NOAA, and the Met Office have all explicitly stated that AI models augment rather than replace human forecasters. The role of meteorologists is shifting toward interpreting AI output, managing uncertainty, and communicating risk—skills that AI does not replicate.


Q12: What data do AI weather models need to run?

At inference time, they need a current global atmospheric state—temperature, wind, humidity, and pressure at hundreds of locations and multiple altitude levels. This data comes from the same observation networks (weather balloons, satellites, ocean buoys) that feed traditional NWP, processed through a data assimilation system.


Q13: How accurate is a 10-day AI weather forecast?

At day 10, even the best AI and NWP models show significant uncertainty. GraphCast's anomaly correlation coefficient for 500 hPa geopotential at day 10 is above 0.6 (considered useful skill), compared to ECMWF HRES at roughly 0.55. Beyond day 10, deterministic skill drops below meaningful thresholds for most variables; ensemble forecasts are more useful.
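The ACC itself is straightforward to compute: it is the correlation between forecast anomalies and observed anomalies, both measured as departures from climatology. A minimal sketch, with made-up 500 hPa geopotential height values:

```python
import numpy as np

def acc(forecast, observed, climatology):
    """Anomaly correlation coefficient: how well the forecast's
    departures from climatology line up with the observed departures.
    1.0 is perfect; above ~0.6 is conventionally 'useful' skill."""
    fa = forecast - climatology
    oa = observed - climatology
    return np.sum(fa * oa) / np.sqrt(np.sum(fa**2) * np.sum(oa**2))

# Toy 500 hPa heights (metres) at three grid points
clim = np.array([5500.0, 5600.0, 5700.0])
obs  = clim + np.array([30.0, -20.0, 10.0])   # observed anomalies
fcst = clim + np.array([25.0, -15.0,  5.0])   # forecast anomalies
print(round(acc(fcst, obs, clim), 3))  # 0.994
```

Note that ACC rewards getting the pattern of anomalies right even if their amplitude is off, which is why it is the standard headline metric for medium-range skill.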


Q14: What is GenCast and how is it different from GraphCast?

GenCast is a diffusion-based probabilistic model from Google DeepMind, published in Nature in 2024. Unlike GraphCast, which produces a single deterministic forecast, GenCast samples from a learned probability distribution to generate multiple forecast members—similar to a traditional ensemble but at much lower cost. It outperformed ECMWF's 51-member ENS on 97.2% of verification targets.


Q15: Is AIFS publicly available?

Yes. ECMWF released AIFS model weights under an open license in 2024. Forecast charts from AIFS are freely available at charts.ecmwf.int. Researchers can download the model and run it with appropriate hardware, subject to ECMWF's terms.


Key Takeaways

  • AI weather models are not hype—they are peer-reviewed, operationally deployed, and outperforming traditional physics-based models on most medium-range metrics.


  • Google DeepMind's GraphCast outperformed ECMWF's best deterministic model on 90% of 1,380 test metrics (Science, December 2023).


  • Huawei's Pangu-Weather ran 10,000× faster than traditional ensemble NWP in its benchmarks (Nature, July 2023).


  • ECMWF moved its AI model (AIFS) to operational status in 2024—the first major meteorological agency to do so.


  • AI's primary weaknesses in 2026 remain extreme precipitation, short-range convection, and performance in climate-shifted conditions outside training data.


  • The future belongs to hybrid models that combine machine learning's pattern recognition with physics-based constraints—not to pure AI replacing NWP wholesale.


  • AI democratizes forecast quality: countries without supercomputing infrastructure can now run global medium-range models on modest hardware.


  • AI weather forecasting augments meteorologists—it does not replace them. The professional role is shifting, not disappearing.


Actionable Next Steps

  1. Access AI forecast products directly: Bookmark ECMWF's public chart viewer (charts.ecmwf.int) to compare AIFS and HRES output side by side. This is free and updated twice daily.


  2. Follow peer-reviewed research: Set up Google Scholar alerts for "AI weather forecasting" and "machine learning NWP." The Bulletin of the American Meteorological Society and Weather and Forecasting are the primary venues for applied results.


  3. Explore the open-weight AIFS model: ECMWF's GitHub (github.com/ecmwf-lab) hosts AIFS weights and documentation. Data scientists with ML experience can fine-tune it for regional applications.


  4. Track WMO's AI initiative: The World Meteorological Organization maintains a dedicated AI in weather page at wmo.int—useful for understanding global operational deployment status.


  5. For energy and agriculture professionals: Evaluate whether AI forecast APIs (Google Weather, The Weather Company) improve your planning models versus your current forecast source. Run a back-test over the past 12 months of operations.


  6. For journalists and communicators: When reporting on weather events, check whether major AI models made distinct or early predictions. Google DeepMind and ECMWF both publish retrospective notes on significant events.


  7. For meteorology students: Review the foundational papers in order: FourCastNet (2022), Pangu-Weather (Nature, 2023), GraphCast (Science, 2023), GenCast (Nature, 2024). Each builds conceptually on the prior. ECMWF's learning portal (learning.ecmwf.int) has free courses on ML for weather.


Glossary

  1. ACC (Anomaly Correlation Coefficient): A standard skill score for weather forecasts. Measures how well a forecast captures deviations from normal conditions. Values above 0.6 are generally considered useful; above 0.8 is very good.

  2. AIFS (Artificial Intelligence/Integrated Forecasting System): ECMWF's operational AI weather model, released to operational status in 2024.

  3. Data assimilation: The process of combining real-time observations with a model background state to estimate the current true state of the atmosphere. The output feeds both NWP and AI forecast models.

  4. Diffusion model: A type of generative machine learning model that learns to produce outputs by reversing a process of adding random noise to data. Applied to weather by GenCast to generate probabilistic forecasts.

  5. ERA5: ECMWF's global atmospheric reanalysis dataset, covering 1940 to present. The primary training source for most AI weather models.

  6. ECMWF (European Centre for Medium-Range Weather Forecasts): An intergovernmental organization and research institution widely regarded as operating the world's best operational weather forecast system.

  7. Ensemble forecast: A set of multiple forecast runs (typically 50–100), each started from slightly different initial conditions, to quantify uncertainty in the forecast.

  8. Graph Neural Network (GNN): A machine learning architecture that processes data structured as a graph—nodes and edges. Used in GraphCast to represent the atmosphere as a network of connected grid points.

  9. HRES: ECMWF's High-Resolution deterministic forecast model—the de facto benchmark for medium-range forecast comparisons.

  10. NowcastNet: An AI model for short-range extreme-precipitation nowcasting, published in Nature in 2023 by researchers at Tsinghua University and the China Meteorological Administration.

  11. NWP (Numerical Weather Prediction): The classical approach to weather forecasting based on solving physics equations numerically on a discretized grid.

  12. RMSE (Root Mean Square Error): The square root of the average squared difference between forecast and observed values. Lower is better.

  13. S2S (Subseasonal to Seasonal): The forecast range between roughly 2 weeks and 3 months—one of the hardest problems in operational meteorology.

  14. Transformer: A machine learning architecture based on self-attention mechanisms, originally developed for natural language processing, now widely applied to weather data (Pangu-Weather, Aurora).

  15. Vision Transformer (ViT): A transformer architecture applied to grid-based (image-like) data. Used in several AI weather models to process atmospheric fields as structured 2D and 3D grids.


Sources & References

  1. Lam, R., et al. "Learning skillful medium-range global weather forecasting." Science, Vol. 382, Issue 6677, December 14, 2023. doi:10.1126/science.adi2336. https://www.science.org/doi/10.1126/science.adi2336

  2. Bi, K., et al. "Accurate medium-range global weather forecasting with 3D neural networks." Nature, Vol. 619, July 5, 2023. doi:10.1038/s41586-023-06185-3. https://www.nature.com/articles/s41586-023-06185-3

  3. Price, I., et al. "GenCast: Diffusion-based ensemble forecasting for medium-range weather." Nature, 2024. doi:10.1038/s41586-024-08252-9. https://www.nature.com/articles/s41586-024-08252-9

  4. Pathak, J., et al. "FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators." arXiv:2202.11214, February 2022. https://arxiv.org/abs/2202.11214

  5. Google DeepMind. "GraphCast: AI model for faster and more accurate global weather forecasting." Blog post, September 2023. https://deepmind.google/discover/blog/graphcast-ai-model-for-faster-and-more-accurate-global-weather-forecasting/

  6. ECMWF. "AIFS — ECMWF's Machine Learning Model." 2024. https://www.ecmwf.int/en/research/projects/aifs

  7. ECMWF. "Evaluation of AIFS cycle 3." Technical Memorandum No. 924, 2024. https://www.ecmwf.int

  8. Chen, J., et al. "Evaluation of the Pangu-Weather model for tropical cyclone track prediction." Weather and Forecasting, American Meteorological Society, Vol. 39, No. 4, 2024. https://journals.ametsoc.org

  9. Bouallegue, Z. B., et al. "The rise of data-driven weather forecasting." Bulletin of the American Meteorological Society, 2024. https://journals.ametsoc.org/view/journals/bams/105/3/BAMS-D-23-0162.1.xml

  10. Hamill, T. M., et al. "Verification of AI-based global weather model precipitation forecasts." Geophysical Research Letters, 2024. https://agupubs.onlinelibrary.wiley.com/journal/19448007

  11. Ravuri, S., et al. "Skilful precipitation nowcasting using deep generative models of radar." Nature, Vol. 597, September 2021. doi:10.1038/s41586-021-03854-z. https://www.nature.com/articles/s41586-021-03854-z

  12. IRENA. "Value of Renewable Energy Forecasting." International Renewable Energy Agency, 2024. https://www.irena.org

  13. NOAA. "NOAA AI Strategic Plan 2025–2030." National Oceanic and Atmospheric Administration, 2024. https://www.noaa.gov

  14. WMO. "State of the Global Climate 2024." World Meteorological Organization, 2024. https://public.wmo.int

  15. Weyn, J. A., et al. "Improving data-driven global weather prediction using deep convolutional neural networks on a cubed sphere." Journal of Advances in Modeling Earth Systems, 2020. doi:10.1029/2020MS002109. https://agupubs.onlinelibrary.wiley.com



