r04·beginner-intermediate

Anomaly vs climatology — comparing this year to the long-term normal

GES DISC (GPM, MERRA-2) · OB.DAAC (MODIS SST) · PO.DAAC (GRACE-FO) · NSIDC (SMAP)

The single most reusable analytical pattern in Earth-science remote sensing. **Almost every "is this unusual?" question is a difference between the current observation and the long-term seasonal cycle.** Get the pattern right once and reuse for: rainfall anomalies, marine heatwaves, soil-moisture droughts, vegetation greenness, sea-ice extent, total water storage, NO₂ pollution trends. Same shape every time.

The pattern in 4 steps

  1. Define the climatology window. Conventional choices: 1981–2010 (the previous WMO standard normal), 1991–2020 (the current WMO standard normal), or the full available record minus the last 3 years (to avoid the current event biasing the baseline). State your choice explicitly.
  2. Aggregate to the time grain of interest. Daily, monthly, yearly. The right choice depends on the variable’s natural autocorrelation: daily for fast-moving (precipitation, NO₂), monthly for medium (soil moisture, SST), yearly for slow (ice mass, total water storage).
  3. Compute per-pixel climatology = mean (or median) of the variable across all years in the baseline window for each day-of-year (or month-of-year). Standard deviation too if you want anomaly-Z-scores.
  4. Anomaly = observation − climatology. Optionally normalize: percentage of normal, standardized anomaly (Z-score), or percentile rank within the climatology distribution.
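
Before bringing in real data, the four steps can be sketched for a single pixel in plain Python. The rainfall values below are synthetic and the helper name `anomaly_stats` is hypothetical; the sketch only illustrates the arithmetic, including the percentile-rank option from step 4.

```python
from statistics import mean, stdev


def anomaly_stats(baseline_values, observation):
    """Compare one observation with same-month values from the baseline years."""
    clim = mean(baseline_values)        # step 3: climatology
    spread = stdev(baseline_values)     # sample standard deviation
    absolute = observation - clim       # step 4: absolute anomaly
    percent = 100 * observation / clim if clim > 0 else float("nan")
    z = absolute / spread if spread > 0 else float("nan")
    # Percentile rank: share of baseline years at or below the observation.
    rank = 100 * sum(v <= observation for v in baseline_values) / len(baseline_values)
    return {"climatology": clim, "absolute": absolute,
            "percent_of_normal": percent, "z_score": z, "percentile": rank}


# Synthetic July rainfall totals (mm) for a 10-year baseline at one pixel.
july_baseline = [210, 180, 250, 190, 220, 200, 170, 240, 230, 210]
stats = anomaly_stats(july_baseline, observation=120)
print(stats)
```

With only ten baseline years the percentile rank moves in steps of 10, which is one reason a short record limits how finely an event can be ranked.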

Minimal worked example — rainfall anomaly for a region

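```python
import earthaccess
import xarray as xr

earthaccess.login(strategy="netrc")

aoi = (74, 18, 78, 21)  # Maharashtra, as (west, south, east, north)
west, south, east, north = aoi
clim_window = ("2001-01-01", "2020-12-31")  # 20-yr baseline
analysis_year = 2025

# 1. Pull all baseline data
baseline = earthaccess.search_data(short_name="GPM_3IMERGM",
                                   bounding_box=aoi,
                                   temporal=clim_window)
# Open every granule in one call. Assumption: IMERG HDF5 files keep their
# variables in the "Grid" group, so the group is named explicitly.
ds_base = xr.open_mfdataset(earthaccess.open(baseline),
                            group="Grid", engine="h5netcdf")
# bounding_box only selects granules, and IMERG granules are global,
# so clip to the AOI here. precipitation: (time, lon, lat), mean rate in mm/hr.
pr_base = ds_base.precipitation.sel(lon=slice(west, east),
                                    lat=slice(south, north))

# 2. Compute monthly climatology per pixel
climatology = pr_base.groupby("time.month").mean("time")
clim_std = pr_base.groupby("time.month").std("time")

# 3. Pull this year's data
current = earthaccess.search_data(short_name="GPM_3IMERGM",
                                  bounding_box=aoi,
                                  temporal=(f"{analysis_year}-01-01",
                                            f"{analysis_year}-12-31"))
ds_cur = xr.open_mfdataset(earthaccess.open(current),
                           group="Grid", engine="h5netcdf")
pr_cur = ds_cur.precipitation.sel(lon=slice(west, east),
                                  lat=slice(south, north))

# 4. Anomaly variants
#    Absolute: this minus climatology, in the product's units
#    (mm/hr; multiply by the hours in each month for mm/month)
absolute_anom = pr_cur.groupby("time.month") - climatology

#    Percent of normal: 100 × current / climatology
#    (pixels with a zero climatology are masked to avoid dividing by zero)
percent_normal = (pr_cur.groupby("time.month")
                  / climatology.where(climatology > 0)) * 100

#    Z-score: (current - climatology) / std
#    Group again before dividing so each month is matched to its own std;
#    dividing the anomaly by clim_std directly would broadcast over "month".
z_score = absolute_anom.groupby("time.month") / clim_std.where(clim_std > 0)

# 5. Plot any of the three; Z-score is best for cross-region comparison.
```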
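
GPM_3IMERGM stores precipitation as a monthly mean rate in mm/hr, so an absolute anomaly is easier to read after conversion to a monthly total. A minimal sketch of that conversion follows; the helper name `rate_to_monthly_total` and the rate values are made up for illustration.

```python
import calendar


def rate_to_monthly_total(rate_mm_per_hr, year, month):
    """Convert a monthly mean rate (mm/hr) to a monthly total (mm)."""
    days = calendar.monthrange(year, month)[1]  # handles leap years
    return rate_mm_per_hr * 24 * days


# Made-up values: a July mean rate of 0.25 mm/hr against a 0.40 mm/hr climatology.
observed_mm = rate_to_monthly_total(0.25, 2025, 7)
normal_mm = rate_to_monthly_total(0.40, 2025, 7)
print(observed_mm - normal_mm)  # absolute anomaly in mm for the month
```

Percent of normal and Z-scores are unaffected by this scaling when the same month is compared with itself, so the conversion matters mainly for the absolute anomaly.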

Variants for other variables

| Variable | Short name | Cadence | Climatology window |
|---|---|---|---|
| Rainfall | GPM_3IMERGM | monthly | 2001-2020 (~20 yr IMERG) |
| Sea-surface temperature | MUR-JPL-L4-GLOB-v4.1 | daily/monthly | 2002-2020 (~20 yr MUR) |
| Soil moisture (surface) | SPL3SMP_E | daily | 2015-2024 (~10 yr, short!) |
| Soil moisture (root zone) | SPL4SMAU | daily/3-hr | 2015-2024 |
| Total water storage | GRACEFO_L3_* | monthly | 2004-2018 (avoid trend) |
| Vegetation greenness | MOD13Q1 | 16-day | 2001-2020 |
| Sea ice extent | NSIDC G02135 | daily | 1991-2020 (WMO standard) |

Common gotchas

  • Don’t include the current event in your baseline. If you’re analyzing 2024-25 drought, don’t compute climatology over 2002-2024 — that biases the baseline toward dry. Stop the climatology at 2022 or earlier.
  • Beware of trend embedded in the climatology. A multi-decade SST or sea-ice record carries a real climate trend; an anomaly computed against the plain mean mixes that trend with the event signal, so recent years look unusual partly because they are recent. Detrend first if you’re studying a specific event.
  • Per-pixel climatology is data-hungry. With only about 10 years of SMAP (2015 onward), your 9 km daily climatology has roughly 10 samples per day-of-year, which is very noisy. Aggregate spatially or temporally before computing climatology.
  • Z-scores assume normality. Precipitation, snow, and similar variables are heavily skewed (zeros + heavy tail). Percentiles or distribution-fitted indices (SPI, which conventionally fits a gamma distribution, or SPEI) are more honest for precipitation.
  • Land-mask and water-mask boundaries can produce artifacts (zero-or-nan flips per pixel). Always inspect a map before quoting numbers.
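
The trend gotcha above can be illustrated with a least-squares fit in plain Python. The SST series is synthetic, the helper `linear_fit` is hypothetical, and extrapolating the baseline trend to the analysis year is itself an assumption that should be checked against the real record.

```python
def linear_fit(years, values):
    """Ordinary least-squares slope and intercept of values against years."""
    n = len(years)
    y_bar = sum(years) / n
    v_bar = sum(values) / n
    sxx = sum((y - y_bar) ** 2 for y in years)
    sxy = sum((y - y_bar) * (v - v_bar) for y, v in zip(years, values))
    slope = sxy / sxx
    return slope, v_bar - slope * y_bar


# Synthetic August SST (deg C) at one pixel: 0.02 deg C/yr warming plus wiggles.
years = list(range(2003, 2021))
wiggle = [0.1, -0.2, 0.0, 0.15, -0.1, 0.05, -0.15, 0.2, -0.05, 0.0,
          0.1, -0.1, 0.05, -0.2, 0.15, 0.0, -0.05, 0.1]
sst = [28.0 + 0.02 * (y - 2003) + w for y, w in zip(years, wiggle)]

slope, intercept = linear_fit(years, sst)

# Evaluate a later observation against the flat mean and against the trend line.
obs_year, obs = 2024, 29.0
plain_anomaly = obs - sum(sst) / len(sst)
detrended_anomaly = obs - (slope * obs_year + intercept)
print(round(slope, 3), round(plain_anomaly, 2), round(detrended_anomaly, 2))
```

The detrended anomaly is smaller than the plain one because part of the departure from the baseline mean is explained by the background warming rather than by the event.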

When this pattern fails

  • For step-change events (a new dam, a land-use conversion, a sensor reprocessing) — the climatology has structural breaks. Either segment by epoch or use a counterfactual model.
  • For sparse data (lightning strikes, EMIT plume detections) — averaging zeros produces near-zero climatologies that any event exceeds. Use Poisson statistics or rate-based comparisons.
  • For trend attribution. This pattern shows “this is unusual”; it doesn’t show “this was caused by climate change.” Causal attribution needs counterfactual model runs.
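
For the sparse-data case above, a rate-based comparison can replace the mean-difference anomaly. The sketch below assumes event counts are Poisson-distributed with a rate estimated from the baseline; the counts and the helper name `poisson_tail` are hypothetical.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the CDF."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


# Hypothetical counts for one grid cell and calendar month:
# 6 detections across 20 baseline years, then 3 detections this year.
baseline_events, baseline_years = 6, 20
rate = baseline_events / baseline_years  # expected events in that month
p = poisson_tail(3, rate)
print(f"baseline rate {rate:.2f} per month, P(>=3 events) = {p:.4f}")
```

A small tail probability says the count is unusual relative to the baseline rate, which is a more defensible statement than "3 exceeds a climatology of 0.3".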
