Anomaly vs climatology — comparing this year to the long-term normal
The single most reusable analytical pattern in Earth-science remote sensing. **Almost every "is this unusual?" question is a difference between the current observation and the long-term seasonal cycle.** Get the pattern right once and reuse it for rainfall anomalies, marine heatwaves, soil-moisture droughts, vegetation greenness, sea-ice extent, total water storage, and NO₂ pollution trends. Same shape every time.
The pattern in 4 steps
- Define the climatology window. Conventional: 1991–2020 (the current WMO standard normal), 1981–2010 (the previous WMO standard normal), or the full available record minus the last 3 years (to avoid the current event biasing the baseline). State your choice explicitly.
- Aggregate to the time grain of interest. Daily, monthly, yearly. The right choice depends on the variable’s natural autocorrelation: daily for fast-moving (precipitation, NO₂), monthly for medium (soil moisture, SST), yearly for slow (ice mass, total water storage).
- Compute per-pixel climatology = mean (or median) of the variable across all years in the baseline window for each day-of-year (or month-of-year). Standard deviation too if you want anomaly-Z-scores.
- Anomaly = observation − climatology. Optionally normalize: percentage of normal, standardized anomaly (Z-score), or percentile rank within the climatology distribution.
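The four steps above can be sketched end to end on synthetic data, with no download involved. This is a minimal illustration only: the grid, the variable name `rain`, the noise level, and the 2001–2020 baseline are assumptions chosen for the example, not values from a real product.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic monthly series on a tiny 2x2 grid, 2001-2025 (hypothetical data).
rng = np.random.default_rng(0)
time = pd.date_range("2001-01-01", "2025-12-01", freq="MS")
seasonal = 100 + 80 * np.sin(2 * np.pi * (time.month.values - 1) / 12)
values = seasonal[:, None, None] + rng.normal(0, 10, size=(time.size, 2, 2))
da = xr.DataArray(
    values,
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": [18.0, 19.0], "lon": [74.0, 75.0]},
    name="rain",
)

# Step 1: state the baseline window explicitly.
baseline = da.sel(time=slice("2001-01-01", "2020-12-31"))

# Step 2: the series is already monthly; daily input would be resampled here.

# Step 3: per-pixel mean and standard deviation for each month-of-year.
clim_mean = baseline.groupby("time.month").mean("time")
clim_std = baseline.groupby("time.month").std("time")

# Step 4: anomaly for the analysis year, then the standardized anomaly.
current = da.sel(time="2025")
anomaly = current.groupby("time.month") - clim_mean
# Pick the matching month's standard deviation for every time step.
month_index = current.time.dt.month
z_score = anomaly / clim_std.sel(month=month_index)
```

Selecting `clim_std` by month before dividing keeps the result on the original `time` axis; dividing by the 12-month array directly would broadcast every time step against every month.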
Minimal worked example — rainfall anomaly for a region
import earthaccess
import xarray as xr

earthaccess.login(strategy="netrc")

aoi = (74, 18, 78, 21)  # (west, south, east, north), central Maharashtra
clim_window = ("2001-01-01", "2020-12-31")  # 20-yr baseline
analysis_year = 2025

# 1. Pull all baseline data
baseline = earthaccess.search_data(short_name="GPM_3IMERGM",
                                   bounding_box=aoi,
                                   temporal=clim_window)
# IMERG HDF5 files keep their variables in the "Grid" group.
ds_base = xr.open_mfdataset(earthaccess.open(baseline), group="Grid")
# bounding_box only filters granules; the files are global, so clip to the AOI.
# precipitation dims: (time, lon, lat)
pr_base = ds_base.precipitation.sel(lon=slice(aoi[0], aoi[2]),
                                    lat=slice(aoi[1], aoi[3]))

# 2. Compute monthly climatology per pixel
climatology = pr_base.groupby("time.month").mean("time")
clim_std = pr_base.groupby("time.month").std("time")

# 3. Pull this year's data
current = earthaccess.search_data(short_name="GPM_3IMERGM",
                                  bounding_box=aoi,
                                  temporal=(f"{analysis_year}-01-01",
                                            f"{analysis_year}-12-31"))
ds_cur = xr.open_mfdataset(earthaccess.open(current), group="Grid")
pr_cur = ds_cur.precipitation.sel(lon=slice(aoi[0], aoi[2]),
                                  lat=slice(aoi[1], aoi[3]))

# 4. Anomaly variants
# Absolute: this minus climatology, in the product's own units
# (IMERG monthly is a mean rate in mm/hr, not a mm/month total)
absolute_anom = pr_cur.groupby("time.month") - climatology
# Percent of normal: 100 × current / climatology (masked where climatology is 0)
percent_normal = (pr_cur.groupby("time.month") / climatology.where(climatology > 0)) * 100
# Z-score: (current - climatology) / std, matched month by month
z_score = absolute_anom.groupby("time.month") / clim_std.where(clim_std > 0)

# 5. Plot any of the three; Z-score is best for cross-region comparison.
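One detail worth checking before quoting absolute anomalies is the unit. To my knowledge the IMERG monthly product reports precipitation as a mean rate in mm/hr, so a monthly total needs a conversion; confirm this against the `units` attribute of the files you open. The sketch below uses made-up rates at a single pixel to show the conversion.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical mean rain rates (mm/hr) for three months at one pixel.
time = pd.date_range("2025-01-01", periods=3, freq="MS")
rate = xr.DataArray(
    [0.10, 0.20, 0.05],
    dims="time",
    coords={"time": time},
    attrs={"units": "mm/hr"},
)

# Hours in each month = days in that month x 24.
hours_in_month = rate.time.dt.days_in_month * 24

# Monthly accumulation in mm.
total = rate * hours_in_month
total.attrs["units"] = "mm/month"
```

Percent of normal and Z-scores are unitless, so they come out the same whether they are computed from rates or from totals, apart from small month-length effects in February.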
Variants for other variables
| Variable | Short name | Cadence | Climatology window |
|---|---|---|---|
| Rainfall | GPM_3IMERGM | monthly | 2001-2020 (~20 yr IMERG) |
| Sea-surface temperature | MUR-JPL-L4-GLOB-v4.1 | daily/monthly | 2002-2020 (~20 yr MUR) |
| Soil moisture (surface) | SPL3SMP_E | daily | 2015-2024 (~10 yr, short!) |
| Soil moisture (root zone) | SPL4SMAU | daily/3-hr | 2015-2024 |
| Total water storage | GRACEFO_L3_* | monthly | 2004-2018 (avoid trend) |
| Vegetation greenness | MOD13Q1 | 16-day | 2001-2020 |
| Sea ice extent | NSIDC G02135 | daily | 1991-2020 (WMO standard) |
Common gotchas
- Don’t include the current event in your baseline. If you’re analyzing the 2024-25 drought, don’t compute the climatology over 2002-2024: the event years pull the baseline toward dry and shrink the anomaly you are trying to measure. Stop the climatology at 2022 or earlier.
- Beware of trend embedded in the climatology. A multi-decade SST or sea-ice record has a real climate trend; an anomaly computed against a fixed baseline mean folds that trend into the anomaly, and the trend also inflates the baseline standard deviation. Detrend first if you’re studying a specific event.
- Per-pixel climatology is data-hungry. With only about 10 years of SMAP (2015 onward), your 9 km daily climatology has only about 10 samples per day-of-year, which is very noisy. Aggregate spatially or temporally before computing climatology.
- Z-scores assume normality. Precipitation, snow, and similar variables are heavily skewed (zeros + heavy tail). Percentiles or distribution-fitted indices (SPI, which typically fits a gamma distribution to precipitation, or SPEI, which works on precipitation minus evapotranspiration) are more honest for precipitation.
- Land-mask and water-mask boundaries can produce artifacts (zero-or-nan flips per pixel). Always inspect a map before quoting numbers.
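Two of the gotchas above have short numerical remedies: removing a linear trend before computing anomalies, and ranking an observation within the baseline distribution instead of assuming normality. The sketch below is one simple way to do each at a single pixel. The helper names `detrend_linear` and `percentile_rank` and all the numbers are hypothetical, and a straight-line fit is only one of several possible trend models.

```python
import numpy as np

def detrend_linear(values):
    """Remove a least-squares straight line from a 1-D yearly series."""
    values = np.asarray(values, dtype=float)
    years = np.arange(values.size)
    slope, intercept = np.polyfit(years, values, deg=1)
    return values - (slope * years + intercept)

def percentile_rank(baseline, observation):
    """Percent of baseline values that are <= the observation."""
    baseline = np.asarray(baseline, dtype=float)
    return 100.0 * np.mean(baseline <= observation)

# Hypothetical July rainfall totals (mm) for a 10-year baseline at one pixel.
july_baseline = [0, 0, 5, 12, 20, 35, 60, 90, 150, 400]
rank = percentile_rank(july_baseline, 12)  # 4 of the 10 values are <= 12

# Hypothetical series that is nothing but a steady trend of 0.1 per year.
sst = 20.0 + 0.1 * np.arange(10)
residual = detrend_linear(sst)  # close to zero everywhere
```

With a skewed baseline like this one, the mean is dragged up by the single 400 mm year, so a Z-score would understate how ordinary a 12 mm month is; the percentile rank does not have that problem.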
When this pattern fails
- For step-change events (a new dam, a land-use conversion, a sensor reprocessing) — the climatology has structural breaks. Either segment by epoch or use a counterfactual model.
- For sparse data (lightning strikes, EMIT plume detections) — averaging zeros produces near-zero climatologies that any event exceeds. Use Poisson statistics or rate-based comparisons.
- For trend attribution. This pattern shows “this is unusual”; it doesn’t show “this was caused by climate change.” Causal attribution needs counterfactual model runs.
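For the sparse-data case, a rate-based comparison can replace the mean-and-difference pattern. The sketch below estimates an event rate from the baseline and asks how likely the observed count would be under a Poisson model with that rate. The counts are invented for illustration, `poisson_sf` is a hypothetical helper, and the Poisson assumption (independent events at a constant rate) should be checked for your data.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), via the complement of the CDF."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf

# Hypothetical: 6 detections in one month-of-year across a 20-year baseline.
baseline_events = 6
baseline_years = 20
rate = baseline_events / baseline_years  # events per year for that month

# Probability of seeing 3 or more events in a single year at that rate.
observed = 3
p_value = poisson_sf(observed, rate)
```

A small `p_value` says the count is unusual relative to the baseline rate, which is the same question the anomaly answers for dense variables, without dividing by a near-zero climatology.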
Sources
- WMO Climatological Standard Normals: https://community.wmo.int/en/wmo-climatological-normals
- Xarray groupby docs: https://docs.xarray.dev/en/stable/user-guide/groupby.html
- SPI / SPEI for precipitation: https://drought.unl.edu/ranchplan/DroughtBasics/WeatherDrought/StandardizedPrecipitationIndex.aspx