Multi-scale spatial fusion — joining 30 m, 70 m, 250 m, and 1 km observations
NASA's Earth-observation portfolio spans **3 orders of magnitude in spatial resolution**: ECOSTRESS at 70 m, MODIS at 1 km, HLS at 30 m, GEDI footprints at 25 m, NISAR at 7 m, GRACE-FO at 300 km. Different missions serve different questions because their resolution trades against revisit, swath, and signal-to-noise. Combining them isn't optional — it's how most useful science gets done. But: **you can't just stack a 30 m raster and a 1 km raster and pixel-multiply.** Different things happen at different scales. This recipe is the safe pattern.
Multi-scale spatial fusion
NASA’s Earth-observation portfolio spans 3 orders of magnitude in spatial resolution: ECOSTRESS at 70 m, MODIS at 1 km, HLS at 30 m, GEDI footprints at 25 m, NISAR at 7 m, GRACE-FO at 300 km. Different missions serve different questions because their resolution trades against revisit, swath, and signal-to-noise. Combining them isn’t optional — it’s how most useful science gets done.
But: you can’t just stack a 30 m raster and a 1 km raster and pixel-multiply. Different things happen at different scales. This recipe is the safe pattern.
The pattern — choose ONE of these three approaches
Approach A — Aggregate fine to coarse (most common, safest)
Resample the finer-resolution layer up to the coarser resolution by area-weighted averaging (or another aggregation that respects the variable’s physical meaning). The coarse grid becomes the analysis grid.
When to use: your science question is fundamentally at the coarser scale (e.g., basin-level water budget, regional carbon flux). Information loss is acceptable; alignment is clean.
Approach B — Downscale coarse to fine (use a model, not interpolation)
Use a relationship between the coarse variable and ancillary fine-resolution covariates (DEM, NDVI, land cover) to predict the fine-resolution version. Random Forest, regression Kriging, or a foundation-model embedding can all work as the predictor.
When to use: your science question is at the fine scale and you can’t accept averaging away the fine signal. SMAP 36 km → 9 km enhanced uses this idea (Backus-Gilbert inversion + ancillary). MODIS NDVI 250 m → Landsat 30 m via NDVI-conserving disaggregation is similar.
Warning: every downscaled pixel is now model-dependent. Don’t claim point-scale measurements from downscaled products.
Approach C — Vector / point representation (don’t resample at all)
Treat the higher-resolution data as a sparse vector layer (GEDI footprints, ICESat-2 tracks, OCO-2 sounding points). Sample the coarser raster at the vector points. Analysis happens in tabular form, not on a grid.
When to use: the fine-resolution data is naturally sparse (lidar footprints) or non-grid (orbital tracks). Common with GEDI + HLS + SMAP joins.
Minimal worked example — joining HLS 30 m + MODIS 1 km + ECOSTRESS 70 m at a 1 km common grid
import earthaccess
import xarray as xr
import rioxarray
from rasterio.enums import Resampling
earthaccess.login(strategy="netrc")
aoi = (-122, 38, -120, 40)
window = ("2024-07-01", "2024-09-30")
# 1. HLS NDVI (30m source)
hls = earthaccess.search_data(short_name="HLSL30", bounding_box=aoi,
temporal=window, cloud_cover=20)
# Open + compute NDVI per scene → mean over period → 30m raster
# 2. MODIS LST (1 km — pick this as the common grid)
mod_lst = earthaccess.search_data(short_name="MOD11A1", bounding_box=aoi,
temporal=window)
# Open + extract LST_Day_1km; mean over period → 1km raster
# 3. ECOSTRESS LST (70m)
eco = earthaccess.search_data(short_name="ECO2LSTE", bounding_box=aoi,
temporal=window)
# Open + extract LST_70m; mean over period → 70m raster
# 4. Aggregate HLS NDVI 30m → 1km (Approach A — area-weighted average)
hls_1km = hls_ndvi.rio.reproject_match(
mod_lst, resampling=Resampling.average
)
# 5. Aggregate ECOSTRESS 70m → 1km
eco_1km = eco_lst.rio.reproject_match(
mod_lst, resampling=Resampling.average
)
# 6. Stack on common 1km grid
ds = xr.Dataset({
"ndvi": hls_1km,
"lst_modis_day": mod_lst.mean("time"),
"lst_ecostress": eco_1km.mean("time"),
})
# 7. Cross-sensor analysis (now safe — same grid, same projection)
# e.g., correlate ECOSTRESS daytime LST with HLS NDVI per pixel
Common gotchas
- Reprojection without matching CRSs first drops to nearest-neighbor interpolation in some libraries — silently. Always check
da.rio.crsand explicitly reproject before resampling. - Resampling.average on a categorical (land-cover) raster is meaningless. Use
Resampling.modeor treat per-class. - Temporal alignment matters too. HLS is 16-day revisit; MODIS daily; ECOSTRESS variable. Aggregate to a common window (typically monthly) before spatial fusion.
- Edge effects at AOI boundaries are real — load 1 extra coarse pixel beyond your AOI for clean reprojection.
- rioxarray vs xarray-spatial vs rasterio-direct all have slightly different reprojection defaults. Stick with one.
When this pattern fails
- When the variable doesn’t aggregate linearly. Water vapor column averaged over 1 km is fine; cloud fraction averaged over 1 km is not (it’s a non-linear transform from cloudy pixels to “cloudiness”). Think before averaging.
- When sensor footprints overlap pixel boundaries (GEDI footprints over HLS pixels). Treat as sparse vector (Approach C).
- When the physical scale of the process matches one of the resolutions but not others. Don’t average ECOSTRESS 70 m field-scale anomalies to 1 km — the signal disappears.
Sources
- rioxarray docs: https://corteva.github.io/rioxarray/stable/
- Resampling methods: https://rasterio.readthedocs.io/en/stable/topics/resampling.html
- Pangeo workflows for multi-sensor stacking: https://pangeo.io/
The steps, code, and sources below are kept in the original English for technical accuracy.