r02·advanced

Foundation model on NASA data — from Hugging Face checkpoint to working inference

Hugging Face · LP DAAC (HLS) · GES DISC (SDO) · NASA-IMPACT models

Model cards exist on Hugging Face, but the “open a notebook, ask a question, get a fine-tuned inference” path is missing. The pretrained weights are open but not turn-key for prediction — this pattern fills that gap.


“Model cards on Hugging Face exist but the ‘open a notebook, ask a question, get a fine-tuned inference’ path is missing.” — Research Agent D, Pattern C (FM downstream gap)

Three things you need to know before you start:

  1. The pretrained weights are open but not turn-key for prediction.
  2. Inference needs the same preprocessing pipeline used at training time — and that pipeline is usually only partially documented.
  3. Fine-tuning is the realistic path for most science applications; pure zero-shot is rarely good enough.

This recipe is the canonical workflow for getting from pip install to a usable downstream task on NASA EO data.

The pattern in 5 steps

  1. Pick the model that matches your data modality. Prithvi-EO 2.0 for HLS-shaped multispectral land imagery; Surya for SDO heliophysics; Prithvi-WxC for weather/climate; Prithvi-HLS task heads for burn-scar / crop-classification / flood. Don’t mix modalities.
  2. Read the model card’s preprocessing notes carefully. Channel order, normalization stats, patch sizes, and temporal stacking conventions vary by model. Most “it doesn’t work” reports trace back to preprocessing drift: inputs prepared differently from how they were prepared at training time.
  3. Validate the pipeline with the published demo first. Each model has a Hugging Face demo notebook — run it on the bundled tiny sample before plugging in your own data. If the demo fails, fix that before going further.
  4. Sub-sample your AOI heavily for the first fine-tune. 100 patches, not 10,000. 5 epochs, not 50. You’re checking the pipeline works, not getting publishable accuracy yet.
  5. Cache aggressively. Pretrained-weight downloads are GB-scale; preprocessed-patch generation is slow; both deserve disk cache.
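
Steps 4 and 5 can be sketched as two small helpers. This is a minimal illustration, not part of any model repo: the names `subsample_indices` and `cached_patches`, the `patch_cache` directory, and the `.npy` storage format are all assumptions made for the example.

```python
import hashlib
from pathlib import Path

import numpy as np


def subsample_indices(n_patches, n_keep=100, seed=0):
    """Pick a small, reproducible subset of patch indices for a smoke-test run."""
    rng = np.random.default_rng(seed)
    n_keep = min(n_keep, n_patches)
    # Sample without replacement so no patch is used twice.
    return np.sort(rng.choice(n_patches, size=n_keep, replace=False))


def cached_patches(key, build_fn, cache_dir="patch_cache"):
    """Return patches from disk if present; otherwise build them once and save.

    `key` is any string that identifies the inputs (AOI, dates, band list);
    `build_fn` is a zero-argument callable returning a NumPy array.
    Both are hypothetical conventions for this sketch.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    path = cache_dir / f"{digest}.npy"
    if path.exists():
        return np.load(path)
    patches = build_fn()
    np.save(path, patches)
    return patches
```

Changing the key (for example, when the band list or normalization changes) forces a rebuild, which helps avoid silently reusing patches made with an older preprocessing setup.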

Minimal worked example — Prithvi-EO 2.0 on HLS for burn-scar segmentation

import earthaccess
import torch
import xarray as xr
from huggingface_hub import hf_hub_download
from transformers import AutoModelForSemanticSegmentation

earthaccess.login(strategy="netrc")

# 1. Download pretrained Prithvi-EO 2.0
model_id = "ibm-nasa-geospatial/Prithvi-EO-2.0-300M"
ckpt = hf_hub_download(repo_id=model_id, filename="Prithvi_EO_V2_300M.pt")
# (See model card for exact filename + repo)

# 2. Pull HLS L30 for the AOI + dates
aoi = (-122, 38, -120, 40)  # northern California
window = ("2024-07-15", "2024-09-15")  # 2024 wildfire season
hls = earthaccess.search_data(short_name="HLSL30",
                               bounding_box=aoi, temporal=window,
                               cloud_cover=(0, 20))  # (min, max) percent

# 3. Stack bands in the order Prithvi expects (per model card):
#    Blue, Green, Red, NIR Narrow, SWIR 1, SWIR 2 (6 bands; B02-B07 in HLS L30)
#    Crop to 224x224 patches (the input size)

# 4. Apply normalization from the model card
#    (mean + std vectors are provided in the repo)

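# 5. Load + run
#    patches_tensor is the normalized batch from steps 3-4, shaped as the
#    model card specifies. This Auto* call may not work for this repo
#    (see "Common gotchas"); the base checkpoint is a pretrained backbone,
#    so burn / not-burn logits need a fine-tuned segmentation head.
model = AutoModelForSemanticSegmentation.from_pretrained(model_id)
model.eval()
with torch.no_grad():
    output = model(patches_tensor)
# Output is logits per class; argmax over the class axis for burn / not-burn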

# 6. (Optional) Fine-tune on labeled burn polygons from MCD64A1
#    Use ~100 labeled patches first to validate the pipeline.
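
Steps 3 and 4 are left as comments above. A minimal NumPy sketch of what they might look like follows, assuming the six bands are already stacked into a `(bands, H, W)` array in the model card's order. The function names are hypothetical, and the `mean` and `std` values in the usage lines are placeholders, not the published statistics; the real vectors come from the model repo.

```python
import numpy as np

PATCH = 224  # input size quoted in this recipe


def normalize_bands(stack, mean, std):
    """Standardize a (bands, H, W) array with one mean/std value per band."""
    mean = np.asarray(mean, dtype=np.float32).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float32).reshape(-1, 1, 1)
    if mean.shape[0] != stack.shape[0] or std.shape[0] != stack.shape[0]:
        raise ValueError("mean/std length must equal the number of bands")
    return (stack.astype(np.float32) - mean) / std


def crop_patches(stack, size=PATCH):
    """Cut (bands, H, W) into non-overlapping (N, bands, size, size) patches.

    Ragged rows/columns at the bottom and right edges are dropped.
    """
    bands, h, w = stack.shape
    rows, cols = h // size, w // size
    trimmed = stack[:, : rows * size, : cols * size]
    tiles = trimmed.reshape(bands, rows, size, cols, size)
    return tiles.transpose(1, 3, 0, 2, 4).reshape(rows * cols, bands, size, size)


# Usage with synthetic data; mean/std here are PLACEHOLDERS for illustration.
scene = np.random.default_rng(0).uniform(0.0, 0.5, size=(6, 500, 700))
normalized = normalize_bands(scene, mean=[0.1] * 6, std=[0.05] * 6)
patches = crop_patches(normalized)
```

From here, `torch.from_numpy(patches)` gives a tensor; any extra axes the model expects (such as a temporal dimension) should follow the model card rather than this sketch.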

Common gotchas

  • Band order matters. The Prithvi family expects HLS in a specific channel ordering. Use the model card’s preprocessing script, not your own intuition.
  • Spectral normalization stats are mission-specific. Don’t reuse the ImageNet normalization values.
  • Patch boundaries leak in semantic segmentation. Always use overlapping tiles + Gaussian-weighted merging at the seams, or accept block artifacts.
  • AutoModel* may not load the NASA-IBM weights cleanly depending on which Hugging Face Transformers version the repo was uploaded against. Check the model card for the exact loading recipe; you may need safetensors + custom config.
  • Surya is huge. 366M parameters, 218 TB of training data on the back end. Inference needs GPU; training needs multi-GPU. Don’t try CPU.
  • The “demo notebook” is the ground truth for preprocessing. If your output looks wrong, diff your code against the demo cell-by-cell.
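
The overlapping-tile advice above can be sketched as follows. This is one plausible way to do Gaussian-weighted merging, not a method taken from any model repo: `predict_fn` stands in for whatever wraps the model, and the tile size, stride, and `sigma_frac` are illustrative assumptions.

```python
import numpy as np


def gaussian_window(size, sigma_frac=0.25):
    """2-D Gaussian weight that peaks at the tile centre and decays to the edges."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-0.5 * (ax / (sigma_frac * size)) ** 2)
    return np.outer(g, g).astype(np.float32)


def tile_starts(length, tile, stride):
    """Start offsets covering [0, length), with a final tile flush to the edge."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)
    return starts


def predict_tiled(image, predict_fn, tile=224, stride=112):
    """Run predict_fn on overlapping tiles and blend results with Gaussian weights.

    image: (bands, H, W) array; predict_fn maps (bands, tile, tile) to
    (classes, tile, tile). Returns a (classes, H, W) array.
    """
    _, h, w = image.shape
    if h < tile or w < tile:
        raise ValueError("image is smaller than one tile; pad it first")
    window = gaussian_window(tile)
    acc = None
    weight = np.zeros((h, w), dtype=np.float32)
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            logits = predict_fn(image[:, y : y + tile, x : x + tile])
            if acc is None:
                acc = np.zeros((logits.shape[0], h, w), dtype=np.float32)
            # Weighted sum of logits; tile centres count more than tile edges.
            acc[:, y : y + tile, x : x + tile] += logits * window
            weight[y : y + tile, x : x + tile] += window
    return acc / weight
```

Blending logits rather than hard labels keeps the argmax decision until after the merge, so a pixel near a seam is decided by the tiles that saw it closest to their centre.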

When this pattern fails (and what to do)

  • If your AOI is outside HLS coverage (e.g., polar regions), Prithvi-EO 2.0 isn’t trained on the right distribution — expect degradation.
  • If your input bands differ from the model’s expected channels (e.g., you have raw Landsat 8 scenes rather than HLS-harmonized reflectance), retrain the input projection layer or use a different model.
  • If you can’t fit a meaningful AOI in GPU memory, tile + merge instead of forcing a giant single-pass inference.

Sources + further reading
