g12·concept

NetCDF & HDF

The legacy scientific data formats that most NASA Earth data has shipped in for decades. Self-describing containers for multi-dimensional arrays + metadata.

Why it matters

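Powerful and well-supported, but designed for download-and-open workflows, not cloud streaming. Much of the cloud-migration effort is making NetCDF/HDF data cloud-friendly, either by converting it to a cloud-optimized format (COG for rasters, Zarr for arrays) or by indexing the existing files in place with Kerchunk.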

Where you’ll meet it

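  • Nearly every granule you download via earthaccess lands as a .nc (NetCDF4) or .hdf/.h5 (HDF4 or HDF5) file: MODIS, SMAP, GEDI, and MERRA-2 all ship this way.
  • Opening one in Python usually means xarray.open_dataset(), which reads NetCDF4 directly and many HDF5 files as well (NetCDF4 is in fact built on HDF5 under the hood). HDF5 products that nest their variables in groups often need the group= argument, and older HDF4 files, such as MODIS .hdf, typically need a different reader.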
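  • When a collection feels slow to stream from the cloud, it’s usually because the files are archival NetCDF/HDF laid out for local reads, so a client has to make many small requests just to locate the chunks it wants. That’s exactly what Kerchunk/VirtualiZarr and Zarr conversions aim to fix.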
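
The "built on HDF5 under the hood" point can be checked without any scientific libraries: each of these formats begins with a short signature. The sketch below uses only the standard library and reads the first eight bytes of a file. The function name sniff_format and its return labels are illustrative, not part of any library, and it assumes the signature sits at byte 0 (HDF5 also permits it at later offsets after a user block, which this ignores).

```python
# Signatures ("magic bytes") that open each container format.
HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"   # also what a NetCDF4 file starts with
HDF4_MAGIC = b"\x0e\x03\x13\x01"    # older HDF4, e.g. many .hdf files
NETCDF3_MAGICS = {
    b"CDF\x01": "NetCDF classic",
    b"CDF\x02": "NetCDF 64-bit offset",
    b"CDF\x05": "NetCDF CDF-5",
}


def sniff_format(path):
    """Guess the container format from the first 8 bytes of a file.

    Hypothetical helper for illustration; only offset 0 is checked.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_MAGIC):
        return "HDF5 (includes NetCDF4)"
    if head.startswith(HDF4_MAGIC):
        return "HDF4"
    # Pre-NetCDF4 files start with "CDF" plus a version byte.
    return NETCDF3_MAGICS.get(head[:4], "unknown")


if __name__ == "__main__":
    import sys

    # Report the guessed format for each path given on the command line.
    for name in sys.argv[1:]:
        print(name, "->", sniff_format(name))
```

Saved under a name of your choosing (say sniff.py) and run as python sniff.py followed by one or more granule paths, it reports which family each file belongs to. A .nc file that comes back as HDF5 is a NetCDF4 file; one that comes back as HDF4 is a hint that xarray's default engines may not open it.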

In plain terms

Like a well-organized filing cabinet — excellent if it’s in your office, awkward if you have to ship the whole cabinet to read one folder.