g12·concept
NetCDF & HDF
The long-established scientific data formats that most NASA Earth data has used for decades: self-describing containers that hold multi-dimensional arrays together with their metadata.
Why it matters
Powerful and well-supported, but designed for download-and-open workflows, not cloud streaming. Much of the cloud-migration effort goes into making NetCDF/HDF data cloud-friendly, either by converting it (to Zarr or COG) or by indexing it in place (with Kerchunk).
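To make the streaming problem concrete, here is a toy sketch of the byte-range idea behind Kerchunk-style indexing. Everything in it is made up for illustration: the in-memory `blob` stands in for a remote file, and `chunk_index` and `read_chunk` are hypothetical names, not part of any library. The point is only that an index of (offset, length) pairs lets a reader fetch one chunk instead of the whole file.

```python
import io

# Stand-in for a remote file: a 16-byte header followed by three 4-byte chunks.
# (Fabricated bytes, not a real NetCDF/HDF layout.)
blob = b"H" * 16 + b"AAAA" + b"BBBB" + b"CCCC"
remote = io.BytesIO(blob)

# Hypothetical index: chunk key -> (byte offset, length in bytes).
chunk_index = {
    "temp/0": (16, 4),
    "temp/1": (20, 4),
    "temp/2": (24, 4),
}


def read_chunk(fileobj, index, key):
    """Read only the bytes of one chunk, using the precomputed index."""
    offset, length = index[key]
    fileobj.seek(offset)
    return fileobj.read(length)


one_chunk = read_chunk(remote, chunk_index, "temp/1")
print(one_chunk, f"({len(one_chunk)} of {len(blob)} bytes read)")
```

Against cloud object storage, the seek-and-read step would become an HTTP range request, but the index lookup works the same way.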
Where you’ll meet it
- Nearly every granule you download via earthaccess lands as a .nc (NetCDF4) or .hdf/.h5 file: MODIS, SMAP, GEDI, and MERRA-2 all ship this way.
- Opening one in Python almost always means xarray.open_dataset(), which reads NetCDF4 and many HDF5 files (NetCDF4 is in fact built on HDF5 under the hood). Older HDF4 files, such as MODIS .hdf granules, typically need a different reader.
- When a collection feels slow to stream from the cloud, it’s usually because it’s stored as traditional NetCDF/HDF files. That is exactly what Kerchunk/VirtualiZarr and Zarr conversions aim to fix.
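File extensions are not always a reliable guide to what is inside a granule, so it can help to look at the first few bytes. The sketch below uses only the standard library and checks the published signatures for HDF5, HDF4, and NetCDF-3. The helper name `sniff_format` and the sample file names are hypothetical, and the sample files contain fabricated headers, not real granules. It also shows the "NetCDF4 is built on HDF5" point: a NetCDF4 file starts with the HDF5 signature.

```python
from pathlib import Path
import tempfile

# Leading bytes ("magic numbers") defined by each format.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # NetCDF-4 files start with this too
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"
NETCDF3_PREFIX = b"CDF"                 # followed by a version byte: 1, 2, or 5


def sniff_format(path):
    """Guess the container format from the first 8 bytes of a file.

    Simplification: an HDF5 file with a user block carries its signature
    at a later offset (512, 1024, ...), which this sketch does not check.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "HDF5 (possibly NetCDF-4)"
    if head.startswith(HDF4_SIGNATURE):
        return "HDF4"
    if head[:3] == NETCDF3_PREFIX and head[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "NetCDF-3"
    return "unknown"


# Demo with fabricated headers (hypothetical file names, no real data).
samples = {
    "netcdf4_like.nc4": HDF5_SIGNATURE,
    "hdf4_like.hdf": HDF4_SIGNATURE,
    "netcdf3_like.nc": b"CDF\x01",
    "notes.txt": b"hello",
}
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, header in samples.items():
        path = Path(tmp) / name
        path.write_bytes(header + b"\x00" * 16)
        results[name] = sniff_format(path)

for name, kind in results.items():
    print(f"{name}: {kind}")
```

Telling a NetCDF-4 file apart from a plain HDF5 file takes more than the signature, so in practice the simplest check is to try opening the file with a NetCDF-aware reader.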
In plain terms
Like a well-organized filing cabinet — excellent if it’s in your office, awkward if you have to ship the whole cabinet to read one folder.