q25·intermediate

Is the lake or river my town drinks from getting choked with toxic algae?

water-resources · public-health · hydrology · biosphere · Datasets: 3 · 20–60 min

Cyanobacteria (blue-green algae) blooms in inland lakes and reservoirs can poison a city’s drinking water and shut down fishing, often with little or no on-the-ground monitoring. The CyAN (Cyanobacteria Assessment Network) program turns Sentinel-3 OLCI imagery into a cyanobacteria index for the specific waterbody your town drinks from, letting you watch a bloom build before it reaches the intake pipe.

This is the freshwater / drinking-supply companion to the coastal-HAB question (q14). It is about cyanobacteria in lakes and reservoirs, not marine algal blooms in coastal seawater.

What you can answer

  • Is a bloom forming on my source water right now? — The CyAN cyanobacteria index (CI) maps cyanobacteria abundance pixel-by-pixel over the lake or reservoir.
  • How severe and how big is it? — Translate the index into low/moderate/high categories and measure the bloom area (km²) over the waterbody.
  • Is it near the drinking-water intake? — Sample the index at the intake location and its surrounding pixels, not just the lake-wide average.
  • When does it usually peak? — Build a seasonal climatology (2016+) so you know your normal bloom window and can flag an early or unusually intense season.
  • Is it getting worse year over year? — Track peak bloom area and intensity across the Sentinel-3 record to see a trend.
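
As a rough illustration of the severity and area questions above, the sketch below bins a window of CyAN digital numbers (DN) into low/moderate/high and converts pixel counts to km². It assumes the DN encoding given in the code template section (0 = below detection, 1–253 = index, 254 = land, 255 = no data). The cut-offs `MODERATE_DN` and `HIGH_DN` are hypothetical: 200 mirrors the "high" threshold used in the template, 100 is an arbitrary placeholder, and neither is an official CyAN category boundary.

```python
import numpy as np

PIXEL_AREA_KM2 = 0.3 * 0.3  # one ~300 m pixel is about 0.09 km²

# Hypothetical DN cut-offs, for illustration only (not official CyAN classes).
MODERATE_DN = 100
HIGH_DN = 200


def bloom_summary(dn):
    """Return bloom area (km²) per severity class for a window of CyAN DN values."""
    dn = np.asarray(dn)
    detect = (dn >= 1) & (dn <= 253)   # detectable cyanobacteria; excludes 0, 254, 255
    low = detect & (dn < MODERATE_DN)
    moderate = detect & (dn >= MODERATE_DN) & (dn < HIGH_DN)
    high = detect & (dn >= HIGH_DN)
    return {
        "low_km2": float(low.sum() * PIXEL_AREA_KM2),
        "moderate_km2": float(moderate.sum() * PIXEL_AREA_KM2),
        "high_km2": float(high.sum() * PIXEL_AREA_KM2),
        "bloom_km2": float(detect.sum() * PIXEL_AREA_KM2),
    }


# Tiny made-up window: includes below-detection (0), land (254) and no-data (255) pixels.
window = np.array([[0, 50, 120],
                   [210, 253, 254],
                   [255, 99, 200]], dtype=np.uint8)
summary = bloom_summary(window)
print(summary)
```

Keeping the thresholds as named constants makes it easy to swap in whatever category boundaries your utility or state agency actually uses.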

What you can NOT answer with these datasets alone

  • Lakes outside the United States — the merged CyAN product is gridded to a CONUS-only 300 m raster (US lakes and reservoirs). For lakes elsewhere, use the OLCI inland-water Level-2 scenes (OLCIS3A_L2_ILW / OLCIS3B_L2_ILW) and derive the index yourself.
  • Toxin concentration (microcystin, cylindrospermopsin) — the index tracks cyanobacteria abundance, not toxin level. Toxins require water sampling and lab assay (ELISA / LC-MS).
  • Is the treated tap water safe? — Satellites see the raw source water, not what comes out after treatment. Combine with utility finished-water sampling.
  • Small ponds, narrow rivers, or near-shore strips — OLCI is ~300 m resolution; pixels contaminated by land/shoreline are masked. Many small reservoirs are too small to resolve.
  • Sub-surface or bottom-hugging blooms — satellites see only the surface optical layer; a bloom mixed deep or sitting on the bottom can be invisible.
  • Species identity — the index flags cyanobacteria-like signals, not which species or whether it is a toxin-producing strain. Confirm in the lab.
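
To get a feel for the small-waterbody limit, the back-of-the-envelope sketch below estimates how many ~300 m pixels remain usable once a shoreline strip is masked. It idealises the lake as a circle and assumes a one-pixel masked strip; both are simplifying assumptions, and the helper `usable_pixels` is hypothetical, not part of any CyAN tooling.

```python
import math

PIXEL_KM = 0.3  # approximate OLCI pixel size in km


def usable_pixels(lake_area_km2, shoreline_buffer_px=1):
    """Rough count of clear-water pixels for an idealised circular lake."""
    radius = math.sqrt(lake_area_km2 / math.pi)        # radius of a circle with this area
    inner = radius - shoreline_buffer_px * PIXEL_KM    # strip lost to shoreline masking
    if inner <= 0:
        return 0.0                                     # the whole lake is shoreline pixels
    return math.pi * inner ** 2 / PIXEL_KM ** 2


for area in (0.2, 1.0, 10.0, 100.0):
    print(f"{area:6.1f} km² lake -> about {usable_pixels(area):.0f} usable pixels")
```

Real lakes are rarely circular, so elongated or dendritic reservoirs will lose a larger share of their area than this estimate suggests.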

Code template (Python, cloud-direct)

Verified locally. The merged CyAN product ships as Cloud-Optimized GeoTIFFs (one 7-day composite per file), gridded to the CONUS Albers projection (EPSG:5070) at 300 m, so it covers US lakes only and is read with rioxarray (the rasterio-backed xarray extension) rather than xarray's NetCDF readers. Each pixel is a digital number (DN) from 0 to 255: 0 = below detection, 1–253 = increasing cyanobacteria index, 254 = land, 255 = no data.

import earthaccess
import rioxarray
import numpy as np
import pandas as pd
from rasterio.warp import transform_bounds

# Needs a free Earthdata Login; strategy="netrc" reads the credentials from ~/.netrc
earthaccess.login(strategy="netrc")

# Western Lake Erie — Toledo drinking-water intake region (lon/lat)
aoi = (-83.5, 41.4, -82.5, 42.0)
intake_lon, intake_lat = -83.26, 41.69   # approx. Toledo Collins Park WTP crib intake

# 1. Merged Sentinel-3 OLCI cyanobacteria index (CyAN), 7-day composites, 2016+
results = earthaccess.search_data(
    short_name="MERGED_S3_OLCI_L3m_CYAN",
    bounding_box=aoi,
    temporal=("2016-04-01", "2025-10-31"),   # full record; bloom season is summer
)
files = earthaccess.open(results)

# 2. Build a bloom-severity time series over the lake AOI
records = []
for f in files:
    da = rioxarray.open_rasterio(f, masked=False).squeeze()      # DN raster, EPSG:5070
    # reproject the lon/lat AOI into the raster's CRS, then window
    xmin, ymin, xmax, ymax = transform_bounds("EPSG:4326", da.rio.crs, *aoi)
    sub = da.rio.clip_box(xmin, ymin, xmax, ymax)                # AOI window only
    win = sub.values.astype("float32")

    cyano = (win >= 1) & (win <= 253)             # water pixels with detectable cyanobacteria
    dn = np.where(cyano, win, np.nan)             # DN where cyano detectable, else NaN

    # value at the intake pixel: reproject only the AOI window (not the whole
    # raster) to lon/lat; the intake must lie inside the AOI
    intake = int(sub.rio.reproject("EPSG:4326").sel(
        x=intake_lon, y=intake_lat, method="nearest").values)
    intake_dn = float(intake) if 1 <= intake <= 253 else np.nan

    records.append({
        "date":          pd.to_datetime(   # composite start time from the granule's UMM metadata
            f.granule["umm"]["TemporalExtent"]["RangeDateTime"]["BeginningDateTime"],
            errors="coerce"),
        "mean_dn":       float(np.nanmean(dn)) if cyano.any() else 0.0,
        "max_dn":        float(np.nanmax(dn))  if cyano.any() else 0.0,
        "bloom_area_px": int(cyano.sum()),       # any detectable cyano
        "high_area_px":  int(((win >= 200) & (win <= 253)).sum()),  # high index; excludes land (254) and no data (255)
        "intake_dn":     intake_dn,
    })

ts = pd.DataFrame(records).dropna(subset=["date"]).sort_values("date")
ts["bloom_area_km2"] = ts["bloom_area_px"] * (0.3 * 0.3)   # ~0.09 km² per 300 m pixel

# 3. Seasonal climatology + intake alerting
ts["intake_alert"] = ts["intake_dn"] >= 200               # high index at the intake → flag
print(ts.groupby(ts["date"].dt.month)["bloom_area_km2"].mean())
print(ts.loc[ts["intake_alert"], ["date", "intake_dn", "bloom_area_km2"]])

# To turn DN into the published cyanobacteria index / cell density, apply the
# conversion in the CyAN product spec (linked in Sources) — DN is a relative index here.
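
The year-over-year question can be sketched from a table shaped like `ts`: take the peak bloom area per calendar year and fit a straight line through the peaks. The example below uses a small synthetic stand-in (`ts_demo`, with made-up values) so it runs without any downloads; a linear fit is just one simple choice, and with only a handful of seasons the slope should be read as indicative rather than statistically robust.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the `ts` table built in the template (values are made up).
ts_demo = pd.DataFrame({
    "date": pd.to_datetime(["2016-08-10", "2016-09-01", "2017-08-15",
                            "2017-09-05", "2018-08-20", "2018-09-10"]),
    "bloom_area_km2": [300.0, 420.0, 510.0, 480.0, 610.0, 590.0],
})

# Peak bloom area in each calendar year.
annual_peak = ts_demo.groupby(ts_demo["date"].dt.year)["bloom_area_km2"].max()

# Least-squares line through the annual peaks; years are centred for numerical stability.
years = annual_peak.index.to_numpy(dtype=float)
slope, intercept = np.polyfit(years - years.mean(), annual_peak.to_numpy(), deg=1)

print(annual_peak)
print(f"trend in peak bloom area: {slope:+.1f} km² per year")
```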

Expected output

  • A bloom-severity time series for your source water: lake-wide bloom area (km²) and the cyanobacteria index sampled at the drinking-water intake, across multiple bloom seasons.
  • A seasonal climatology showing the normal bloom window (for western Lake Erie, late July–September) so an early or oversized season stands out.
  • An intake alert flag — dates when high-severity cyanobacteria sat near the intake, the events a utility most needs to know about.
  • A map of the cyanobacteria index over the lake for any flagged date, showing where the bloom is concentrated relative to the intake.
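
Because a single pixel can be masked or unrepresentative, an intake alert is often more useful when it looks at the neighbourhood around the intake. The sketch below summarises a small square of DN values centred on a given row and column. The helper `intake_neighbourhood` and the toy grid are hypothetical; locating the intake's row and column in a real raster is assumed to have been done already (for example via a nearest-coordinate lookup).

```python
import numpy as np


def intake_neighbourhood(dn, row, col, radius=1):
    """Summarise detectable-cyanobacteria DN values in a square patch around (row, col)."""
    dn = np.asarray(dn)
    # Clamp the patch to the array edges so an intake near the border still works.
    r0, r1 = max(row - radius, 0), min(row + radius + 1, dn.shape[0])
    c0, c1 = max(col - radius, 0), min(col + radius + 1, dn.shape[1])
    patch = dn[r0:r1, c0:c1].astype("float32")
    valid = (patch >= 1) & (patch <= 253)   # drop below-detection, land, no data
    if not valid.any():
        return {"max_dn": float("nan"), "mean_dn": float("nan"), "n_valid": 0}
    vals = patch[valid]
    return {"max_dn": float(vals.max()),
            "mean_dn": float(vals.mean()),
            "n_valid": int(valid.sum())}


# Toy DN grid: 254 = land, 255 = no data, 0 = below detection.
grid = np.array([[254, 254, 255,   0],
                 [254,  40, 220,  10],
                 [255, 180,   0, 230],
                 [  0,   0,  15, 255]], dtype=np.uint8)

near = intake_neighbourhood(grid, row=1, col=1)
print(near)
```

Alerting on `max_dn` is the cautious choice, since one high pixel next to the intake is enough to justify a water sample.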

Caveats

  • Index ≠ toxin. A high cyanobacteria index means “sample the water now,” not “the toxin level is X.” Pair every satellite alert with utility sampling.
  • Clouds and wind. Cloud cover blanks out scenes; wind can pile a bloom against one shore in hours. Treat each clear scene as a snapshot, not a continuous record.
  • Resolution limits. ~300 m pixels and shoreline masking mean small reservoirs and narrow river reaches are poorly resolved or excluded entirely.
  • Mixed pixels and Case-II water. Sediment and dissolved organics in turbid inland water can confuse retrievals; the CyAN index is tuned for cyanobacteria but is not infallible.
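
One way to act on the cloud caveat is to measure how much of the water was actually observed in each composite and skip scenes that are mostly blank. The sketch below treats every non-land pixel coded 255 as unobserved water, which is an assumption about how no-data is used in the window; the `MIN_COVERAGE` cut-off of 0.5 is a hypothetical choice, not a CyAN recommendation.

```python
import numpy as np

MIN_COVERAGE = 0.5  # hypothetical cut-off: require at least half the water to be observed


def water_coverage(dn):
    """Fraction of non-land pixels that carry an observation (DN 0-253)."""
    dn = np.asarray(dn)
    not_land = dn != 254                 # everything except land
    observed = not_land & (dn != 255)    # water pixels with a valid reading
    n_water = int(not_land.sum())
    return float(observed.sum() / n_water) if n_water else 0.0


# Two made-up composites: a mostly clear scene and a mostly cloudy one.
clear_scene = np.array([[254,   0,  30],
                        [254, 255, 120],
                        [255, 255,   0]], dtype=np.uint8)
cloudy_scene = np.array([[254, 255, 255],
                         [255, 255,  10],
                         [255, 255, 255]], dtype=np.uint8)

for name, scene in (("clear", clear_scene), ("cloudy", cloudy_scene)):
    cov = water_coverage(scene)
    verdict = "keep" if cov >= MIN_COVERAGE else "skip"
    print(f"{name}: coverage {cov:.2f} -> {verdict}")
```

Recording the coverage fraction alongside each bloom-area value also helps explain apparent drops in bloom area that are really just cloud.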

Cross-DAAC composition

OB.DAAC only — the merged CyAN product and OLCI inland-water L2 are both distributed by the NASA Ocean Biology DAAC, served through the CyAN project for inland waters.

Sources

📚 Problem Finder KB

Not yet tracked in the KB.