dataretrieval.wateruse
Retrieve USGS water-use data from the National Water Availability Assessment Data Companion (NWDC).
The NWDC web services provide national-scale, USGS-modeled water-use data that
underlie the USGS National Water Availability Assessment. Estimates are served on a HUC12
(12-digit hydrologic unit) spatial grid and can be queried for any county,
state, or hydrologic unit. This is the modern replacement for the defunct
legacy NWIS water-use service (nwis.get_water_use).
Unlike the main Water Data getters (dataretrieval.waterdata) and NGWMN
(dataretrieval.ngwmn), the NWDC is a plain CSV REST service rather than
an OGC API Features collection. This module supplies the NWDC-specific bits —
request building, CSV parsing, the Link-header cursor, and the {detail}
error envelope — but reuses the OGC engine’s generic, API-agnostic pagination
and sync-from-async plumbing (_paginate() and
_run_sync()) rather than re-implementing it. It
follows the same conventions: shared request headers
(_default_headers()), the typed
DataRetrievalError taxonomy, and a
(DataFrame, BaseMetadata) return.
See https://api.water.usgs.gov/docs/nwaa-data/ for the API reference and https://water.usgs.gov/nwaa-data/ for the catalog of available models and variables.
Examples
from dataretrieval import wateruse
# Monthly public-supply withdrawals for Rhode Island, 2020 onward.
df, md = wateruse.get_wateruse(
    model="wu-public-supply-wd",
    variable=["pswdtot", "pswdgw", "pswdsw"],
    state="RI",
    start_date="2020-01",
    time_resolution="monthly",
)
- dataretrieval.wateruse.MAX_CONCURRENT_REQUESTS = 4
Maximum number of locations fetched concurrently when a list of state/county/huc selectors is fanned out (one request per location). Kept conservative because this module intentionally carries no request backoff/retry; the NWDC tolerates this level of concurrency without rate-limit errors (verified by stress test). Set wateruse.MAX_CONCURRENT_REQUESTS = 1 for serial requests.
- dataretrieval.wateruse.MODELS = ('wu-public-supply-wd', 'wu-public-supply-cu', 'wu-thermoelectric', 'wu-irrigation-wd', 'wu-irrigation-cu')
Water-use models (categories) served by the NWDC. The catalog at https://water.usgs.gov/nwaa-data/ lists the variables available within each.
- dataretrieval.wateruse.TIME_RESOLUTIONS = ('monthly', 'annualcy', 'annualwy')
Temporal resolutions: monthly, annual calendar year, annual water year.
- dataretrieval.wateruse._as_list(value: object) → list[Any]
A scalar becomes a one-element list; any non-string iterable (list, tuple, Series, ndarray, generator) is materialized to a list. A string is treated as a scalar so it isn’t exploded into characters.
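The coercion rule described above can be sketched as follows. This is a minimal illustration, not the module's actual source; the name as_list is used here only for the example.

```python
from collections.abc import Iterable
from typing import Any


def as_list(value: object) -> list[Any]:
    """Wrap a scalar in a list; materialize any non-string iterable."""
    # A string is iterable, but iterating it would yield single characters,
    # so it is treated as a scalar.
    if isinstance(value, str):
        return [value]
    # Lists, tuples, generators, and similar containers are materialized.
    if isinstance(value, Iterable):
        return list(value)
    # Anything else (int, float, None, ...) is a scalar.
    return [value]
```

With this sketch, as_list("RI") gives a one-element list rather than two characters, and a generator is consumed into a list.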
- async dataretrieval.wateruse._fan_out(requests: list[Request], headers: dict[str, str], ssl_check: bool) → tuple[DataFrame, Response]
Fetch every request (each paginated) concurrently over one shared client.
Each request is paginated by the engine’s _paginate() with NWDC strategies: parse a CSV page and read its Link header cursor (parse), follow that cursor (follow), and raise the typed error carrying the NWDC detail (raise_for_status). Concurrency is bounded by a semaphore at MAX_CONCURRENT_REQUESTS, and asyncio.gather preserves input order, so the concatenation is deterministic. The shared httpx.AsyncClient keeps connections alive across pages and requests.
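The semaphore-plus-gather pattern can be illustrated without any network access. In this sketch, fetch is a hypothetical stand-in for one paginated request, and MAX_CONCURRENT stands in for MAX_CONCURRENT_REQUESTS; neither is the module's real code.

```python
import asyncio

MAX_CONCURRENT = 4  # stand-in for MAX_CONCURRENT_REQUESTS


async def fetch(label: str, delay: float) -> str:
    """Hypothetical stand-in for one paginated request."""
    await asyncio.sleep(delay)
    return f"rows for {label}"


async def fan_out(jobs: list[tuple[str, float]]) -> list[str]:
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def bounded(label: str, delay: float) -> str:
        # At most MAX_CONCURRENT coroutines are inside this block at once.
        async with semaphore:
            return await fetch(label, delay)

    # gather returns results in the order the awaitables were passed,
    # regardless of which one finishes first.
    return await asyncio.gather(*(bounded(label, delay) for label, delay in jobs))


# The first job is the slowest, yet it still comes back first.
results = asyncio.run(fan_out([("first", 0.03), ("second", 0.02), ("third", 0.01)]))
```

Because the result order follows the input order, concatenating the per-location frames gives the same output on every run.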
- dataretrieval.wateruse._next_page_url(response: Response) → str | None
Return the absolute URL of the next page, or None if this is the last.
Reads the standard Link: <...>; rel="next" header (parsed by httpx into response.links). A next link served against the bare water.usgs.gov host is normalized to the public api.water.usgs.gov gateway so the follow-up request reaches the API.
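A rough sketch of the cursor lookup and host normalization follows. It parses the raw header with a standard-library regular expression instead of httpx's response.links, and the URL path in the usage note is made up for illustration.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Matches one `<url>; rel="next"` entry in a Link header value.
_NEXT_LINK = re.compile(r'<([^>]+)>\s*;\s*rel="?next"?')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the next-page URL from a Link header value, or None."""
    match = _NEXT_LINK.search(link_header or "")
    if match is None:
        return None  # no rel="next" entry: this was the last page
    parts = urlsplit(match.group(1))
    # Rewrite the bare host to the public API gateway.
    if parts.netloc == "water.usgs.gov":
        parts = parts._replace(netloc="api.water.usgs.gov")
    return urlunsplit(parts)
```

For example, a header value of `<https://water.usgs.gov/example?cursor=abc>; rel="next"` (a hypothetical path) yields a URL on api.water.usgs.gov, while an empty or missing header yields None.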
- dataretrieval.wateruse._nwdc_error_detail(response: Response) → str | None
Pull the detail message out of an NWDC JSON error envelope, if any.
The NWDC reports errors as {"detail": "Invalid model name: ..."}. Passed to _raise_for_status() as detail_from so the service’s wording surfaces in the typed error message.
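Extracting the message from such an envelope might look like the sketch below. It works on the response body as text rather than on an httpx Response, and the function name error_detail is chosen only for this example.

```python
import json
from typing import Optional


def error_detail(body: str) -> Optional[str]:
    """Return the "detail" string from a JSON error body, or None."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all (e.g. an HTML error page from a proxy).
        return None
    if isinstance(payload, dict):
        detail = payload.get("detail")
        if isinstance(detail, str):
            return detail
    return None
```

Returning None for anything that is not the expected envelope lets the caller fall back to a generic status-code message.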
- dataretrieval.wateruse._read_csv_page(response: Response) → DataFrame
Parse one CSV page; huc12_id stays a string to keep leading zeros.
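The leading-zero issue is easy to reproduce with pandas. The sketch below uses a made-up one-row CSV page (the column names follow this documentation; the value is illustrative) and assumes the string type is forced with the dtype argument of pandas.read_csv, which may differ from what the module does internally.

```python
import io

import pandas as pd

# Made-up CSV page; the numeric value is illustrative, not real data.
csv_text = "huc12_id,year_month,pswdtot_mgd\n010900020502,2020-01,0.5\n"

# Default inference reads the HUC12 as an integer and drops the leading zero.
naive = pd.read_csv(io.StringIO(csv_text))

# Forcing a string dtype keeps the identifier intact.
page = pd.read_csv(io.StringIO(csv_text), dtype={"huc12_id": str})
```

Here naive holds the integer 10900020502, while page holds the 12-character string "010900020502".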
- dataretrieval.wateruse._resolve_locations(state: str | int | Iterable[str | int] | None, county: str | Iterable[str] | None, huc: str | Iterable[str] | None) → list[str]
Build the NWDC location=<type>:<id> value(s) from the selectors.
Exactly one of state/county/huc must be given; each may be a single value or a list. state is normalized to the two-letter postal code that stateCd requires; county is a five-digit FIPS code; and a huc code’s length selects its level (huc2…huc12). Returns one location string per value, and the caller issues one request per location.
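The "exactly one selector" rule can be sketched on its own, leaving out the construction of the location strings. The helper name pick_selector is hypothetical and not part of the module.

```python
def pick_selector(state=None, county=None, huc=None):
    """Return (name, values) for the single selector that was supplied."""
    given = {
        name: value
        for name, value in (("state", state), ("county", county), ("huc", huc))
        if value is not None
    }
    if len(given) != 1:
        raise ValueError("Provide exactly one of state, county, or huc")
    ((name, value),) = given.items()
    # A bare string or int is a single value; anything else is a collection.
    values = [value] if isinstance(value, (str, int)) else list(value)
    return name, values
```

Each entry in the returned list would then map to one location string and therefore one request.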
- dataretrieval.wateruse._validate_county(value: object) → str
Validate and normalize a five-digit state+county FIPS code.
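A five-digit check of this kind could be written as below. This is a sketch under the assumption that normalization means stripping surrounding whitespace; the real helper may normalize differently.

```python
def validate_county(value: object) -> str:
    """Return a five-digit state+county FIPS code, or raise ValueError."""
    code = str(value).strip()
    if len(code) != 5 or not (code.isascii() and code.isdigit()):
        raise ValueError(
            f"county must be a five-digit FIPS code, got {value!r}"
        )
    return code
```

Note that an integer such as 1001 fails this check: its leading zero is already lost, which is one reason to pass county codes as strings.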
- dataretrieval.wateruse._validate_huc(value: object) → str
Validate a HUC code (even length 2-12 digits; level set by length).
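The length-to-level rule can be illustrated with a short sketch. The helper names are hypothetical, and the level labels ("huc2" through "huc12") follow the naming used in this documentation.

```python
VALID_HUC_LENGTHS = (2, 4, 6, 8, 10, 12)


def validate_huc(value: object) -> str:
    """Return the HUC code unchanged if it is 2-12 digits of even length."""
    code = str(value).strip()
    if not (code.isascii() and code.isdigit()) or len(code) not in VALID_HUC_LENGTHS:
        raise ValueError(
            f"HUC must be digits of even length 2-12, got {value!r}"
        )
    return code


def huc_level(value: object) -> str:
    """Derive the level label from the code's length, e.g. 8 digits -> 'huc8'."""
    return f"huc{len(validate_huc(value))}"
```

So "04" maps to a HUC2 region, "07070005" to a HUC8 subbasin, and "010900020502" to a single HUC12.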
- dataretrieval.wateruse.get_wateruse(model: str, variable: str | Iterable[str] | None = None, state: str | int | Iterable[str | int] | None = None, county: str | Iterable[str] | None = None, huc: str | Iterable[str] | None = None, time_resolution: str | None = None, start_date: str | None = None, end_date: str | None = None, intersection: str = 'overlap', limit: int = 600, ssl_check: bool = True) → tuple[DataFrame, BaseMetadata]
Get USGS water-use data from the NWDC web service.
Retrieves modeled water-use estimates from the USGS National Water Availability Assessment Data Companion. The area is given as exactly one of state, county, or huc; results are always returned on a HUC12 grid, in a long (tidy) frame with one row per HUC12 and time step. Large areas (e.g. a whole region or a populous state) are served across multiple pages, which this function follows transparently and concatenates into one frame.
Each selector also accepts a list of values. The NWDC queries one area per request, so a list is fanned out into one request per value (up to MAX_CONCURRENT_REQUESTS in parallel), and the results are concatenated in the order given.
- Parameters:
model (string) – Water-use category to query. See MODELS for the available options (e.g. "wu-public-supply-wd"). The full catalog of models and their variables is at https://water.usgs.gov/nwaa-data/.
variable (string or iterable of strings, optional) – One or more variable IDs within model (e.g. "pswdtot" for total public-supply withdrawals, or ["pswdgw", "pswdsw"] for the groundwater and surface-water components). Multiple variables are comma-joined into a single request. The service requires at least one variable; omitting it returns a 400 listing the model’s valid variable IDs (surfaced as a DataRetrievalError).
state (string, int, or iterable, optional) – One or more US states/territories to query. Each accepts a full name ("Wisconsin"), a two-letter postal code ("WI"), or a two-digit ANSI/FIPS code ("55" or 55), mirroring dataretrieval.ngwmn.get_sites().
county (string or iterable, optional) – One or more five-digit county FIPS codes (state FIPS + county FIPS), e.g. "55025" for Dane County, Wisconsin.
huc (string or iterable, optional) – One or more hydrologic unit codes. Each code’s level is taken from its length: a 2-digit code queries a HUC2 region, 8-digit a HUC8 subbasin, 12-digit a single HUC12, and so on (even lengths 2-12, e.g. "04", "07070005", "010900020502").
Provide exactly one of state, county, or huc (each may be a single value or a list).
time_resolution (string, optional) – Temporal resolution: "monthly", "annualcy" (annual, calendar year), or "annualwy" (annual, water year). See TIME_RESOLUTIONS.
start_date (string, optional) – Start of the query window, formatted "YYYY" for annual data or "YYYY-MM" for monthly data.
end_date (string, optional) – End of the query window, in the same format as start_date.
intersection (string, optional) – How to select HUC12s that straddle the queried-area boundary: "overlap" (any overlap, the default) or "envelop" (fully enclosed).
limit (int, optional) – Maximum number of HUC12s returned per page. Queries spanning more than limit HUC12s are split across pages and reassembled. Default 600.
ssl_check (bool, optional) – If True (default), verify SSL certificates; set False to skip verification (e.g. behind a TLS-intercepting proxy).
- Returns:
df (pandas.DataFrame) – Water-use estimates in long form: a huc12_id column (string, leading zeros preserved), a time column (year_month for monthly data or year for annual data), and one value column per requested variable (suffixed with its unit, e.g. pswdtot_mgd for million gallons per day).
md (dataretrieval.utils.BaseMetadata) – Metadata describing the request (URL, query time, response headers).
- Raises:
ValueError – If not exactly one of state, county, or huc is given, or a given selector is malformed (an unrecognized state, a county code that is not five digits, or a HUC of invalid length).
DataRetrievalError – On an HTTP error response, the typed subclass for the status (see dataretrieval.exceptions.error_for_status()); or NetworkError on a connection-level failure (timeout, DNS).
Examples
>>> from dataretrieval import wateruse
>>> df, md = wateruse.get_wateruse(
...     model="wu-public-supply-wd",
...     variable=["pswdtot", "pswdgw", "pswdsw"],
...     state="RI",
...     start_date="2020-01",
...     time_resolution="monthly",
... )
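Once a long-form frame is in hand, ordinary pandas operations apply. The sketch below builds a small made-up frame with the column layout described under Returns (the HUC12 IDs and values are illustrative, not real estimates) and shows two common reshapes.

```python
import pandas as pd

# Made-up frame in the documented long layout; values are illustrative only.
df = pd.DataFrame(
    {
        "huc12_id": ["010900020502", "010900020502", "010900020503", "010900020503"],
        "year_month": ["2020-01", "2020-02", "2020-01", "2020-02"],
        "pswdtot_mgd": [1.0, 2.0, 3.0, 4.0],
    }
)

# Area-wide total per month: sum the per-HUC12 rates for each time step.
monthly_total = df.groupby("year_month")["pswdtot_mgd"].sum()

# Wide layout: one row per HUC12, one column per month.
wide = df.pivot(index="huc12_id", columns="year_month", values="pswdtot_mgd")
```

The long layout keeps grouping and filtering simple; pivoting to wide is mainly useful for display or for joining against HUC12 geometries.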