dataretrieval.wqp

Tool for downloading data from the Water Quality Portal (https://waterqualitydata.us)

See https://waterqualitydata.us/webservices_documentation for API reference

Todo

  • implement other services like Organization, Activity, etc.

class dataretrieval.wqp.WQP_Metadata(response: httpx.Response, **parameters: Any)

Metadata class for WQP service, derived from BaseMetadata.

url

Response url

Type:

str

query_time

Response elapsed time

Type:

datetime.timedelta

header

Response headers

Type:

httpx.Headers

comment

WQP does not return comments.

Type:

None

site_info

Site information (via what_sites) if the query included a siteid.

Type:

tuple[pd.DataFrame, WQP_Metadata] | None

__init__(response: httpx.Response, **parameters: Any) → None

Generates a standard set of metadata informed by the response with specific metadata for WQP data.

Parameters:
  • response (httpx.Response) – Response object from the httpx module.

  • parameters (dict) – Unpacked dictionary of the parameters supplied in the request

property site_info: tuple[DataFrame, WQP_Metadata] | None

Site information for the query.

Populated (via dataretrieval.wqp.what_sites()) when the query included a siteid (the WQP site identifier, e.g. "USGS-05586100"); None otherwise.

Returns:

tuple[DataFrame, WQP_Metadata] | None

dataretrieval.wqp._check_kwargs(kwargs: dict[str, Any]) → dict[str, Any]

Private function to check kwargs for unsupported parameters.

dataretrieval.wqp._is_code_column(name: str) → bool

True if a WQP column name denotes a code/identifier whose leading zeros are significant and must be preserved as str (HUCs, parameter codes, FIPS codes): the name ends with “code” or contains “identifier”, “huc”, or “fips”.
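The naming rule above can be sketched as a small predicate. This is an illustration, not the library's implementation: the function name is hypothetical, the sample column names other than USGSpcode are only examples, and case-insensitive matching is an assumption made so that mixed-case names are caught.

```python
def looks_like_code_column(name: str) -> bool:
    """Hypothetical restatement of the rule: the name ends with "code",
    or contains "identifier", "huc", or "fips"."""
    lowered = name.lower()  # assumption: matching ignores case
    return lowered.endswith("code") or any(
        token in lowered for token in ("identifier", "huc", "fips")
    )


# Sample column names, used here only for illustration.
for column in (
    "USGSpcode",
    "HUCEightDigitCode",
    "MonitoringLocationIdentifier",
    "ResultMeasureValue",
):
    print(column, looks_like_code_column(column))
```

Only the last name fails the check, so it would be left to normal type inference.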

dataretrieval.wqp._legacy_only_url(service: str, legacy: bool) → str

URL builder for WQP services that have no WQX3.0 equivalent.

When legacy=False is passed to one of these helpers, a UserWarning explains the fallback to the legacy profile, and the legacy DeprecationWarning that wqp_url would otherwise emit is suppressed. That warning's message claims that setting legacy=False removes it, which is not true for endpoints that have no WQX3.0 alternative.
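The warn-and-suppress behaviour described above can be sketched with the standard warnings module. Everything named here is hypothetical: legacy_url stands in for a builder that always emits a DeprecationWarning, the example.invalid URL is a placeholder, and the warning texts are invented. It shows one way to get the described behaviour, not how the library does it.

```python
import warnings


def legacy_url(service: str) -> str:
    """Hypothetical stand-in for a legacy URL builder that always warns."""
    warnings.warn(
        "legacy profile is deprecated; pass legacy=False",
        DeprecationWarning,
        stacklevel=2,
    )
    return f"https://example.invalid/{service}"  # placeholder URL


def legacy_only_url(service: str, legacy: bool) -> str:
    """Build a URL for a service that only exists in the legacy API."""
    if legacy:
        # The caller asked for legacy, so the deprecation notice is accurate.
        return legacy_url(service)
    # The caller asked for WQX3.0, which this service lacks: explain the
    # fallback, then hide the now-misleading deprecation message.
    warnings.warn(
        f"{service} has no WQX3.0 equivalent; using the legacy profile",
        UserWarning,
        stacklevel=2,
    )
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return legacy_url(service)
```

The catch_warnings block restores the previous filters on exit, so the suppression does not leak into the caller's warning configuration.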

dataretrieval.wqp._read_wqp_csv(text: str) → DataFrame

Read a WQP CSV, forcing code/identifier columns to str.

WQP returns codes with significant leading zeros — HUCs, parameter codes (USGSpcode), FIPS state/county codes. A bare read_csv infers those as int/float and silently drops the zeros ("00060" -> 60, HUC8 "07090002" -> 7090002). Read the header first, then re-read with dtype=str for every column that _is_code_column() flags, so the zeros survive.
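A minimal sketch of the two-pass read described above, assuming pandas is installed. The helper names and the sample CSV are hypothetical, and the column check is a simplified, case-insensitive restatement of the rule documented for _is_code_column(), so details may differ from the real implementation.

```python
import io

import pandas as pd


def _flag_code_column(name: str) -> bool:
    # Simplified restatement of the documented rule (case-insensitivity assumed).
    lowered = name.lower()
    return lowered.endswith("code") or any(
        token in lowered for token in ("identifier", "huc", "fips")
    )


def read_csv_preserving_codes(text: str) -> pd.DataFrame:
    # Pass 1: read only the header row to learn the column names.
    header = pd.read_csv(io.StringIO(text), nrows=0)
    # Pass 2: re-read, forcing str for flagged columns so zeros survive.
    dtypes = {col: str for col in header.columns if _flag_code_column(col)}
    return pd.read_csv(io.StringIO(text), dtype=dtypes)


# Invented sample data for illustration only.
sample = "USGSpcode,HUCEightDigitCode,ResultMeasureValue\n00060,07090002,12.5\n"

naive = pd.read_csv(io.StringIO(sample))  # codes are inferred as integers
safe = read_csv_preserving_codes(sample)  # codes stay as strings

print(naive.loc[0, "USGSpcode"], safe.loc[0, "USGSpcode"])
```

Columns that are not flagged, such as the measurement value, still go through normal type inference.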

dataretrieval.wqp._what(service: str, *, ssl_check: bool, legacy: bool, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Shared implementation for the what_* metadata search functions.

service is the WQP service name (e.g. "Station"). Services with a WQX3.0 equivalent (those in services_wqx3) use wqx3_url() when legacy=False and wqp_url() otherwise; legacy-only services route through _legacy_only_url(), which warns and falls back to the legacy profile. The CSV response is parsed via _read_wqp_csv().

dataretrieval.wqp.get_results(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Query the WQP for results.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool, optional) – Check the SSL certificate. Default is True.

  • legacy (bool, optional) – Return the legacy WQX data profile. Default is True.

  • dataProfile (string, optional) – Specifies the data fields returned by the query. WQX3.0 profiles include ‘fullPhysChem’, ‘narrow’, and ‘basicPhysChem’. Legacy profiles include ‘resultPhysChem’, ‘biological’, and ‘narrowResult’. For WQX3.0 queries (legacy=False), defaults to ‘fullPhysChem’; legacy queries have no default profile.

  • siteid (string) – Monitoring location identified by agency code, a hyphen, and identification number (Example: “USGS-05586100”).

  • statecode (string) – US state FIPS code (Example: Illinois is “US:17”).

  • countycode (string) – US county FIPS code.

  • huc (string) – Eight-digit hydrologic unit (HUC), delimited by semicolons.

  • bBox (string) – Search bounding box (Example: bBox=-92.8,44.2,-88.9,46.0)

  • lat (string) – Radial-search central latitude in WGS84 decimal degrees.

  • long (string) – Radial-search central longitude in WGS84 decimal degrees.

  • within (string) – Radial-search distance in decimal miles.

  • pCode (string) – Five-digit USGS parameter code, delimited by semicolons. NWIS only.

  • startDateLo (string) – Date of earliest desired data-collection activity, expressed as ‘MM-DD-YYYY’.

  • startDateHi (string) – Date of last desired data-collection activity, expressed as ‘MM-DD-YYYY’.

  • characteristicName (string) – One or more case-sensitive characteristic names, separated by semicolons (https://www.waterqualitydata.us/public_srsnames/).

  • mimeType (string) – Output format. Only ‘csv’ is supported at this time.
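Several of the parameters above expect specific string formats: dates as ‘MM-DD-YYYY’ and multi-value fields delimited by semicolons. The sketch below shows one way to assemble such keyword arguments from Python values. The helper names wqp_date and semicolon_list are hypothetical and are not part of dataretrieval.

```python
from datetime import date
from typing import Iterable


def wqp_date(d: date) -> str:
    """Format a date the way startDateLo/startDateHi expect it (MM-DD-YYYY)."""
    return d.strftime("%m-%d-%Y")


def semicolon_list(values: Iterable[str]) -> str:
    """Join several values into one semicolon-delimited parameter string."""
    return ";".join(values)


params = {
    "statecode": "US:17",
    # Keep parameter codes as strings so leading zeros are not lost.
    "pCode": semicolon_list(["00060", "00065"]),
    "startDateLo": wqp_date(date(2019, 12, 30)),
    "startDateHi": wqp_date(date(2020, 1, 1)),
}
print(params)

# The dictionary can then be unpacked into a query (network call, not run here):
# df, md = dataretrieval.wqp.get_results(**params)
```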

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query. For each <prefix>Date / <prefix>Time / <prefix>TimeZone triplet in the response (legacy WQP uses <prefix>Time/Time and <prefix>Time/TimeZoneCode), an additional <prefix>DateTime column is appended holding a UTC Timestamp. Original triplet columns are preserved; unrecognized timezone codes yield NaT. Rows are sorted by ActivityStartDateTime (or Activity_StartDateTime for WQX3 responses) when present.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom dataretrieval metadata object pertaining to the query.
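To illustrate the idea behind the appended <prefix>DateTime columns, the sketch below combines one date, time, and timezone-code triplet into a UTC timestamp using only the standard library. It is not the library's code: the function name is hypothetical, the offset table is a small illustrative subset, the input formats (YYYY-MM-DD and HH:MM:SS) are assumptions, and None stands in for pandas NaT.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Illustrative subset of timezone codes and their UTC offsets in hours.
UTC_OFFSET_HOURS = {
    "UTC": 0,
    "EST": -5, "EDT": -4,
    "CST": -6, "CDT": -5,
    "MST": -7, "MDT": -6,
    "PST": -8, "PDT": -7,
}


def to_utc(date_str: str, time_str: str, tz_code: str) -> Optional[datetime]:
    """Combine a date/time/timezone-code triplet into a UTC datetime.

    Returns None for an unrecognized timezone code.
    """
    offset = UTC_OFFSET_HOURS.get(tz_code)
    if offset is None:
        return None
    local = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
    local = local.replace(tzinfo=timezone(timedelta(hours=offset)))
    return local.astimezone(timezone.utc)


print(to_utc("2020-01-01", "10:30:00", "CST"))
print(to_utc("2020-01-01", "10:30:00", "XYZ"))
```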

Examples

>>> # Get results within a radial distance of a point
>>> df, md = dataretrieval.wqp.get_results(
...     lat="44.2", long="-88.9", within="0.5"
... )

>>> # Get results within a bounding box
>>> df, md = dataretrieval.wqp.get_results(bBox="-92.8,44.2,-88.9,46.0")

>>> # Get results using a new WQX3.0 profile
>>> df, md = dataretrieval.wqp.get_results(
...     legacy=False, siteid="UTAHDWQ_WQX-4993795", dataProfile="narrow"
... )

dataretrieval.wqp.what_activities(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for activities within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool, optional) – Check the SSL certificate. Default is True.

  • legacy (bool, optional) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get activities within Washington D.C.
>>> # during a specific time period
>>> df, md = dataretrieval.wqp.what_activities(
...     statecode="US:11",
...     startDateLo="12-30-2019",
...     startDateHi="01-01-2020",
... )

>>> # Get activities within Washington D.C.
>>> # using the WQX3.0 profile during a specific time period
>>> df, md = dataretrieval.wqp.what_activities(
...     legacy=False,
...     statecode="US:11",
...     startDateLo="12-30-2019",
...     startDateHi="01-01-2020",
... )

dataretrieval.wqp.what_activity_metrics(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for activity metrics within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool) – Check the SSL certificate. Default is True.

  • legacy (bool) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get activity metrics for a state (North Dakota in this case)
>>> # within a set time period
>>> df, md = dataretrieval.wqp.what_activity_metrics(
...     statecode="US:38",
...     startDateLo="07-01-2006",
...     startDateHi="12-01-2006",
... )

dataretrieval.wqp.what_detection_limits(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for result detection limits within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool) – Check the SSL certificate. Default is True.

  • legacy (bool) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get detection limits for Nitrite measurements in Rhode Island
>>> # between specific dates
>>> df, md = dataretrieval.wqp.what_detection_limits(
...     statecode="US:44",
...     characteristicName="Nitrite",
...     startDateLo="01-01-2021",
...     startDateHi="02-20-2021",
... )

dataretrieval.wqp.what_habitat_metrics(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for habitat metrics within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool) – Check the SSL certificate. Default is True.

  • legacy (bool) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get habitat metrics for a state (Rhode Island in this case)
>>> df, md = dataretrieval.wqp.what_habitat_metrics(statecode="US:44")

dataretrieval.wqp.what_organizations(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for organizations within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool, optional) – Check the SSL certificate. Default is True.

  • legacy (bool, optional) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get all organizations in the WQP
>>> df, md = dataretrieval.wqp.what_organizations()

dataretrieval.wqp.what_project_weights(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for project weights within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool) – Check the SSL certificate. Default is True.

  • legacy (bool) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get project weights for a state (North Dakota in this case)
>>> # within a set time period
>>> df, md = dataretrieval.wqp.what_project_weights(
...     statecode="US:38",
...     startDateLo="01-01-2006",
...     startDateHi="01-01-2009",
... )

dataretrieval.wqp.what_projects(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for projects within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool, optional) – Check the SSL certificate. Default is True.

  • legacy (bool, optional) – Return the legacy WQX data profile. Default is True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get projects within a HUC region
>>> df, md = dataretrieval.wqp.what_projects(huc="19")

dataretrieval.wqp.what_sites(ssl_check: bool = True, legacy: bool = True, **kwargs: Any) → tuple[DataFrame, WQP_Metadata]

Search WQP for sites within a region with specific data.

Any WQP API parameter can be passed as a keyword argument to this function. More information about the API can be found at: https://www.waterqualitydata.us/#advanced=true or the beta version of the WQX3.0 API at: https://www.waterqualitydata.us/beta/#mimeType=csv&providers=NWIS&providers=STORET or the Swagger documentation at: https://www.waterqualitydata.us/data/swagger-ui/index.html?docExpansion=none&url=/data/v3/api-docs#/

Parameters:
  • ssl_check (bool, optional) – Check the SSL certificate. Default is True.

  • legacy (bool, optional) – If True, returns the legacy WQX data profile and warns the user of the issues associated with it. If False, returns the new WQX3.0 profile, if available. Defaults to True.

  • **kwargs (optional) – Accepts the same parameters as dataretrieval.wqp.get_results

Returns:

  • df (pandas.DataFrame) – Formatted data returned from the API query.

  • md (dataretrieval.wqp.WQP_Metadata) – Custom metadata object pertaining to the query.

Examples

>>> # Get sites within a radial distance of a point
>>> df, md = dataretrieval.wqp.what_sites(
...     lat="44.2", long="-88.9", within="2.5"
... )

dataretrieval.wqp.wqp_url(service: str) → str

Construct the WQP URL for a given service.

dataretrieval.wqp.wqx3_url(service: str) → str

Construct the WQP URL for a given WQX 3.0 service.