dataretrieval.utils

Useful utilities for data munging.

class dataretrieval.utils.BaseMetadata(response)[source]

Base class for metadata.

url

Response url

Type:

str

query_time

Response elapsed time

Type:

datetime.timedelta

header

Response headers

Type:

requests.structures.CaseInsensitiveDict

__init__(response) → None[source]

Generates a standard set of metadata informed by the response.

Parameters:

response (Response) – Response object from requests module

Returns:

md – A dataretrieval.utils.BaseMetadata object.

Return type:

dataretrieval.utils.BaseMetadata

__repr__() → str[source]

Return repr(self).

__weakref__

list of weak references to the object
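
As a rough illustration of the attributes listed above, the sketch below copies url, elapsed, and headers from a response-like object. MetadataSketch and fake_response are hypothetical names used only here; this is not the dataretrieval implementation, and a real requests.Response would be passed in practice.

```python
from datetime import timedelta
from types import SimpleNamespace


class MetadataSketch:
    """Illustrative stand-in, not the dataretrieval implementation."""

    def __init__(self, response) -> None:
        # Copy the three documented attributes off the response object.
        self.url = response.url
        self.query_time = response.elapsed
        self.header = response.headers

    def __repr__(self) -> str:
        return f"{type(self).__name__}(url={self.url!r})"


# A fake response exposing the attribute names that requests.Response uses.
fake_response = SimpleNamespace(
    url="https://example.com/data?site=01",
    elapsed=timedelta(milliseconds=250),
    headers={"Content-Type": "text/csv"},
)

md = MetadataSketch(fake_response)
print(md.url)
print(md.query_time.total_seconds())
```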

exception dataretrieval.utils.NoSitesError(url)[source]

Custom error class used when the selection criteria return no sites/data.

__init__(url)[source]
__str__()[source]

Return str(self).

__weakref__

list of weak references to the object
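
A minimal sketch of how an error of this kind might carry the failing URL and be caught by a caller. NoSitesErrorSketch is a hypothetical stand-in, and the message wording is an assumption made for the example rather than the text the library produces.

```python
class NoSitesErrorSketch(Exception):
    """Illustrative stand-in for an error that remembers the failing URL."""

    def __init__(self, url):
        super().__init__(url)
        self.url = url

    def __str__(self):
        # The wording of this message is an assumption made for the example.
        return f"No sites/data found for the query: {self.url}"


try:
    raise NoSitesErrorSketch("https://example.com/data?site=none")
except NoSitesErrorSketch as err:
    message = str(err)
    print(message)
```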

dataretrieval.utils._attach_datetime_columns(df: DataFrame) → DataFrame[source]

Add <prefix>DateTime UTC columns for any Date/Time/TimeZone triplets and sort the frame by the activity-start datetime.

Detects two naming patterns that appear in USGS Samples and Water Quality Portal CSV responses:

  • WQX3: <prefix>Date, <prefix>Time, <prefix>TimeZone

  • Legacy WQP: <prefix>Date, <prefix>Time/Time, <prefix>Time/TimeZoneCode

For every triplet present, a new <prefix>DateTime column is appended holding a UTC Timestamp (offsets resolved via dataretrieval.codes.tz). The original Date/Time/TimeZone columns are left intact, and an existing <prefix>DateTime column is never overwritten.

Rows are sorted (and the index reset) by the canonical activity-start datetime when present, either Activity_StartDateTime (WQX3) or ActivityStartDateTime (legacy WQP), falling back to the first detected *Date column. This mirrors the end-of-pipeline sort in R dataRetrieval’s importWQP.R.

Parameters:

df (pandas.DataFrame) – DataFrame returned from a Samples or WQP CSV endpoint.

Returns:

df – A new DataFrame with derivable <prefix>DateTime columns appended and rows sorted by the activity-start datetime (if any date column was detected).

Return type:

pandas.DataFrame
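
The triplet detection described above can be sketched on plain column names. This is an illustration under stated assumptions, not the library's code: find_datetime_triplets is a hypothetical helper, the sample column names are made up to match the two naming patterns, and only the detection step is shown (no datetime conversion or sorting).

```python
def find_datetime_triplets(columns):
    """Return {prefix: (date_col, time_col, tz_col)} for complete triplets."""
    present = set(columns)
    # (time suffix, time zone suffix) for each documented naming pattern.
    patterns = [
        ("Time", "TimeZone"),                # WQX3
        ("Time/Time", "Time/TimeZoneCode"),  # legacy WQP
    ]
    triplets = {}
    for col in columns:
        if not col.endswith("Date"):
            continue
        prefix = col[: -len("Date")]
        for time_suffix, tz_suffix in patterns:
            time_col, tz_col = prefix + time_suffix, prefix + tz_suffix
            has_all = time_col in present and tz_col in present
            # An existing <prefix>DateTime column is left alone.
            already_built = prefix + "DateTime" in present
            if has_all and not already_built:
                triplets[prefix] = (col, time_col, tz_col)
                break
    return triplets


columns = [
    "Activity_StartDate", "Activity_StartTime", "Activity_StartTimeZone",
    "ActivityEndDate", "ActivityEndTime/Time", "ActivityEndTime/TimeZoneCode",
    "LabInfo_AnalysisStartDate",  # no matching time or time zone column
]
found = find_datetime_triplets(columns)
for prefix, cols in found.items():
    print(prefix, cols)
```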

dataretrieval.utils._build_utc_datetime(date_series: Series, time_series: Series, tz_series: Series) → Series[source]

Combine date + time + tz-abbreviation columns into a UTC pandas Series.

Unknown timezone codes (and rows missing any of the three values) yield NaT. The input columns are not mutated.
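
The same idea can be sketched for single values with the standard library. The offset table below is a small hypothetical sample, not the contents of dataretrieval.codes.tz, and None stands in for NaT; build_utc_datetime is an illustrative name only.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical offset table (hours from UTC) for this example only.
UTC_OFFSET_HOURS = {"UTC": 0, "EST": -5, "EDT": -4, "CST": -6, "PST": -8}


def build_utc_datetime(date_str, time_str, tz_code):
    """Return a UTC datetime, or None if a value is missing or unknown."""
    if not date_str or not time_str or tz_code not in UTC_OFFSET_HOURS:
        return None
    local = datetime.fromisoformat(f"{date_str}T{time_str}")
    offset = timedelta(hours=UTC_OFFSET_HOURS[tz_code])
    # Local time = UTC + offset, so subtract the offset to recover UTC.
    return (local - offset).replace(tzinfo=timezone.utc)


rows = [
    ("2024-03-01", "08:30:00", "EST"),
    ("2024-03-01", "08:30:00", "XYZ"),  # unknown code
    ("2024-03-01", None, "EST"),        # missing time
]
results = [build_utc_datetime(*row) for row in rows]
for row, result in zip(rows, results):
    print(row, "->", result)
```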

dataretrieval.utils.format_datetime(df, date_field, time_field, tz_field)[source]

Creates a datetime field from separate date, time, and time zone fields.

Assumes ISO 8601.

Parameters:
  • df (pandas.DataFrame) – A data frame containing date, time, and timezone fields.

  • date_field (string) – Name of date column in df.

  • time_field (string) – Name of time column in df.

  • tz_field (string) – Name of time zone column in df.

Returns:

df – The data frame with a formatted ‘datetime’ column

Return type:

pandas.DataFrame

dataretrieval.utils.query(url, payload, delimiter=',', ssl_check=True)[source]

Send a query.

Wrapper for requests.get that handles errors, converts listed query parameters to comma separated strings, and returns response.

Parameters:
  • url (string) – URL to query

  • payload (dict) – query parameters passed to requests.get

  • delimiter (string) – delimiter to use with lists

  • ssl_check (bool) – If True, verify SSL certificates; if False, skip verification. Default is True.

Returns:

response – The response returned by the requests.get call.

Return type:

requests.Response
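
The conversion of list-valued query parameters into delimited strings can be sketched as follows. flatten_payload is a hypothetical helper, and the parameter names and site numbers are placeholders; the sketch only prepares the parameters and builds a query string, without sending a request or handling errors.

```python
from urllib.parse import urlencode


def flatten_payload(payload, delimiter=","):
    """Join list or tuple values into delimited strings; keep others as-is."""
    flat = {}
    for key, value in payload.items():
        if isinstance(value, (list, tuple)):
            flat[key] = delimiter.join(str(item) for item in value)
        else:
            flat[key] = value
    return flat


payload = {"sites": ["01491000", "01645000"], "format": "rdb"}
flat = flatten_payload(payload)
# urlencode percent-encodes the comma that joins the list entries.
query_string = urlencode(flat)
print(flat)
print(query_string)
```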

dataretrieval.utils.to_str(listlike, delimiter=',')[source]

Translates list-like objects into strings.

Parameters:
  • listlike (list-like object) – An object that is a list, or list-like (e.g., pandas.core.series.Series)

  • delimiter (string, optional) – The delimiter that is placed between entries in listlike when it is turned into a string. Default value is a comma.

Returns:

listlike – The listlike object as string separated by the delimiter

Return type:

string

Examples

>>> dataretrieval.utils.to_str([1, "a", 2])
'1,a,2'

>>> dataretrieval.utils.to_str([0, 10, 42], delimiter="+")
'0+10+42'