USGS dataretrieval Python Package get_iv() Examples

This notebook provides examples of using the Python dataretrieval package to retrieve instantaneous values data for a United States Geological Survey (USGS) monitoring site. The dataretrieval package provides a collection of functions to get data from the USGS National Water Information System (NWIS) and other online sources of hydrology and water quality data, including the United States Environmental Protection Agency (USEPA).

Install the Package

Use the following code to install the package if it is not already installed in your Jupyter Python environment.

[1]:
!pip install dataretrieval

Load the package so you can use it along with other packages used in this notebook.

[2]:
from datetime import date

from IPython.display import display

from dataretrieval import nwis
import dataretrieval.waterdata as waterdata

Basic Usage

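The dataretrieval package has several functions that allow you to retrieve data from different web services. The get_iv() function in the nwis module retrieves instantaneous streamflow data for a USGS monitoring site from NWIS. The code cells in this notebook request the same kind of data with waterdata.get_continuous(), which takes monitoring_location_id, parameter_code, and time arguments. get_iv() supports the following arguments: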

  • sites (string or list of strings): A list of USGS site identifiers for which to retrieve data.

  • parameterCd (string or list of strings): A list of USGS parameter codes for which to retrieve data.

  • start (string): The beginning date for a period for which to retrieve data. If the waterdata parameter startDt is supplied, it will overwrite the start parameter.

  • end (string): The ending date for a period for which to retrieve data. If the waterdata parameter endDt is supplied, it will overwrite the end parameter.
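The cells in this notebook pass the site, parameter, and period to waterdata.get_continuous() as monitoring_location_id, parameter_code, and time. That function expects site identifiers in 'AGENCY-ID' format, for example 'USGS-01646500', and the cells write the period as a single "start/end" string. The sketch below shows one way to prepare both values from a bare site number and two dates. The helper names and the regular expression are assumptions made for illustration; they are not part of dataretrieval, and the package's own validation pattern may differ.

```python
import re

# Assumed shape of an 'AGENCY-ID' identifier such as 'USGS-01646500'.
# This pattern is illustrative; the package's own check may differ.
AGENCY_ID_PATTERN = re.compile(r"[A-Za-z0-9]+-[A-Za-z0-9]+")


def to_monitoring_location_id(site, agency="USGS"):
    """Return site in 'AGENCY-ID' form, adding the agency prefix if it is missing."""
    site = str(site).strip()
    if AGENCY_ID_PATTERN.fullmatch(site):
        return site
    return f"{agency}-{site}"


def to_time_interval(start, end):
    """Join two date strings into the 'start/end' form passed as the time argument."""
    return f"{start}/{end}"


site = to_monitoring_location_id("10109000")
period = to_time_interval("2021-09-01", "2021-09-30")
print(site)
print(period)
```

The same helper can be applied to each entry of a list when requesting more than one site, for example [to_monitoring_location_id(s) for s in ["04024430", "04024000"]].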

Example 1: Get unit value data for a specific parameter at a USGS NWIS monitoring site between a begin and end date

[3]:
# Set the parameters needed for the web service call
# The site identifier must be in 'AGENCY-ID' format, e.g., 'USGS-01646500'
siteID = "USGS-10109000"  # LOGAN RIVER ABOVE STATE DAM, NEAR LOGAN, UT
parameterCode = "00060"  # Discharge
startDate = "2021-09-01"
endDate = "2021-09-30"

# Get the data
discharge = waterdata.get_continuous(
    monitoring_location_id=siteID, parameter_code=parameterCode, time=f"{startDate}/{endDate}"
)
print("Retrieved " + str(len(discharge[0])) + " data values.")

Interpreting the Result

The call returns a pair: a Pandas data frame and an associated metadata object, accessed in the cells below as discharge[0] and discharge[1]. The data frame contains the values for the observed variable and time period requested. The cells below read the observation times and values from its time and value columns.

Once you have the data frame, there are several useful things you can do to explore the data.

[4]:
# Display the data frame as a table
display(discharge[0])

Show the data types of the columns in the resulting data frame.

[5]:
print(discharge[0].dtypes)

Get summary statistics for the instantaneous streamflow values.

[6]:
discharge[0].describe()

Make a quick time series plot.

[7]:
ax = discharge[0].plot(x="time", y="value", style=".")
ax.set_xlabel("Date")
ax.set_ylabel("Streamflow (cfs)")

The other part of the result is a metadata object that contains information about the query that was executed to return the data. For example, you can access the URL that was assembled to retrieve the requested data from the USGS web service. The USGS web service responses contain a descriptive header that can be helpful in interpreting the contents of the response.
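A query URL like the one held in the metadata object can be split into its parameters to confirm which site and parameter code were actually requested. The sketch below uses only the Python standard library. The URL in it is a made-up placeholder, not a real USGS endpoint, and query_parameters is a hypothetical helper name; in the notebook you would pass discharge[1].url instead.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder URL for illustration only; it is not a real USGS endpoint.
# In the notebook, use discharge[1].url here.
example_url = (
    "https://example.invalid/items"
    "?monitoring_location_id=USGS-10109000&parameter_code=00060"
)


def query_parameters(url):
    """Return the query-string parameters of url as a dict mapping names to lists of values."""
    return parse_qs(urlsplit(url).query)


params = query_parameters(example_url)
print(params["monitoring_location_id"])
print(params["parameter_code"])
```

parse_qs returns a list for every parameter because a query string may repeat a name, so a single value appears as a one-element list.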

[8]:
print("The query URL used to retrieve the data from NWIS was: " + discharge[1].url)

Additional Examples

Example 2: Get unit values for an individual site and parameter between a start and end date.

NOTE: By default, start and end date are evaluated as local time, and the result is returned with the timestamps in the local time of the monitoring site.

[9]:
# The site identifier must be in 'AGENCY-ID' format
site_id = "USGS-05114000"
startDate = "2014-10-10"
endDate = "2014-10-10"

discharge2 = waterdata.get_continuous(
    monitoring_location_id=site_id, parameter_code=parameterCode, time=f"{startDate}/{endDate}"
)
print("Retrieved " + str(len(discharge2[0])) + " data values.")
display(discharge2[0])

Example 3: Get unit values for an individual site for today

[10]:
today = str(date.today())
discharge_today = waterdata.get_continuous(
    monitoring_location_id=site_id, parameter_code=parameterCode, time=f"{today}/{today}"
)
print("Retrieved " + str(len(discharge_today[0])) + " data values.")
display(discharge_today[0])

Example 4: Retrieve data using UTC times

NOTE: Adding ‘Z’ to the input time parameters indicates that they are in UTC rather than local time. The time stamps associated with the data returned are still in the local time of the USGS monitoring site.
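The UTC interval string passed as time in the next cell can also be built from a date object rather than typed by hand. The sketch below is one way to do that with the standard library; utc_day_interval is a hypothetical helper name, and the minute-level timestamp format simply mirrors the string used in this notebook.

```python
from datetime import date, datetime, time, timezone


def utc_day_interval(day):
    """Return a 'start/end' string covering one calendar day in UTC, to the minute."""
    start = datetime.combine(day, time(0, 0), tzinfo=timezone.utc)
    end = datetime.combine(day, time(23, 59), tzinfo=timezone.utc)
    fmt = "%Y-%m-%dT%H:%MZ"  # trailing 'Z' marks the timestamp as UTC
    return f"{start.strftime(fmt)}/{end.strftime(fmt)}"


interval = utc_day_interval(date(2014, 10, 10))
print(interval)  # 2014-10-10T00:00Z/2014-10-10T23:59Z
```

Passing date.today() instead of a fixed date gives the equivalent interval for the current day.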

[11]:
discharge_UTC = waterdata.get_continuous(
    monitoring_location_id=site_id,
    parameter_code=parameterCode,
    time="2014-10-10T00:00Z/2014-10-10T23:59Z",
)
print("Retrieved " + str(len(discharge_UTC[0])) + " data values.")
display(discharge_UTC[0])

Example 5: Get unit values for two sites, for a single parameter, between a start and end date

[12]:
# Each site identifier must be in 'AGENCY-ID' format
discharge_multisite = waterdata.get_continuous(
    monitoring_location_id=["USGS-04024430", "USGS-04024000"],
    parameter_code=parameterCode,
    time="2013-10-01/2013-10-01",
)
print("Retrieved " + str(len(discharge_multisite[0])) + " data values.")
display(discharge_multisite[0])

The following cell repeats the previous request. The multi_index=False option belongs to nwis.get_iv(); waterdata.get_continuous() does not take a multi_index argument, so the call is unchanged.

[13]:
# Each site identifier must be in 'AGENCY-ID' format
discharge_multisite = waterdata.get_continuous(
    monitoring_location_id=["USGS-04024430", "USGS-04024000"],
    parameter_code=parameterCode,
    time="2013-10-01/2013-10-01",
)
print("Retrieved " + str(len(discharge_multisite[0])) + " data values.")
display(discharge_multisite[0])