USGS dataretrieval Python Package get_discharge_measurements() Examples

This notebook provides examples of using the Python dataretrieval package to retrieve surface water discharge measurement data for a United States Geological Survey (USGS) monitoring site. The dataretrieval package provides a collection of functions to get data from the USGS National Water Information System (NWIS) and other online sources of hydrology and water quality data, including the United States Environmental Protection Agency (USEPA).

Install the Package

Use the following code to install the package if it is not already installed in your Jupyter Python environment.

[1]:
!pip install dataretrieval
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: dataretrieval in /home/runner/.local/lib/python3.10/site-packages (0.1.dev1+g3ba0c83)
Requirement already satisfied: requests in /home/runner/.local/lib/python3.10/site-packages (from dataretrieval) (2.32.3)
Requirement already satisfied: pandas==2.* in /home/runner/.local/lib/python3.10/site-packages (from dataretrieval) (2.2.3)
Requirement already satisfied: numpy>=1.22.4 in /home/runner/.local/lib/python3.10/site-packages (from pandas==2.*->dataretrieval) (2.1.2)
Requirement already satisfied: python-dateutil>=2.8.2 in /home/runner/.local/lib/python3.10/site-packages (from pandas==2.*->dataretrieval) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3/dist-packages (from pandas==2.*->dataretrieval) (2022.1)
Requirement already satisfied: tzdata>=2022.7 in /home/runner/.local/lib/python3.10/site-packages (from pandas==2.*->dataretrieval) (2024.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/runner/.local/lib/python3.10/site-packages (from requests->dataretrieval) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests->dataretrieval) (3.3)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3/dist-packages (from requests->dataretrieval) (1.26.5)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests->dataretrieval) (2020.6.20)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas==2.*->dataretrieval) (1.16.0)

Import the package, along with the other packages used in this notebook.

[2]:
from dataretrieval import nwis
from IPython.display import display

Basic Usage

The dataretrieval package has several functions that allow you to retrieve data from different web services. This example uses the get_discharge_measurements() function to retrieve surface water discharge measurements for a USGS monitoring site from NWIS. The function has the following arguments:

Arguments (Additional arguments, if supplied, will be used as query parameters)

  • sites (string or list of strings): One or more USGS site codes to retrieve data for. If the query parameter site_no is supplied, it will overwrite the sites parameter.

  • start (string): The beginning date of a period for which to retrieve measurements. If the query parameter begin_date is supplied, it will overwrite the start parameter.

  • end (string): The ending date of a period for which to retrieve measurements. If the query parameter end_date is supplied, it will overwrite the end parameter.
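
The override rule described above can be illustrated with a small sketch. The build_query() helper below is hypothetical and is not part of dataretrieval; it only assumes, as the argument descriptions suggest, that sites, start, and end correspond to the query parameters site_no, begin_date, and end_date, and that directly supplied query parameters take precedence.

```python
from urllib.parse import urlencode


def build_query(sites=None, start=None, end=None, **kwargs):
    """Hypothetical helper (not the library's code): merge the named
    arguments with any additional query parameters."""
    params = {"site_no": sites, "begin_date": start, "end_date": end}
    # Additional keyword arguments are applied last, so a directly
    # supplied query parameter overwrites the matching named argument.
    params.update(kwargs)
    # Drop anything that was never set.
    return {key: value for key, value in params.items() if value is not None}


# begin_date is supplied directly, so it replaces the value given as start.
query = build_query(sites="10109000", start="2019-01-01", begin_date="2020-01-01")
print(urlencode(query))
```

How the package assembles its request internally may differ; the sketch only shows the precedence that the argument descriptions state.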

Example 1: Get all of the surface water measurements for a single site

[3]:
measurements1 = nwis.get_discharge_measurements(sites="10109000")
print("Retrieved " + str(len(measurements1[0])) + " data values.")
Retrieved 946 data values.

Interpreting the Result

The result of calling the get_discharge_measurements() function is a pair of objects: a Pandas data frame, accessed as element [0], and an associated metadata object, accessed as element [1]. The data frame contains the discharge measurements for the time period requested.

Once you’ve got the data frame, there are several useful things you can do to explore the data.

Display the data frame as a table

[4]:
display(measurements1[0])
     agency_cd  site_no   measurement_nu  measurement_dt       tz_cd  q_meas_used_fg  party_nm  site_visit_coll_agency_cd  gage_height_va  discharge_va  measured_rating_diff  gage_va_change  gage_va_time  control_type_cd  discharge_cd
0    USGS       10109000  146             1951-09-06 09:40:00  MST    Yes             AFP       USGS                       0.94            17.0          Excellent             0.00            0.8           Clear            NONE
1    USGS       10109000  148             1951-12-13 12:10:00  MST    Yes             BSR       USGS                       0.88            12.5          Unspecified           0.00            0.7           Clear            NONE
2    USGS       10109000  151             1952-04-23 10:25:00  MST    Yes             AFP       USGS                       1.80            203.0         Good                  -0.01           1.1           Clear            NONE
3    USGS       10109000  152             1952-05-22 01:35:00  MST    Yes             BSR       USGS                       2.54            506.0         Excellent             0.00            0.2           Clear            NONE
4    USGS       10109000  153             1952-05-27 09:45:00  MST    Yes             WNJ       USGS                       2.80            652.0         Good                  -0.03           1.0           Clear            NONE
...  ...        ...       ...             ...                  ...    ...             ...       ...                        ...             ...           ...                   ...             ...           ...              ...
941  USGS       10109000  1091            2024-06-07 09:35:00  MDT    Yes             EHQM      USGS                       4.67            1060.0        Good                  0.00            0.6           Clear            NONE
942  USGS       10109000  1092            2024-07-02 08:35:04  MDT    Yes             EHQM      USGS                       3.41            379.0         Good                  0.00            0.3           Clear            NONE
943  USGS       10109000  1093            2024-08-07 13:52:00  MDT    Yes             EHQM      USGS                       2.90            194.0         Good                  0.00            0.3           Clear            NONE
944  USGS       10109000  1094            2024-09-11 13:34:04  MDT    Yes             EHQM      USGS                       2.67            119.0         Fair                  0.00            0.5           Clear            NONE
945  USGS       10109000  1095            2024-09-11 14:19:00  MDT    Yes             EHQM      USGS                       2.67            119.0         Good                  0.02            0.3           Clear            NONE

946 rows × 15 columns

Show the data types of the columns in the resulting data frame.

[5]:
print(measurements1[0].dtypes)
agency_cd                     object
site_no                       object
measurement_nu                object
measurement_dt                object
tz_cd                         object
q_meas_used_fg                object
party_nm                      object
site_visit_coll_agency_cd     object
gage_height_va               float64
discharge_va                 float64
measured_rating_diff          object
gage_va_change               float64
gage_va_time                 float64
control_type_cd               object
discharge_cd                  object
dtype: object
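
The dtypes listing shows that measurement_dt is stored as text (object) rather than as a datetime. A minimal sketch of converting it with pandas follows; the small data frame is built by hand from two rows of the table above so the example runs without a network connection, and it keeps only a few of the columns.

```python
import pandas as pd

# Two rows copied from the table above; only a few columns are kept.
sample = pd.DataFrame(
    {
        "site_no": ["10109000", "10109000"],
        "measurement_dt": ["1951-09-06 09:40:00", "2024-06-07 09:35:00"],
        "tz_cd": ["MST", "MDT"],
        "discharge_va": [17.0, 1060.0],
    }
)

# Parse the text timestamps. The result is timezone-naive: the local
# time zone of each measurement is still recorded separately in tz_cd.
sample["measurement_dt"] = pd.to_datetime(sample["measurement_dt"])

print(sample.dtypes)
print(sample["measurement_dt"].dt.year.tolist())
```

Once the column holds datetimes, date-based filtering, sorting, and resampling become available. The site_no column is best left as text, since site codes can begin with a zero.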

The other part of the result returned from the get_discharge_measurements() function is a metadata object that contains information about the query that was executed to return the data. For example, you can access the URL that was assembled to retrieve the requested data from the USGS web service. The USGS web service responses contain a descriptive header that describes the contents of the response and can help you interpret them.

[6]:
print("The query URL used to retrieve the data from NWIS was: " + measurements1[1].url)
The query URL used to retrieve the data from NWIS was: https://nwis.waterdata.usgs.gov/nwis/measurements?site_no=10109000&format=rdb
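
To see which query parameters were sent, the URL printed above can be split apart with the Python standard library. This sketch pastes the URL in as a string so that it runs without calling the web service; in the notebook, the value of measurements1[1].url could be used in its place.

```python
from urllib.parse import parse_qs, urlsplit

# The query URL printed in the previous cell.
url = "https://nwis.waterdata.usgs.gov/nwis/measurements?site_no=10109000&format=rdb"

parts = urlsplit(url)
# parse_qs returns a dict that maps each parameter name to a list of values.
params = parse_qs(parts.query)

print(parts.netloc, parts.path)
print(params)
```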

Additional Examples

Example 2: Get all of the surface water measurements between a start and end date

[7]:
measurements2 = nwis.get_discharge_measurements(sites="10109000", start="2019-01-01", end="2019-12-31")
print("Retrieved " + str(len(measurements2[0])) + " data values.")
display(measurements2[0])
Retrieved 9 data values.
     agency_cd  site_no   measurement_nu  measurement_dt       tz_cd  q_meas_used_fg  party_nm  site_visit_coll_agency_cd  gage_height_va  discharge_va  measured_rating_diff  gage_va_change  gage_va_time  control_type_cd  discharge_cd
0    USGS       10109000  1034            2019-01-29 12:57:30  MST    Yes             MJF       USGS                       2.43            83.8          Good                  -0.04           0.55          Clear            NONE
1    USGS       10109000  1035            2019-03-11 13:29:00  MDT    Yes             MJF       USGS                       2.46            94.2          Good                  0.00            0.55          Clear            NONE
2    USGS       10109000  1036            2019-04-23 15:39:06  MDT    Yes             MJF       USGS                       3.29            337.0         Good                  0.00            0.33          Clear            NONE
3    USGS       10109000  1037            2019-06-12 08:02:06  MDT    Yes             MJF       USGS                       4.09            709.0         Good                  0.00            1.02          Clear            NONE
4    USGS       10109000  1038            2019-07-30 11:57:07  MDT    Yes             MJF/BTR   USGS                       2.84            167.0         Good                  0.00            0.67          VegetationLight  NONE
5    USGS       10109000  1039            2019-07-30 12:06:00  MDT    Yes             MJF/BTR   USGS                       2.84            162.0         Good                  0.00            0.83          VegetationLight  NONE
6    USGS       10109000  1040            2019-09-16 10:59:09  MDT    Yes             MJF       USGS                       2.64            126.0         Good                  0.03            0.75          VegetationLight  NONE
7    USGS       10109000  1041            2019-10-28 13:06:02  MDT    Yes             MJF/NML   USGS                       2.64            133.0         Good                  0.00            0.65          Clear            NONE
8    USGS       10109000  1042            2019-12-03 13:53:15  MST    Yes             NML       USGS                       2.60            122.0         Good                  0.02            0.60          Clear            NONE
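
With a year of measurements in hand, a common next step is to find the largest measured discharge or to narrow the data to part of the year. The sketch below rebuilds three columns of the 2019 table above by hand so that it runs offline; in the notebook, the same operations could be applied to measurements2[0] directly.

```python
import pandas as pd

# Three columns of the 2019 measurements, copied from the table above.
df = pd.DataFrame(
    {
        "measurement_dt": [
            "2019-01-29 12:57:30", "2019-03-11 13:29:00", "2019-04-23 15:39:06",
            "2019-06-12 08:02:06", "2019-07-30 11:57:07", "2019-07-30 12:06:00",
            "2019-09-16 10:59:09", "2019-10-28 13:06:02", "2019-12-03 13:53:15",
        ],
        "gage_height_va": [2.43, 2.46, 3.29, 4.09, 2.84, 2.84, 2.64, 2.64, 2.60],
        "discharge_va": [83.8, 94.2, 337.0, 709.0, 167.0, 162.0, 126.0, 133.0, 122.0],
    }
)
df["measurement_dt"] = pd.to_datetime(df["measurement_dt"])

# Row holding the largest measured discharge of the year.
peak = df.loc[df["discharge_va"].idxmax()]
print(peak["measurement_dt"].date(), peak["discharge_va"])

# Measurements made from June through August.
summer = df[df["measurement_dt"].between("2019-06-01", "2019-08-31")]
print(len(summer), "summer measurements")
```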

Example 3: Get all of the surface water measurements for multiple sites

[8]:
measurements3 = nwis.get_discharge_measurements(sites=["01594440", "040851325"])
print("Retrieved " + str(len(measurements3[0])) + " data values.")
display(measurements3[0])
Retrieved 482 data values.
     agency_cd  site_no    measurement_nu  measurement_dt       tz_cd  q_meas_used_fg  party_nm  site_visit_coll_agency_cd  gage_height_va  discharge_va  measured_rating_diff  gage_va_change  gage_va_time  control_type_cd     discharge_cd
0    USGS       01594440   1               1955-04-07 14:05:00  EST    Yes             JMD       USGS                       3.11            152.00        Good                  0.00            1.5           Clear               NONE
1    USGS       01594440   2               1955-05-04 16:05:00  EST    Yes             DGB       USGS                       2.76            127.00        Fair                  -0.01           1.0           Clear               NONE
2    USGS       01594440   3               1955-06-10 08:35:00  EST    Yes             DGB       USGS                       4.26            310.00        Good                  -0.14           1.9           Clear               NONE
3    USGS       01594440   4               1955-07-21 12:15:00  EST    Yes             JMD       USGS                       1.83            46.60         Good                  0.00            0.8           Clear               NONE
4    USGS       01594440   5               1955-09-08 17:15:00  EST    Yes             AGT       USGS                       3.25            175.00        Poor                  0.00            1.3           Clear               NONE
...  ...        ...        ...             ...                  ...    ...             ...       ...                        ...             ...           ...                   ...             ...           ...                 ...
477  USGS       040851325  93              2014-05-20 11:42:00  CDT    Yes             PCR       USGS                       2.78            34.50         Unspecified           -0.03           0.6           NaN                 NONE
478  USGS       040851325  94              2014-06-25 16:27:00  CDT    Yes             DLO,ENC   USGS                       2.26            11.40         Good                  0.01            0.4           Clear               NONE
479  USGS       040851325  95              2014-08-13 16:23:00  CDT    Yes             DLO       USGS                       1.91            0.82          Fair                  0.00            0.4           VegetationModerate  NONE
480  USGS       040851325  96              2014-09-24 16:26:00  CDT    Yes             DLO       USGS                       2.23            8.90          Good                  0.02            0.4           VegetationModerate  NONE
481  USGS       040851325  97              2016-07-20 08:25:00  CDT    Yes             AJD       USGS                       NaN             0.68          Fair                  NaN             0.5           NaN                 NONE

482 rows × 15 columns
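
When several sites are returned in one data frame, the site_no column identifies which site each row belongs to. The sketch below summarizes measurements per site with a pandas groupby. It uses five rows copied by hand from the table above so that it runs offline, so the counts it prints describe this sample only, not the full 482-row result.

```python
import pandas as pd

# Five rows copied from the table above (three columns only).
# Site codes are kept as text so the leading zeros are preserved.
sample = pd.DataFrame(
    {
        "site_no": ["01594440", "01594440", "01594440", "040851325", "040851325"],
        "measurement_dt": [
            "1955-04-07 14:05:00", "1955-05-04 16:05:00", "1955-06-10 08:35:00",
            "2014-09-24 16:26:00", "2016-07-20 08:25:00",
        ],
        "discharge_va": [152.00, 127.00, 310.00, 8.90, 0.68],
    }
)
sample["measurement_dt"] = pd.to_datetime(sample["measurement_dt"])

# One summary row per site: number of measurements and the date range covered.
summary = sample.groupby("site_no").agg(
    n_measurements=("discharge_va", "size"),
    first_measurement=("measurement_dt", "min"),
    last_measurement=("measurement_dt", "max"),
)
print(summary)
```

In the notebook, the same groupby could be applied to measurements3[0] after converting measurement_dt to datetimes.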