USGS dataretrieval Python Package what_sites()
Examples
This notebook provides examples of using the Python dataretrieval package to search NWIS for sites within a region with specific data. The dataretrieval package provides a collection of functions to get data from the USGS National Water Information System (NWIS) and other online sources of hydrology and water quality data, including the United States Environmental Protection Agency (USEPA).
Install the Package
Use the following code to install the package if it is not already installed in your Jupyter Python environment.
[1]:
!pip install dataretrieval
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: dataretrieval in /home/runner/.local/lib/python3.10/site-packages (0.1.dev1+g3ba0c83)
Requirement already satisfied: requests in /home/runner/.local/lib/python3.10/site-packages (from dataretrieval) (2.32.3)
Requirement already satisfied: pandas==2.* in /home/runner/.local/lib/python3.10/site-packages (from dataretrieval) (2.2.3)
Requirement already satisfied: numpy>=1.22.4 in /home/runner/.local/lib/python3.10/site-packages (from pandas==2.*->dataretrieval) (2.1.2)
Requirement already satisfied: python-dateutil>=2.8.2 in /home/runner/.local/lib/python3.10/site-packages (from pandas==2.*->dataretrieval) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /usr/lib/python3/dist-packages (from pandas==2.*->dataretrieval) (2022.1)
Requirement already satisfied: tzdata>=2022.7 in /home/runner/.local/lib/python3.10/site-packages (from pandas==2.*->dataretrieval) (2024.2)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/runner/.local/lib/python3.10/site-packages (from requests->dataretrieval) (3.4.0)
Requirement already satisfied: idna<4,>=2.5 in /usr/lib/python3/dist-packages (from requests->dataretrieval) (3.3)
Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/lib/python3/dist-packages (from requests->dataretrieval) (1.26.5)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests->dataretrieval) (2020.6.20)
Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas==2.*->dataretrieval) (1.16.0)
Load the package so you can use it along with other packages used in this notebook.
[2]:
from dataretrieval import nwis
from IPython.display import display
Basic Usage
The dataretrieval package has several functions that allow you to retrieve data from different web services. This example uses the what_sites() function to search NWIS for sites within a region with specific data. The function accepts several arguments, depending on the result you want to retrieve.
Note: You must specify one major argument.
Major Arguments (Additional arguments, if supplied, will be used as query parameters)
sites (string or list): A list of site numbers. Sites may be prefixed with an optional agency code followed by a colon.
stateCd (string): U.S. postal service (two-letter) state code, for example OH. Only 1 state can be specified per request.
huc (string or list): A list of hydrologic unit codes (HUC) or aggregated watersheds. Only 1 major HUC can be specified per request, or up to 10 minor HUCs. A major HUC has two digits.
bBox (list): A contiguous range of decimal latitude and longitude, starting with the west longitude, then the south latitude, then the east longitude, and then the north latitude, with each value separated by a comma. The product of the latitude range and the longitude range cannot exceed 25 degrees. Whole or decimal degrees must be specified, up to six digits of precision. Minutes and seconds are not allowed.
countyCd (string or list): A list of county numbers, in a 5 digit numeric format. The first two digits of a county’s code are the FIPS State Code. (url: https://help.waterdata.usgs.gov/code/county_query?fmt=html)
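The bBox constraints described above (west, south, east, north ordering and the 25-degree limit on the product of the two ranges) can be checked before a request is sent. The following is a minimal sketch, not part of dataretrieval: the helper name make_bbox and the sample coordinates are hypothetical and chosen only for illustration.

```python
def make_bbox(west, south, east, north, max_area=25.0):
    """Return [west, south, east, north] after basic sanity checks.

    Hypothetical helper; max_area reflects the 25-degree limit
    stated in the bBox description above.
    """
    if not (-180.0 <= west < east <= 180.0):
        raise ValueError("west must be less than east, both within [-180, 180]")
    if not (-90.0 <= south < north <= 90.0):
        raise ValueError("south must be less than north, both within [-90, 90]")
    # Product of the longitude range and the latitude range.
    area = (east - west) * (north - south)
    if area > max_area:
        raise ValueError(
            f"bounding box covers {area:.2f} square degrees; limit is {max_area}"
        )
    return [west, south, east, north]


# Arbitrary one-degree box, used only as an example.
bbox = make_bbox(-83.0, 39.5, -82.0, 40.5)
print(bbox)
```

A list built this way could then be supplied as the bBox argument to what_sites(), together with any minor arguments.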
Minor Arguments
startDt (string): Selects sites based on whether data was collected at a point in time beginning after startDt (start date). Dates must be in ISO-8601 Calendar Date format (for example: 1990-01-01).
endDt (string)
period (string): Selects sites based on whether or not they were active between now and a time in the past. For example, period=P10W will select sites active in the last ten weeks.
modifiedSince (string): Returns only sites where site attributes or period of record data have changed during the request period.
parameterCd (string or list): Returns only site data for those sites containing the requested USGS parameter codes.
siteType (string or list): Restricts sites to those having one or more major and/or minor site types, such as stream, spring or well. For a list of all valid site types see https://help.waterdata.usgs.gov/site_tp_cd. For example, siteType=’ST’ returns streams only.
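The date and period arguments above expect ISO-8601 strings such as 1990-01-01 and P10W. As a small sketch, the standard library can produce both forms. The helper names iso_date and weeks_period are hypothetical and are not part of dataretrieval.

```python
from datetime import date


def iso_date(d):
    """Format a date as an ISO-8601 calendar date, e.g. 1990-01-01."""
    return d.isoformat()


def weeks_period(weeks):
    """Build an ISO-8601 duration in weeks, e.g. P10W for ten weeks."""
    if weeks <= 0:
        raise ValueError("weeks must be a positive integer")
    return f"P{int(weeks)}W"


start = iso_date(date(1990, 1, 1))
last_ten_weeks = weeks_period(10)
print(start)
print(last_ten_weeks)
```

The resulting strings are the kind of values described for the start date and period arguments.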
Formatting Parameters
siteOutput (string ‘basic’ or ‘expanded’): Indicates the richness of metadata you want for site attributes. Note that for visually oriented formats like Google Map format, this argument has no meaning. Note: for performance reasons, siteOutput=’expanded’ cannot be used if seriesCatalogOutput=true or with any values for outputDataTypeCd.
seriesCatalogOutput (boolean): A switch that provides detailed period of record information for certain output formats. The period of record indicates date ranges for a certain kind of information about a site, for example the start and end dates for a site’s daily mean streamflow.
For additional parameter options see https://waterservices.usgs.gov/docs/site-service/site-service-details
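The note on siteOutput says that 'expanded' cannot be combined with seriesCatalogOutput=true or with any outputDataTypeCd value. A quick client-side check can catch that combination before the request is made. This is only a sketch: check_site_output is a hypothetical helper, not a dataretrieval function.

```python
def check_site_output(**params):
    """Raise ValueError for the incompatible combination noted above.

    Hypothetical helper: returns the parameters unchanged when they
    pass the check.
    """
    expanded = params.get("siteOutput") == "expanded"
    # Accept either the string 'true' or the boolean True.
    catalog = str(params.get("seriesCatalogOutput", "false")).lower() == "true"
    has_data_type = "outputDataTypeCd" in params
    if expanded and (catalog or has_data_type):
        raise ValueError(
            "siteOutput='expanded' cannot be combined with "
            "seriesCatalogOutput=true or outputDataTypeCd"
        )
    return params


ok = check_site_output(sites="05114000", siteOutput="expanded")
print(ok)
```

The checked dictionary could then be unpacked into the call, for example what_sites(**ok).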
Example 1: Retrieve information about sites in Ohio where phosphorus data was collected
[3]:
siteListPhos = nwis.what_sites(stateCd="OH", parameterCd="00665")
Interpreting the Result
The result of calling the what_sites() function is an object that contains a Pandas data frame object and an associated metadata object. The Pandas data frame contains the requested site inventory data.
Once you’ve got the data frame, there are several useful things you can do to explore the data.
[4]:
# Display the data frame as a table
display(siteListPhos[0])
 | agency_cd | site_no | station_nm | site_tp_cd | dec_lat_va | dec_long_va | coord_acy_cd | dec_coord_datum_cd | alt_va | alt_acy_va | alt_datum_cd | huc_cd |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | USGS | 03086500 | Mahoning River at Alliance OH | ST | 40.932836 | -81.094541 | S | NAD83 | 1034.79 | 0.10 | NAVD88 | 5030103.0 |
1 | USGS | 03089500 | Mill Creek near Berlin Center OH | ST | 41.000336 | -80.968424 | S | NAD83 | 1032.90 | 0.01 | NGVD29 | 5030103.0 |
2 | USGS | 03090500 | Mahoning River bl Berlin Dam nr Berlin Center OH | ST | 41.048391 | -81.001203 | S | NAD83 | 957.72 | 0.01 | NAVD88 | 5030103.0 |
3 | USGS | 03091500 | Mahoning River at Pricetown OH | ST | 41.131446 | -80.971202 | S | NAD83 | 904.77 | 0.10 | NAVD88 | 5030103.0 |
4 | USGS | 03092000 | Kale Creek near Pricetown OH | ST | 41.139779 | -80.995092 | S | NAD83 | 914.70 | 0.01 | COE1912 | 5030103.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1275 | USGS | 414144084242500 | WM-103 OH | GW | 41.686439 | -84.406893 | S | NAD83 | 850.00 | 10.00 | NGVD29 | 4100006.0 |
1276 | USGS | 414150084331000 | WM-87-S14 OH | GW | 41.697272 | -84.552728 | S | NAD83 | 895.00 | 10.00 | NGVD29 | 4100003.0 |
1277 | USGS | 414214083151000 | Lake Erie at site WE12 near Toledo OH | LK | NaN | NaN | S | NaN | NaN | NaN | NaN | 4120200.0 |
1278 | USGS | 414233083595500 | Little Bear Creek at HWY 120 nr Seward, OH | ST-DCH | 41.709167 | -83.998611 | S | NAD83 | NaN | NaN | NaN | 4100002.0 |
1279 | USGS | 414937083112500 | Lake Erie at site WE4 near Toledo OH | LK | NaN | NaN | S | NaN | NaN | NaN | NaN | 4120200.0 |
1280 rows × 12 columns
The other part of the result returned from the what_sites() function is a metadata object that contains information about the query that was executed to return the data. For example, you can access the URL that was assembled to retrieve the requested data from the USGS web service. The USGS web service responses also contain a descriptive header that defines the contents of the response and can be helpful in interpreting them.
[5]:
print('The query URL used to retrieve the data from NWIS was: ' + siteListPhos[1].url)
The query URL used to retrieve the data from NWIS was: https://waterservices.usgs.gov/nwis/site?stateCd=OH&parameterCd=00665&format=rdb
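To see how the keyword arguments passed to what_sites() end up in the query URL, the URL can be split into its components with the standard library. The sketch below uses the URL printed by the cell above as a literal string. In a live session the same string would come from the url attribute of the metadata object.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the notebook output above.
url = "https://waterservices.usgs.gov/nwis/site?stateCd=OH&parameterCd=00665&format=rdb"

parts = urlsplit(url)
# parse_qs returns a list per key; each key appears once here,
# so keep only the first value.
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(parts.netloc + parts.path)
print(params)
```

The stateCd and parameterCd entries correspond to the arguments used in Example 1, and format=rdb appears in the URL even though it was not passed explicitly in that call.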
Additional Examples
Example 2: Retrieve site information for a single site
[6]:
oneSite = nwis.what_sites(sites='05114000')
display(oneSite[0])
 | agency_cd | site_no | station_nm | site_tp_cd | dec_lat_va | dec_long_va | coord_acy_cd | dec_coord_datum_cd | alt_va | alt_acy_va | alt_datum_cd | huc_cd |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | NAVD88 | 9010008 |
Example 3: Retrieve site information for a single site and show the result with expanded output
[7]:
oneSite = nwis.what_sites(sites='05114000', siteOutput='expanded')
display(oneSite[0])
 | agency_cd | site_no | station_nm | site_tp_cd | lat_va | long_va | dec_lat_va | dec_long_va | coord_meth_cd | coord_acy_cd | ... | local_time_fg | reliability_cd | gw_file_cd | nat_aqfr_cd | aqfr_cd | aqfr_type_cd | well_depth_va | hole_depth_va | depth_src_cd | project_no |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 485924 | 1015728 | 48.989957 | -101.958335 | M | F | ... | Y | NaN | NNNNNNNN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
1 rows × 42 columns
Example 4: Retrieve site information for sites in Utah with daily values data falling within a specified date range
[8]:
UTsites = nwis.what_sites(stateCd='UT', outputDataTypeCd='dv', startDT='1971-07-01', endDT='2021-07-28')
display(UTsites[0])
 | agency_cd | site_no | station_nm | site_tp_cd | dec_lat_va | dec_long_va | coord_acy_cd | dec_coord_datum_cd | alt_va | alt_acy_va | ... | stat_cd | ts_id | loc_web_ds | medium_grp_cd | parm_grp_cd | srs_id | access_cd | begin_date | end_date | count_nu |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | USGS | 09163675 | COTTONWOOD WASH AT I-70, NEAR CISCO, UTAH | ST | 39.081652 | -109.217615 | F | NAD83 | NaN | NaN | ... | 3 | 142731 | NaN | wat | NaN | 1645423 | 0 | 1983-04-13 | 1986-09-30 | 1267 |
1 | USGS | 09180000 | DOLORES RIVER NEAR CISCO, UT | ST | 38.797208 | -109.195114 | F | NAD83 | 4168.32 | 0.11 | ... | 1 | 241055 | NaN | wat | NaN | 1645597 | 0 | 2006-07-18 | 2024-10-25 | 6603 |
2 | USGS | 09180000 | DOLORES RIVER NEAR CISCO, UT | ST | 38.797208 | -109.195114 | F | NAD83 | 4168.32 | 0.11 | ... | 2 | 241056 | NaN | wat | NaN | 1645597 | 0 | 2006-07-19 | 2024-10-25 | 6586 |
3 | USGS | 09180000 | DOLORES RIVER NEAR CISCO, UT | ST | 38.797208 | -109.195114 | F | NAD83 | 4168.32 | 0.11 | ... | 3 | 142732 | NaN | wat | NaN | 1645597 | 0 | 2006-07-19 | 2024-10-24 | 6579 |
4 | USGS | 09180000 | DOLORES RIVER NEAR CISCO, UT | ST | 38.797208 | -109.195114 | F | NAD83 | 4168.32 | 0.11 | ... | 11 | 142733 | NaN | wat | NaN | 1645597 | 0 | 1949-05-01 | 2004-08-17 | 13002 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1214 | USGS | 414411112543701 | (B-12- 9)30cda- 1 | GW | 41.736312 | -112.911094 | S | NAD83 | 4239.00 | 0.50 | ... | 2 | 144019 | NaN | wat | NaN | 1642008 | 0 | 1975-10-05 | 2024-10-07 | 9114 |
1215 | USGS | 414500112000000 | COM FLOW BEAR AREA GSL INFLOW GROUP 1 | ST | 41.749928 | -112.000782 | F | NAD83 | NaN | NaN | ... | 3 | 144020 | NaN | wat | NaN | 1645423 | 0 | 1960-10-01 | 1982-09-29 | 7303 |
1216 | USGS | 414500112000100 | COM FLOW BEAR AREA GSL INFLOW GROUP 2 | ST | 41.749928 | -112.001059 | F | NAD83 | NaN | NaN | ... | 3 | 144021 | NaN | wat | NaN | 1645423 | 0 | 1960-10-01 | 1980-09-29 | 6207 |
1217 | USGS | 414500112000200 | COM FLOW BEAR AREA GSL INFLOW GROUP 3 | ST | 41.749928 | -112.001337 | F | NAD83 | NaN | NaN | ... | 3 | 144022 | NaN | wat | NaN | 1645423 | 0 | 1960-10-01 | 1980-09-29 | 6573 |
1218 | USGS | 415703112514501 | (B-14- 9) 9add- 1 | GW | 41.960106 | -112.863444 | 1 | NAD83 | 4387.88 | 0.10 | ... | 2 | 144023 | NaN | wat | NaN | 1642008 | 0 | 1981-07-21 | 2024-10-07 | 14872 |
1219 rows × 24 columns
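In the output above, each row carries a begin_date and end_date, and several of the rows shown have a period of record that only partly falls inside the requested window. That is consistent with an overlap test, although this is an inference from the displayed rows rather than documented service behavior. The sketch below illustrates such a test with three (site_no, begin_date, end_date) triples copied from the table. The function name overlaps is hypothetical.

```python
from datetime import date


def overlaps(begin, end, window_start, window_end):
    """True if [begin, end] shares at least one day with the window."""
    return begin <= window_end and end >= window_start


# Requested window from Example 4.
window = (date(1971, 7, 1), date(2021, 7, 28))

# (site_no, begin_date, end_date) copied from rows 0, 4 and 1 above.
records = [
    ("09163675", date(1983, 4, 13), date(1986, 9, 30)),
    ("09180000", date(1949, 5, 1), date(2004, 8, 17)),
    ("09180000", date(2006, 7, 18), date(2024, 10, 25)),
]

matches = [r for r in records if overlaps(r[1], r[2], *window)]
print(len(matches), "of", len(records), "records overlap the window")
```

Note that site 09180000 appears more than once because the table has one row per time series, not one row per site.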
Example 5: Retrieve site information for a single site and show the series catalog output
The series catalog output is a list of the parameters that have been collected at that site.
[9]:
oneSite = nwis.what_sites(sites='05114000', seriesCatalogOutput='true')
display(oneSite[0])
 | agency_cd | site_no | station_nm | site_tp_cd | dec_lat_va | dec_long_va | coord_acy_cd | dec_coord_datum_cd | alt_va | alt_acy_va | ... | stat_cd | ts_id | loc_web_ds | medium_grp_cd | parm_grp_cd | srs_id | access_cd | begin_date | end_date | count_nu |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | NaN | 0 | NaN | wat | NaN | 0 | 0 | 2006 | 2024 | 19 |
1 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | 1.0 | 91355 | NaN | wat | NaN | 1645597 | 0 | 1983-08-11 | 2023-10-09 | 5903 |
2 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | 2.0 | 91356 | NaN | wat | NaN | 1645597 | 0 | 1983-08-11 | 2023-10-09 | 5903 |
3 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | 3.0 | 91357 | NaN | wat | NaN | 1645597 | 0 | 1983-08-11 | 2023-10-09 | 5903 |
4 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | 11.0 | 91358 | NaN | wat | NaN | 1645597 | 0 | 1974-10-17 | 1981-09-02 | 1937 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
402 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | NaN | 92591 | NaN | wat | NaN | 1645423 | 0 | 1994-10-01 | 2024-10-25 | 10982 |
403 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | NaN | 92592 | NaN | wat | NaN | 17164583 | 0 | 2007-10-01 | 2024-10-25 | 6234 |
404 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | NaN | 249682 | NaN | wat | NaN | 1646694 | 0 | 2019-05-15 | 2023-10-10 | 1609 |
405 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | NaN | 249681 | NaN | wat | NaN | 1736457 | 0 | 2019-05-15 | 2023-10-10 | 1609 |
406 | USGS | 05114000 | SOURIS RIVER NEAR SHERWOOD, ND | ST | 48.989957 | -101.958335 | F | NAD83 | 1605.0 | 0.19 | ... | NaN | 317832 | NaN | wat | NaN | 1642503 | 0 | 2024-01-01 | 2024-10-25 | 298 |
407 rows × 24 columns