Analysis of Streamflow N-Day Averages for a Single Gage

Demonstration of use of hyswap python package for analyzing different n-day average flows for a single streamgage.

This example notebook relies on use of the dataretrieval package for downloading streamflow information from USGS NWIS.

What do we mean by n-day average streamflow?

N-day averages are calculated on a rolling basis for each day in the dataset by averaging streamflow across the focal day streamflow value and the daily streamflow from the preceding (n-1) days. For example, if we wanted to calculate 7-day average streamflow for an 8-day long set of streamflow measurements, all averages for days 1 through 6 would be NaNs (not enough daily values preceding days 1-6 to calculate a mean value), but day 7 and day 8 would have mean streamflow values. Following this calculation, every non-NaN day in the streamflow dataset represents an n-day average. These averages are then used to estimate and plot percentiles. See the hyswap.rolling_average documentation for more information.

[1]:
import dataretrieval as nwis
import hyswap
import matplotlib.pyplot as plt
import numpy as np
import warnings
warnings.filterwarnings('ignore') # ignore warnings from dataretrieval

Download streamflow data from USGS NWIS for an example site

For demonstration purposes, use gage 04286000 - WINOOSKI RIVER AT MONTPELIER, VT

Users can identify streamflow gage locations and site ID numbers through the USGS National Water Dashboard

[2]:
StaID = '04286000'
flow_data = nwis.get_record(sites=StaID, parameterCd='00060', start='1900-01-01', service='dv')
station_name = nwis.get_record(sites=StaID, service='site').loc[0, 'station_nm']
[3]:
if '00060_Mean' in flow_data.columns:
    # set preliminary, non-valid observations (-999999) to NaN
    flow_data['00060_Mean'] = flow_data['00060_Mean'].replace(-999999, np.nan)

    # create a filtered version of data of only USGS approved data (quality-assured data that may be published and used in statistical analsysis)
    approved_flow_data = hyswap.utils.filter_approved_data(flow_data, '00060_Mean_cd')
else:
    print('No standard discharge data column found for site ' + StaID + ', suggest using a different site')
[4]:
# summary statistics for the approved data (quality-assured data that may be published and used in statistical analsysis)
summary_stats = hyswap.calculate_summary_statistics(approved_flow_data)
[5]:
#| tbl-cap: Example table of summary statistics
summary_stats
[5]:
Summary Statistics
Site number 04286000
Begin date 1914-07-01
End date 2023-12-05
Count 38197
Minimum 17.0
Mean 623.35
Median 354.0
Maximum 17900.0

Check data and plot simple hydrograph of 7-day average flows

[6]:
#| fig-cap: Sample HySwap hydrograph plot for 7-Day Average Streamflow
# calculate n-Day average streamflow
window = '7D'
flow_data_nday = hyswap.utils.rolling_average(flow_data, '00060_Mean', window)
plot_start = "2021-10-01"
plot_end = "2023-09-30"
# make plot
fig, ax = plt.subplots(figsize=(10,4))
ax = hyswap.plot_hydrograph(flow_data_nday,
                       data_column_name = "00060_Mean",
                       start_date=plot_start,
                       end_date=plot_end,
                       title=f'7-Day Average Streamflow Hydrograph for {StaID} - {station_name}',
                       ylab="7-Day Average Streamflow (cfs)",
                       ax = ax)
plt.show()
../_images/examples_single_gage_n_day_flows_analysis_9_0.png

Look at long-term record of n-day streamflow as raster hydrograph

[7]:
# try a raster hydrograph to see historical n-day flows
# format the data from the last 50 years of records (1972-2023)
df_formatted = hyswap.rasterhydrograph.format_data(flow_data,
                                                   '00060_Mean',
                                                   year_type="calendar",
                                                   begin_year=1973,
                                                   end_year=2023,
                                                   window_width = "7-day")
[8]:
#| fig-cap: Sample raster hydrograph showing past 50 years of 7-day average streamflow

# make plot
fig = plt.figure(figsize=(8,12))
ax = fig.add_subplot()
ax = hyswap.plots.plot_raster_hydrograph(
    df_formatted,
    ax=ax,
    title=f"7-Day Average Raster Hydrograph for {StaID} - {station_name}",
    cbarlab='7-Day Average Streamflow, cubic feet per second')
plt.show()
../_images/examples_single_gage_n_day_flows_analysis_12_0.png

Calculate streamflow percentiles for n-day flows

This example focuses on how the year 2023 compares to other years on record for this site. First, we will view how daily streamflow in 2023 compares to other years, and then compare 7-day and 28-day rolling means.

[9]:
# get year/doy information
df_indexed = hyswap.utils.define_year_doy_columns(flow_data,
                                               year_type='water',
                                               clip_leap_day=True)
year_to_plot = 2023
# filter down to data from year to plot
df_year = df_indexed[df_indexed['index_year'] == year_to_plot].copy()
# calculate variable percentile thresholds (using only approved data)
percentiles_by_day = hyswap.percentiles.calculate_variable_percentile_thresholds_by_day(
    approved_flow_data, "00060_Mean")
[10]:
#| fig-cap: Example streamflow duration hydrograph with daily streamflow for the 2023 Water Year
# plot daily percentiles
fig, ax = plt.subplots(figsize=(12, 8))
ax = hyswap.plots.plot_duration_hydrograph(
    percentiles_by_day,
    df_year,
    "00060_Mean",
    ax=ax,
    data_label=f"Water Year {year_to_plot}",
    title=f"Streamflow Duration Hydrograph for {StaID} - {station_name}",
    xlab="",
    ylab="Daily Streamflow (cfs)"
)
plt.show()
../_images/examples_single_gage_n_day_flows_analysis_15_0.png
[11]:
#| fig-cap: Example streamflow duration hydrograph with 7-day average streamflow for the 2023 Water Year
percentiles_by_nday = hyswap.percentiles.calculate_variable_percentile_thresholds_by_day(
    approved_flow_data, "00060_Mean", window_width='7-day')
window = '7D'
# plot 7-day avg percentiles
fig, ax = plt.subplots(figsize=(12, 8))
ax = hyswap.plots.plot_duration_hydrograph(
    percentiles_by_nday,
    hyswap.utils.rolling_average(df_year, "00060_Mean", window),
    "00060_Mean",
    ax=ax,
    data_label=f"Water Year {year_to_plot}",
    title=f"7-Day Avg. Streamflow Duration Hydrograph for {StaID} - {station_name}",
    xlab="",
    ylab="7-Day Avg. Streamflow (cfs)"
)
plt.show()
../_images/examples_single_gage_n_day_flows_analysis_16_0.png
[12]:
#| fig-cap: Example streamflow duration hydrograph with 28-day average streamflow for the 2023 Water Year
percentiles_by_nday = hyswap.percentiles.calculate_variable_percentile_thresholds_by_day(
    approved_flow_data, "00060_Mean", window_width='28-day')
window = '28D'
# plot 28-day average percentiles
fig, ax = plt.subplots(figsize=(12, 8))
ax = hyswap.plots.plot_duration_hydrograph(
    percentiles_by_nday,
    hyswap.utils.rolling_average(df_year, "00060_Mean", window),
    "00060_Mean",
    ax=ax,
    data_label=f"Water Year {year_to_plot}",
    title=f"28-Day Avg. Streamflow Duration Hydrograph for {StaID} - {station_name}",
    xlab="",
    ylab="28-Day Avg. Streamflow (cfs)"
)
plt.show()
../_images/examples_single_gage_n_day_flows_analysis_17_0.png