Analysis of Streamflow Conditions and Historical Streamflows at a Single Gage
This notebook provides a demonstration of the use of hyswap python package for analyzing streamflow conditions at a single streamgage including plotting of flow duration curves, duration hydrographs, and cumulative streamflow.
This example notebook relies on use of the dataretrieval package for downloading streamflow information from USGS NWIS
[1]:
from dataretrieval import nwis
import hyswap
import matplotlib.pyplot as plt
import numpy as np
import warnings
warnings.filterwarnings('ignore') # ignore warnings from dataretrieval
Download streamflow data from USGS NWIS for an example site
For demonstration purposes, retrieving data for gage 04288000 - MAD RIVER AT MORETOWN, VT.
Users can identify streamflow gage locations and site ID numbers through the USGS National Water Dashboard
[2]:
StaID = '04288000'
flow_data = nwis.get_record(sites=StaID, parameterCd='00060', start='1900-01-01', service='dv')
station_name = nwis.get_record(sites=StaID, service='site').loc[0, 'station_nm']
[3]:
if '00060_Mean' in flow_data.columns:
# set preliminary, non-valid observations (-999999) to NaN
flow_data['00060_Mean'] = flow_data['00060_Mean'].replace(-999999, np.nan)
# create a filtered version of data of only USGS approved data (quality-assured data that may be published and used in statistical analsysis)
approved_flow_data = hyswap.utils.filter_approved_data(flow_data, '00060_Mean_cd')
else:
print('No standard discharge data column found for site ' + StaID + ', suggest using a different site')
[4]:
# view a sample of data records
flow_data.head()
[4]:
site_no | 00060_Mean | 00060_Mean_cd | |
---|---|---|---|
datetime | |||
1928-10-01 00:00:00+00:00 | 04288000 | 70.0 | A |
1928-10-02 00:00:00+00:00 | 04288000 | 70.0 | A |
1928-10-03 00:00:00+00:00 | 04288000 | 70.0 | A |
1928-10-04 00:00:00+00:00 | 04288000 | 70.0 | A |
1928-10-05 00:00:00+00:00 | 04288000 | 65.0 | A |
Perform simple data check and exploration by plotting hydrographs of select water years
[5]:
#| fig-cap: Sample HySwap hydrograph plot
plot_start = "2010-10-01"
plot_end = "2011-09-30"
fig, ax = plt.subplots(figsize=(9,4))
ax = hyswap.plot_hydrograph(flow_data,
data_column_name = "00060_Mean",
start_date = plot_start,
end_date = plot_end,
title = f"Hydrograph for {StaID} - {station_name}",
ax = ax)
plt.show()
Next, try changing the plot to not have log scale on the y-axis
[6]:
#| fig-cap: Sample HySwap hydrograph plot
fig, ax = plt.subplots(figsize=(9,4))
hyswap.plot_hydrograph(flow_data,
data_column_name = "00060_Mean",
start_date=plot_start,
end_date=plot_end,
title=f"Hydrograph for {StaID} - {station_name}",
yscale = 'linear',
ax = ax)
plt.show()
Now test doing some plot customizations including axis labels, colors, plot style
[7]:
#| fig-cap: Sample HySwap hydrograph plot with customized appearance
plot_start = "2021-10-01"
fig, ax = plt.subplots(figsize=(12,4))
ax = hyswap.plot_hydrograph(flow_data,
data_column_name = "00060_Mean",
start_date=plot_start,
title=f"Hydrograph for {StaID} - {station_name}",
yscale = 'linear',
ylab = 'Streamflow (cfs)',
xlab = '',
color = '#360919',
ax = ax)
ax.grid(which = "major", axis="y", lw = 1.5)
ax.grid(which = "minor", axis="y", linestyle = "dashed")
ax.minorticks_on()
plt.show()
Analyze the long-term record of streamflow
Review summary statistics and flow records using raster hydrographs
[8]:
# calculate summary statistics on only approved data (quality-assured data that may be published and used in statistical analsysis)
summary_stats = hyswap.calculate_summary_statistics(approved_flow_data)
[9]:
#| tbl-cap: Example table of summary statistics
summary_stats
[9]:
Summary Statistics | |
---|---|
Site number | 04288000 |
Begin date | 1928-10-01 |
End date | 2024-05-21 |
Count | 34932 |
Minimum | 2.9 |
Mean | 275.17 |
Median | 150.0 |
Maximum | 6840.0 |
To view long-term patterns and trends in streamflow, raster hydrographs can be plotted using hyswap
. Here, let’s utilize a raster hydrograph to see the historical flow at the site from the past 50 years (1973-2023)
[10]:
# format the data
df_formatted = hyswap.rasterhydrograph.format_data(flow_data, '00060_Mean', year_type="calendar", begin_year=1973, end_year=2023)
[11]:
#| fig-cap: Sample raster hydrograph showing past 50 years of streamflow
# make plot
fig, ax = plt.subplots(figsize=(8,10))
ax = hyswap.plots.plot_raster_hydrograph(
df_formatted,
ax=ax,
title=f"Raster Hydrograph for {StaID} - {station_name}")
plt.show()
Next, try customizing the raster hydrograph by changing to water years, using a different colormap, removing the leap day, and showing all years that have data.
[12]:
# re-format data
df_formatted = hyswap.rasterhydrograph.format_data(flow_data,
'00060_Mean',
year_type="water",
clip_leap_day=True)
[13]:
#| fig-cap: Sample raster hydrograph showing all daily streamflow records from a site
# make plot
fig = plt.figure(figsize=(8,18))
ax = fig.add_subplot()
ax = hyswap.plots.plot_raster_hydrograph(
df_formatted,
ax=ax,
title=f"Raster Hydrograph for {StaID} - {station_name}",
cmap='gist_ncar')
plt.show()
Analyze annual streamflow through cumulative flow and flow duration curves
Use calculations of streamflow percentiles (variable by day) to analyze annual patterns in streamflow
Let’s look at the last four years (2020-2023) of cumulative streamflow by plotting cumulative hydrographs of different water years along with the envelope that 90% of annual cumulative streamflow is expected to be within, and the years with the minimum and maximum annual cumulative streamflows.
[14]:
#| fig-cap: Sample cumulative hydrograph showing multiple water years of cumulative streamflow
years_to_plot = [2020, 2021, 2022, 2023]
# plot the cumulative streamflow hydrograph
fig, ax = plt.subplots(figsize=(10, 5))
ax = hyswap.plots.plot_cumulative_hydrograph(
flow_data,
data_column_name='00060_Mean',
target_years=years_to_plot,
ax=ax,
title=f'Cumulative Streamflow Hydrograph for {StaID} - {station_name}',
envelope_pct = [5,95],
max_year=True,
min_year=True,
year_type = 'water',
alpha=0.2)
plt.show()
Next, look at the flow duration curve for a site
[15]:
# create data for flow duration curve
# generate 1,000 evenly spaced values between the min and max flow to create a smooth line
values = np.linspace(flow_data['00060_Mean'].min(), flow_data['00060_Mean'].max(), 1000)
# calculate exceedance probabilities from the daily mean streamflow values using only approved data
exceedance_probabilities = hyswap.exceedance.calculate_exceedance_probability_from_values_multiple(
values, approved_flow_data['00060_Mean'])
[16]:
#| fig-cap: Example flow duration curve for a single gage
# plot the flow duration curve
fig, ax = plt.subplots(figsize=(8, 6))
ax = hyswap.plots.plot_flow_duration_curve(
values,
exceedance_probabilities,
ax=ax,
title=f'Flow Duration Curve for {StaID} - {station_name}')
# show the plot
plt.tight_layout()
plt.show()
Next, look at the seasonality and variability of streamflow at a single site by plotting a streamflow duration hydrograph.
[17]:
# create data for streamflow duration hydrograph from approved flow data
percentiles_by_day = hyswap.percentiles.calculate_variable_percentile_thresholds_by_day(approved_flow_data,
"00060_Mean")
[18]:
#| tbl-cap: Calculated streamflow percentile thresholds for a streamgage
# View a sample of this data
display(percentiles_by_day.head())
min | p05 | p10 | p25 | p50 | p75 | p90 | p95 | max | mean | count | start_yr | end_yr | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
month_day | |||||||||||||
01-01 | 42.0 | 55.85 | 70.0 | 108.0 | 155.0 | 290.0 | 458.6 | 806.6 | 1780.0 | 246.36 | 96 | 1929 | 2024 |
01-02 | 45.0 | 53.7 | 70.0 | 105.0 | 152.5 | 289.75 | 480.8 | 699.35 | 1500.0 | 235.46 | 96 | 1929 | 2024 |
01-03 | 45.0 | 53.55 | 77.1 | 110.0 | 155.0 | 275.75 | 426.0 | 524.6 | 800.0 | 213.13 | 96 | 1929 | 2024 |
01-04 | 40.0 | 56.0 | 77.7 | 101.25 | 160.0 | 282.5 | 394.4 | 537.0 | 680.0 | 206.6 | 96 | 1929 | 2024 |
01-05 | 36.0 | 58.7 | 75.4 | 103.5 | 155.0 | 246.25 | 360.0 | 501.95 | 1470.0 | 211.66 | 96 | 1929 | 2024 |
[19]:
#| fig-cap: Example Streamflow Duration Hydrograph for a Streamgage
# plotting percentiles by day with line shade between
fig, ax = plt.subplots(figsize=(10, 6))
water_year_to_plot = 2022
# get year/doy information
df_indexed = hyswap.utils.define_year_doy_columns(flow_data,
year_type='water',
clip_leap_day=True)
# filter down to data from selected year
df_year = df_indexed[df_indexed['index_year'] == water_year_to_plot].copy()
# plot data
ax = hyswap.plots.plot_duration_hydrograph(
percentiles_by_day,
df_year,
"00060_Mean",
ax=ax,
data_label=f"Water Year {water_year_to_plot}",
title=f"Streamflow Duration Hydrograph for {StaID} - {station_name}"
)
plt.tight_layout()
plt.show()