Visualization of Streamflow Conditions at Streamgages
This notebook demonstrates the use of hyswap for calculating streamflow percentiles and visualizing streamflow conditions at multiple streamgages.
The notebook relies on the dataretrieval package for downloading streamflow information from USGS Water Data for the Nation, as well as the geopandas package for mapping functionality. All other packages used in this notebook are either part of the standard Python library or are dependencies of hyswap.
[1]:
# Run commented lines below to install geopandas and mapping dependencies from within the notebook
#import sys
#!{sys.executable} -m pip install geopandas folium mapclassify
[2]:
from dataretrieval import waterdata
from hyswap import utils, percentiles
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo
from tqdm import tqdm # used for progress bar indicators
import warnings
warnings.filterwarnings('ignore') # ignore warnings from dataretrieval
The waterdata module of dataretrieval works best when the user has signed up for an API key. After signing up, your key is sent to your email address. One way to add your API key to your Python environment is:
import os
os.environ["API_USGS_PAT"] = "your_api_key_here"
NOTE: The tqdm package is used in for-loops in this notebook to show a data download progress bar, which may be informative to the user. The setting below (disable_tqdm) determines whether this progress bar is displayed when the notebook renders. It is set to True when rendering the notebook in the hyswap GitHub documentation site to keep the rendered notebook as clean as possible. To see the progress bars in this notebook, set disable_tqdm=False.
[3]:
disable_tqdm=True
Define Helper Functions
The hyswap package provides functionality for calculating non-interpretive streamflow statistics, but it does not provide functionality for correcting invalid data or geospatial capabilities for mapping. Here we set up some simple helper functions we can reuse throughout the notebook to QAQC data and create maps. We will be using data from dataretrieval, which returns geodataframes if geopandas is installed in your environment. If your dataset does not include geometry information, you will need to specify it manually using geopandas.GeoDataFrame().
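For instance, a plain site table with latitude/longitude columns (hypothetical values below) can be promoted to a GeoDataFrame roughly as follows:

```python
import geopandas as gpd
import pandas as pd

# Hypothetical site table lacking a geometry column
sites = pd.DataFrame({
    "monitoring_location_id": ["USGS-01133000"],
    "longitude": [-71.89759],
    "latitude": [44.63394],
})

# Build point geometries from the coordinate columns (WGS84)
sites_gdf = gpd.GeoDataFrame(
    sites,
    geometry=gpd.points_from_xy(sites["longitude"], sites["latitude"]),
    crs="EPSG:4326",
)
```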
[4]:
def qaqc_usgs_data(df, data_column_name):
    """
    Data QAQC function for USGS data with -999999 reported as a data value.

    Parameters
    ----------
    df : geopandas.GeoDataFrame
        A dataframe downloaded from the Water Data APIs containing streamflow information
    data_column_name : string
        A string defining the name of the column containing the streamflow data to be filtered.

    Returns
    -------
    df
        The input df with all rows where the data value is -999999 replaced with np.nan
    """
    # replace invalid -999999 values with NA
    df[data_column_name] = df[data_column_name].replace(-999999, np.nan)
    # add any additional QAQC steps needed
    return df
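As a quick sanity check of the sentinel replacement this helper performs (toy values, not real gage data):

```python
import numpy as np
import pandas as pd

def qaqc_usgs_data(df, data_column_name):
    # same sentinel replacement as the helper above
    df[data_column_name] = df[data_column_name].replace(-999999, np.nan)
    return df

# The -999999 sentinel becomes NaN; valid values pass through unchanged
demo = qaqc_usgs_data(pd.DataFrame({"value": [12.5, -999999, 48.0]}), "value")
print(demo["value"].isna().sum())  # 1
```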
The mapping function below makes it easy to produce an interactive geopandas map for exploring site conditions, much like the National Water Dashboard. Users can hover over points on the map and see additional metadata about the site. This function not only renders the map, but also formats the metadata depending on the flow type (daily or instantaneous) and the data type (a user-defined title).
[5]:
def create_gage_condition_map(gage_df, flow_data_type, flow_data_col, map_schema, streamflow_data_type):
    """
    Function to create a `gpd.explore()` map of site conditions.

    Parameters
    ----------
    gage_df : geopandas.GeoDataFrame
        A dataframe containing streamflow values and percentiles for all sites to be
        displayed in the map. Dataframe must contain the columns 'time', 'est_pct',
        'monitoring_location_id', and 'monitoring_location_name'
    flow_data_type : string
        One of "instantaneous" or "daily", describing the type of data in `gage_df`
    flow_data_col : string
        The name of the column containing the mapped flow data
    map_schema : string
        One of 'NWD', 'WaterWatch', 'WaterWatch_Drought', 'WaterWatch_Flood',
        'WaterWatch_BrownBlue', and 'NIDIS_Drought'.
    streamflow_data_type : string
        A title to be added to the legend to describe the data displayed

    Returns
    -------
    m : folium.Map
        An interactive map of flow conditions at the sites in `gage_df`
    """
    # Format date and set to str type for use in map tooltips
    if flow_data_type == 'instantaneous':
        gage_df['Date'] = gage_df['time'].dt.strftime('%Y-%m-%d %H:%M')
    elif flow_data_type == 'daily':
        gage_df['Date'] = gage_df['time'].dt.strftime('%Y-%m-%d')
    gage_df = gage_df.drop('time', axis=1)
    # create colormap for map from hyswap schema
    schema = utils.retrieve_schema(map_schema)
    flow_cond_cmap = schema['colors']
    if 'low_color' in schema:
        flow_cond_cmap = [schema['low_color']] + flow_cond_cmap
    if 'high_color' in schema:
        flow_cond_cmap = flow_cond_cmap + [schema['high_color']]
    # if creating a drought map, set handling of non-drought flows
    if map_schema in ['WaterWatch_Drought', 'NIDIS_Drought']:
        gage_df['flow_cat'] = gage_df['flow_cat'].cat.add_categories('Other')
        gage_df.loc[gage_df['flow_cat'].isnull(), 'flow_cat'] = 'Other'
        flow_cond_cmap = flow_cond_cmap + ['#e3e0ca']  # light taupe
    # set NA values to "Not Ranked" category
    gage_df['flow_cat'] = gage_df['flow_cat'].cat.add_categories('Not Ranked')
    gage_df.loc[gage_df['est_pct'].isna(), 'flow_cat'] = 'Not Ranked'
    flow_cond_cmap = flow_cond_cmap + ['#d3d3d3']  # light grey
    # rename columns with user-friendly names for the map
    gage_df = gage_df.rename(columns={flow_data_col: 'Discharge (cfs)',
                                      'est_pct': 'Estimated Percentile',
                                      'monitoring_location_id': 'USGS Gage ID',
                                      'monitoring_location_name': 'Streamgage Name',
                                      'flow_cat': 'Streamflow Category'})
    # Create map
    m = gage_df.set_crs(crs="EPSG:4326").to_crs("EPSG:5070").explore(
        column="Streamflow Category",
        cmap=flow_cond_cmap,
        tooltip=["USGS Gage ID", "Streamgage Name", "Streamflow Category",
                 "Discharge (cfs)", "Estimated Percentile", "Date"],
        tiles="CartoDB Positron",
        marker_kwds=dict(radius=5),
        legend_kwds=dict(caption=streamflow_data_type + '<br> Streamflow Category'))
    return m  # returns a folium map object
Data Downloading and Processing
We use an example state to select streamgages for generating various flow condition maps. Certain past days selected in the notebook are specific to using the state of Vermont as an example, but the notebook can be run for any state. Next, we will find all stream sites within the state that were active in the last week.
[6]:
#| tbl-cap: List of streamgage sites active within the last week
state = 'Vermont'
# Query Water Data APIs for monitoring locations that were active within the last week
active_time_series, _ = waterdata.get_time_series_metadata(
    state_name=state,
    parameter_code='00060',
    end_utc='P1W',
    skip_geometry=True
)
# Figure out which of these monitoring locations are streams
# and grab their latitude/longitude information
active_stream_gages, _ = waterdata.get_monitoring_locations(
    monitoring_location_id=active_time_series['monitoring_location_id'].unique().tolist(),
    site_type_code='ST',
    properties=['monitoring_location_id', 'agency_code', 'monitoring_location_name',
                'county_name', 'drainage_area', 'site_type', 'hydrologic_unit_code',
                'altitude', 'vertical_datum', 'original_horizontal_datum']
)
display(active_stream_gages.head())
| | monitoring_location_id | agency_code | monitoring_location_name | county_name | drainage_area | site_type | hydrologic_unit_code | altitude | vertical_datum | original_horizontal_datum | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | USGS-01133000 | USGS | EAST BRANCH PASSUMPSIC RIVER NEAR EAST HAVEN, VT | Caledonia County | 53.80 | Stream | 010801020102 | 943.34 | NAVD88 | NAD27 | POINT (-71.89759 44.63394) |
| 1 | USGS-01134500 | USGS | MOOSE RIVER AT VICTORY, VT | Essex County | 75.20 | Stream | 010801020202 | 1103.46 | NAVD88 | NAD27 | POINT (-71.83731 44.51172) |
| 2 | USGS-01135100 | USGS | POPE BROOK TRIBUTARY (W-9), NR NORTH DANVILLE, VT | Caledonia County | 0.18 | Stream | 010801020401 | 1720.57 | NAVD88 | NAD27 | POINT (-72.16177 44.49061) |
| 3 | USGS-01135150 | USGS | POPE BROOK (SITE W-3) NEAR NORTH DANVILLE, VT | Caledonia County | 3.25 | Stream | 010801020401 | 1141.03 | NAVD88 | NAD27 | POINT (-72.12454 44.47617) |
| 4 | USGS-01135300 | USGS | SLEEPERS RIVER (SITE W-5) NEAR ST. JOHNSBURY, VT | Caledonia County | 42.90 | Stream | 010801020401 | 641.27 | NAVD88 | NAD27 | POINT (-72.03843 44.43534) |
Retrieve Streamflow Data from the Water Data APIs
Next, for the monitoring locations identified above, we will download all historical daily streamflow data, from the beginning of each gage’s flow record up until present. This may take a few minutes.
[7]:
# create a python dictionary of dataframes by site id number
flow_data = {}
for StaID in tqdm(active_stream_gages['monitoring_location_id'].tolist(), disable=disable_tqdm, desc="Downloading USGS Flow Data for Sites"):
    flow_data[StaID], _ = waterdata.get_daily(
        monitoring_location_id=StaID,
        parameter_code='00060',
        statistic_id='00003',  # mean daily discharge
        skip_geometry=True
    )
Calculate Variable Streamflow Percentile Thresholds
For the sites identified above, we calculate streamflow percentile thresholds at the 1, 5, 10, …, 90, 95, and 99 percentile levels.
Note that when using the default settings of calculate_fixed_percentile_threshold() it is common for NA values to be returned for the highest and lowest percentile thresholds, such as 1 and 99. This is because a very long streamflow record (100+ years) is required to have sufficient observations to calculate the 99th or 1st percentile of streamflow for a given day when using the default settings of method='weibull' with mask_out_of_range=True.
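The reasoning can be sketched with the Weibull plotting position itself: a sample of n observations can only resolve percentiles up to 100·n/(n+1) (and down to 100/(n+1)), so roughly 99 or more years of record for a given day are needed before the 99th (or 1st) percentile is defined:

```python
def weibull_max_resolvable_pct(n_obs):
    """Highest percentile the Weibull plotting position can resolve
    for a sample of size n_obs: 100 * n / (n + 1)."""
    return 100.0 * n_obs / (n_obs + 1)

# With ~65 years of record for a given day, p99 cannot be resolved
# and is masked to NaN; roughly 99+ years are needed.
print(round(weibull_max_resolvable_pct(65), 1))   # 98.5
print(weibull_max_resolvable_pct(99) >= 99.0)     # True
```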
[8]:
# Define what percentile levels (thresholds) that we want to calculate.
# Intervals of 5 or less are recommended to have sufficient levels to interpolate between in later calculations.
# Note that 0 and 100 percentile levels are ignored, refer to min and max values returned instead.
percentile_levels = np.concatenate((np.array([1]), np.arange(5,96,5), np.array([99])), axis=0)
print(percentile_levels)
[ 1 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 99]
[9]:
percentile_values = {}
for StaID in tqdm(active_stream_gages['monitoring_location_id'], disable=disable_tqdm, desc="Processing Sites"):
    if flow_data[StaID].shape[0] > 0:
        # Filter data, as only approved data in the Water Data APIs should be used to calculate statistics
        df = utils.filter_approved_data(flow_data[StaID], 'approval_status')
        percentile_values[StaID] = percentiles.calculate_variable_percentile_thresholds_by_day(
            df,
            data_column_name='value',
            date_column_name='time',
            percentiles=percentile_levels
        )
    else:
        print('No standard discharge data found for site ' + StaID + ', skipping')
No standard discharge data found for site USGS-04282886, skipping
No standard discharge data found for site USGS-04292201, skipping
[10]:
#| tbl-cap: Sample of calculated variable streamflow percentile thresholds for first site in list
display(percentile_values[list(percentile_values.keys())[0]].head())
| month_day | min | p01 | p05 | p10 | p15 | p20 | p25 | p30 | p35 | p40 | ... | p80 | p85 | p90 | p95 | p99 | max | mean | count | start_yr | end_yr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01-01 | 29.0 | NaN | 31.21 | 33.8 | 41.8 | 45.0 | 47.5 | 51.6 | 60.0 | 64.4 | ... | 122.8 | 137.3 | 149.0 | 199.4 | NaN | 378.0 | 88.05 | 65 | 1940 | 2025 |
| 01-02 | 28.0 | NaN | 31.0 | 34.8 | 40.8 | 44.2 | 46.5 | 51.4 | 59.1 | 61.4 | ... | 121.8 | 130.8 | 148.2 | 178.1 | NaN | 350.0 | 86.31 | 65 | 1940 | 2025 |
| 01-03 | 27.0 | NaN | 30.3 | 35.6 | 40.0 | 44.0 | 45.0 | 49.6 | 56.4 | 62.4 | ... | 114.8 | 120.9 | 132.0 | 202.5 | NaN | 300.0 | 83.19 | 65 | 1940 | 2025 |
| 01-04 | 28.0 | NaN | 30.3 | 36.2 | 40.91 | 44.2 | 45.5 | 53.4 | 62.47 | 67.2 | ... | 107.0 | 112.8 | 128.6 | 182.6 | NaN | 220.0 | 79.35 | 65 | 1940 | 2025 |
| 01-05 | 26.0 | NaN | 29.3 | 36.2 | 41.51 | 43.4 | 45.5 | 53.6 | 61.4 | 67.0 | ... | 94.0 | 103.2 | 117.0 | 160.0 | NaN | 425.0 | 80.22 | 65 | 1940 | 2025 |
5 rows × 27 columns
Create a Current Flow Conditions Map for Daily Mean Streamflow
Retrieve most recent (yesterday) daily mean streamflow
Data from yesterday will still be provisional, so they will not have been used to generate historical percentiles. We will pull these values from the flow_data dictionary and make a dataframe for calculating percentiles and mapping.
[11]:
yesterday = datetime.strftime(datetime.now(tz=ZoneInfo("US/Eastern")) - timedelta(1), '%Y-%m-%d')
recent_dvs = pd.DataFrame()
for StaID in active_stream_gages['monitoring_location_id'].tolist():
    df = flow_data[StaID]
    yesterday_row = df[df['time'] == yesterday]
    recent_dvs = pd.concat([recent_dvs, yesterday_row])
recent_dvs = qaqc_usgs_data(recent_dvs, 'value').drop(columns='geometry')
Categorize streamflow based on calculated percentile values
The next step is to calculate an estimated streamflow percentile for the new data by interpolating against the previously calculated percentile threshold levels.
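Conceptually, this works like linear interpolation between the threshold levels for the observation's day of year. The numbers below are illustrative, not hyswap internals:

```python
import numpy as np

# Hypothetical thresholds for one day: percentile level -> flow (cfs)
levels = np.array([5.0, 10.0, 25.0, 50.0, 75.0, 90.0, 95.0])
flows = np.array([31.0, 34.0, 47.5, 72.0, 110.0, 149.0, 199.0])

# A new daily value of 60 cfs sits between the 25th and 50th percentile
# thresholds; linear interpolation estimates its percentile
est_pct = float(np.interp(60.0, flows, levels))
print(round(est_pct, 1))  # 37.8
```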
[12]:
# estimate percentiles
df = pd.DataFrame()
for StaID, site_df in recent_dvs.groupby(by="monitoring_location_id", group_keys=False):
    if StaID in percentile_values:
        if not percentile_values[StaID].isnull().all().all():
            percentile_df = percentiles.calculate_multiple_variable_percentiles_from_values(
                site_df,
                data_column_name='value',
                percentile_df=percentile_values[StaID],
                date_column_name='time')
            df = pd.concat([df, percentile_df])
# categorize streamflow by the estimated streamflow percentiles
df = utils.categorize_flows(df, 'est_pct', schema_name="NWD")
df = df.reset_index(level='time')
# Prep data for mapping by joining site information and flow data
gage_df = pd.merge(active_stream_gages, df, how="right", on="monitoring_location_id")
Map streamflow conditions
Finally, we will use the create_gage_condition_map helper function defined above to display our map of the current daily mean values across Vermont.
[13]:
#| fig-cap: Map showing most recent daily mean streamflow and corresponding flow conditions
map = create_gage_condition_map(
gage_df=gage_df,
flow_data_type='daily',
flow_data_col='value',
map_schema='NWD',
streamflow_data_type='Current Daily Mean'
)
display(map)
Use an alternative categorization schema
There are several built-in color schemas available in hyswap for categorizing flow. They include a schema that matches the National Water Dashboard (“NWD”, used above), and others that match the colors used in USGS WaterWatch (“WaterWatch”), the NIDIS U.S. Drought Monitor (“NIDIS_Drought”), WaterWatch Drought Conditions (“WaterWatch_Drought”), WaterWatch Flood Conditions (“WaterWatch_Flood”), and an alternative WaterWatch palette (“WaterWatch_BrownBlue”). We will test out the latter in the following map.
[14]:
#| fig-cap: Map showing most recent daily mean streamflow and corresponding flow conditions using a brown-blue schema
# Prep Data for mapping by joining site information and flow data
map = create_gage_condition_map(
gage_df=gage_df,
flow_data_type='daily',
flow_data_col='value',
map_schema='WaterWatch_BrownBlue',
streamflow_data_type='Current Daily Mean'
)
display(map)
Create a “Real-Time” Flow Conditions Map for Instantaneous Streamflow
Retrieve most recent instantaneous streamflow records
We will download instantaneous data from the Water Data APIs and calculate the corresponding streamflow percentile for the most recent instantaneous discharge measurement at each site.
[15]:
recent_ivs, _ = waterdata.get_latest_continuous(
    monitoring_location_id=active_stream_gages['monitoring_location_id'].tolist(),
    parameter_code='00060',
    skip_geometry=True
)
recent_ivs = qaqc_usgs_data(recent_ivs, 'value')
Categorize streamflow based on calculated percentile values
Similar to the previous example, we’ll calculate estimated streamflow percentile for the new instantaneous data by interpolating against the previously calculated percentile threshold levels from daily streamflow records.
[16]:
# estimate percentiles
df = pd.DataFrame()
for StaID, site_df in recent_ivs.groupby(by="monitoring_location_id", group_keys=False):
    if StaID in percentile_values:
        if not percentile_values[StaID].isnull().all().all():
            percentile_df = percentiles.calculate_multiple_variable_percentiles_from_values(
                site_df,
                data_column_name='value',
                percentile_df=percentile_values[StaID],
                date_column_name='time')
            df = pd.concat([df, percentile_df])
# categorize streamflow by the estimated streamflow percentiles
df = utils.categorize_flows(df, 'est_pct', schema_name="NWD")
df = df.tz_convert(tz='US/Eastern')
df = df.reset_index(level='time')
# Prep data for mapping by joining site information and flow data
gage_df = pd.merge(active_stream_gages, df, how="right", on="monitoring_location_id")
Map real-time streamflow conditions
[17]:
#| fig-cap: Map showing real-time streamflow conditions
map = create_gage_condition_map(
gage_df=gage_df,
flow_data_type='instantaneous',
flow_data_col='value',
map_schema='NWD',
streamflow_data_type='Real-Time Instantaneous'
)
display(map)
Create a Current Flow Conditions Map for n-Day Daily Streamflow
Retrieve daily streamflow records for past n-days
Let’s download data from the Water Data APIs for the past 7 days and calculate a 7-day average. We define n at the beginning of this section; to calculate a different n-day average, such as a 14-day or 28-day average, simply change the value of n_days. Also recall that we defined yesterday above and will use it to download data up to that date.
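To a first approximation, utils.rolling_average behaves like a time-based pandas rolling mean over a trailing window (toy series below; hyswap's actual behavior may differ in details such as minimum periods):

```python
import pandas as pd

# Toy daily record of 10 values: 10, 20, ..., 100
idx = pd.date_range("2023-07-01", periods=10, freq="D")
flow = pd.Series([float(v) for v in range(10, 110, 10)], index=idx)

# Trailing 7-day time window ending on each day
seven_day = flow.rolling("7D").mean()
print(seven_day.iloc[-1])  # 70.0 (mean of 40..100)
```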
[18]:
# Define your n-day period here.
# Choose from 1, 7, 14, or 28 days
n_days = 7
n_days_back = datetime.strftime(datetime.now(tz=ZoneInfo("US/Eastern")) - timedelta(n_days), '%Y-%m-%d')
past_dvs, _ = waterdata.get_daily(
    monitoring_location_id=active_stream_gages['monitoring_location_id'].tolist(),
    parameter_code='00060',
    statistic_id='00003',
    time=f"{n_days_back}/{yesterday}",
    skip_geometry=True
)
past_dvs = qaqc_usgs_data(past_dvs, 'value').set_index('time')
past_dvs_n_d = pd.DataFrame()
for StaID, new_df in past_dvs.groupby('monitoring_location_id'):
    df = utils.rolling_average(new_df, 'value', f'{n_days}D').round(2)
    past_dvs_n_d = pd.concat([past_dvs_n_d, df])
past_dvs_n_d = past_dvs_n_d.dropna(subset=['value'])
Calculate n-day average streamflow and corresponding variable percentile thresholds
We will use the utils.rolling_average function to calculate the n-day rolling average of each site's full historical daily record, and then calculate variable percentile thresholds from those averaged flows.
[19]:
flow_data_n_d = {}
for StaID in tqdm(active_stream_gages['monitoring_location_id'].tolist(), disable=disable_tqdm):
    if flow_data[StaID].shape[0] > 0:
        flow_data_n_d[StaID] = utils.rolling_average(flow_data[StaID].set_index('time'), 'value', f'{n_days}D').round(2)
    else:
        print('No standard discharge data column found for site ' + StaID + ', skipping')
No standard discharge data column found for site USGS-04282886, skipping
No standard discharge data column found for site USGS-04292201, skipping
[20]:
percentile_values_n_d = {}
for StaID in tqdm(active_stream_gages['monitoring_location_id'], disable=disable_tqdm, desc="Processing"):
    if flow_data[StaID].shape[0] > 0:
        # Filter data down to only approved records to calculate statistics
        df = utils.filter_approved_data(flow_data_n_d[StaID], 'approval_status')
        # We are not defining the date column here, since it was switched to the index
        # when calculating the rolling n-day average.
        percentile_values_n_d[StaID] = percentiles.calculate_variable_percentile_thresholds_by_day(
            df,
            data_column_name='value',
            percentiles=percentile_levels
        )
    else:
        print('No standard discharge data column found for site ' + StaID + ', skipping')
No standard discharge data column found for site USGS-04282886, skipping
No standard discharge data column found for site USGS-04292201, skipping
Categorize streamflow based on calculated percentile values
Similar to previous examples, next we will calculate estimated streamflow percentile for the new data by interpolating against the previously calculated percentile threshold levels.
[21]:
# estimate percentiles against the n-day thresholds (percentile_values_n_d)
df = pd.DataFrame()
for StaID, site_df in past_dvs_n_d.groupby("monitoring_location_id"):
    if StaID in percentile_values_n_d:
        if not percentile_values_n_d[StaID].isnull().all().all():
            # use the most recent n-day average (the row ending yesterday)
            month_day = site_df.index.strftime('%m-%d')[-1]
            site_df['est_pct'] = percentiles.calculate_variable_percentile_from_value(
                site_df['value'].iloc[-1], percentile_values_n_d[StaID], month_day)
            df = pd.concat([df, site_df])
# categorize streamflow by the estimated streamflow percentiles
df = utils.categorize_flows(df, 'est_pct', schema_name="NWD")
# keep only the most recent n-day average flow for plotting
df = df[df.index.get_level_values('time') == yesterday]
df = df.reset_index(level='time')
# Prep data for mapping by joining site information and flow data
gage_df = pd.merge(active_stream_gages, df, how="right", on="monitoring_location_id")
Map n-day average streamflow conditions
[22]:
#| fig-cap: Map showing most recent 7-day average streamflow and corresponding flow conditions
map = create_gage_condition_map(
gage_df=gage_df,
flow_data_type='daily',
flow_data_col='value',
map_schema='NWD',
streamflow_data_type=f'Current {n_days}-Day Average'
)
display(map)
Create a Drought Conditions Map for a Previous Day’s Streamflow
Retrieve daily streamflow records from a past day
Now, we will pick a droughty day from the past, download data from the Water Data APIs, and calculate the corresponding streamflow percentiles for that day's streamflow. Remember that we are comparing past days to the historical percentiles calculated earlier from the full streamflow record up to present (percentile_values).
[23]:
past_day = "2023-05-30"
past_dvs, _ = waterdata.get_daily(
    monitoring_location_id=active_stream_gages['monitoring_location_id'].tolist(),
    parameter_code='00060',
    statistic_id='00003',
    time=past_day,
    skip_geometry=True
)
past_dvs = qaqc_usgs_data(past_dvs, 'value')
Categorize streamflow based on calculated percentile values
[24]:
# Calculate estimated streamflow percentile for the new data by interpolating against
# the previously calculated percentile threshold levels
df = pd.DataFrame()
for StaID, site_df in past_dvs.groupby(by="monitoring_location_id", group_keys=False):
    if StaID in percentile_values:
        if not percentile_values[StaID].isnull().all().all():
            percentile_df = percentiles.calculate_multiple_variable_percentiles_from_values(
                site_df,
                data_column_name='value',
                percentile_df=percentile_values[StaID],
                date_column_name='time')
            df = pd.concat([df, percentile_df])
# categorize streamflow by the estimated streamflow percentiles
df = utils.categorize_flows(df, 'est_pct', schema_name="WaterWatch_Drought")
df = df.reset_index(level='time')
# Prep data for mapping by joining site information and flow data
gage_df = pd.merge(active_stream_gages, df, how="right", on="monitoring_location_id")
Map drought conditions
[25]:
#| fig-cap: Map showing historical daily mean streamflow and corresponding flow conditions using a drought categorization schema
map = create_gage_condition_map(
gage_df=gage_df,
flow_data_type='daily',
flow_data_col='value',
map_schema='WaterWatch_Drought',
streamflow_data_type='Daily Mean'
)
display(map)
Create a Flood Conditions Map for a Past Day’s Streamflow
This example uses fixed percentiles calculated across all days of the year together, rather than by day of year. Flow categories are therefore relative to absolute streamflow levels rather than what is normal for that day of the year.
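The distinction can be illustrated with plain pandas (illustrative only; hyswap's own functions apply the Weibull plotting position and other defaults):

```python
import numpy as np
import pandas as pd

# Synthetic 30-year daily flow record
rng = np.random.default_rng(0)
idx = pd.date_range("1990-01-01", "2019-12-31", freq="D")
flow = pd.Series(rng.lognormal(mean=4.0, sigma=1.0, size=len(idx)), index=idx)

# Fixed threshold: one value per percentile over the entire record
fixed_p90 = flow.quantile(0.9)

# Variable threshold: a separate value for each day of the year
variable_p90 = flow.groupby(flow.index.strftime("%m-%d")).quantile(0.9)

print(len(variable_p90))  # 366 day-of-year groups (leap days included)
```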
Retrieve daily streamflow records from a past day
As in the previous example, we will pick a day from the past to compare to the fixed percentiles. We chose a date known to exhibit flood conditions at some sites.
[26]:
past_day = "2023-07-10"
past_dvs, _ = waterdata.get_daily(
    monitoring_location_id=active_stream_gages['monitoring_location_id'].tolist(),
    parameter_code='00060',
    statistic_id='00003',
    time=past_day,
    skip_geometry=True
)
past_dvs = qaqc_usgs_data(past_dvs, 'value')
[27]:
fixed_percentile_values = {}
for StaID in tqdm(active_stream_gages['monitoring_location_id'], disable=disable_tqdm):
    if flow_data[StaID].shape[0] > 0:
        # Filter data down to only approved records to calculate statistics
        df = utils.filter_approved_data(flow_data[StaID], 'approval_status')
        if not df.empty:
            fixed_percentile_values[StaID] = percentiles.calculate_fixed_percentile_thresholds(
                df,
                data_column_name='value',
                date_column_name='time',
                percentiles=percentile_levels
            )
        else:
            print(StaID + ' has no approved data, skipping')
    else:
        print(StaID + ' does not have standard discharge data column, skipping')
USGS-04282886 does not have standard discharge data column, skipping
USGS-04292201 does not have standard discharge data column, skipping
Categorize streamflow based on calculated percentile values
[28]:
# estimate percentiles
for StaID in past_dvs['monitoring_location_id'].unique().tolist():
    if StaID in fixed_percentile_values:
        site_rows = (past_dvs['monitoring_location_id'] == StaID) & (past_dvs['time'] == past_day)
        past_dvs.loc[site_rows, 'est_pct'] = percentiles.calculate_fixed_percentile_from_value(
            past_dvs.loc[site_rows, 'value'],
            fixed_percentile_values[StaID]
        )
# categorize streamflow by the estimated streamflow percentiles
df = utils.categorize_flows(past_dvs, 'est_pct', schema_name="WaterWatch_Flood")
# Prep data for mapping by joining site information and flow data
gage_df = pd.merge(active_stream_gages, df, how="right", on="monitoring_location_id")
Map high-flow conditions
[29]:
#| fig-cap: Map showing historical daily mean streamflow and corresponding flow conditions using a high-flow categorization schema
map = create_gage_condition_map(
gage_df=gage_df,
flow_data_type='daily',
flow_data_col='value',
map_schema='WaterWatch_Flood',
streamflow_data_type='Daily Mean'
)
display(map)