The Preprocessing Module

sfrmaker.preprocessing.clip_flowlines_to_polygon(flowlines, polygon, crs=None, simplify_tol=100, logger=None)[source]

Clip line features in a flowlines DataFrame to polygon features in polygon.

Parameters:

flowlinesDataFrame

Output from preprocess_nhdplus()

crsobj

Coordinate reference system of flowlines. Only needed if the data do not have a valid ESRI projection (.prj) file. A Python int, dict, str, or pyproj.crs.CRS instance passed to pyproj.crs.CRS.from_user_input()

Can be any of:

PROJ string
Dictionary of PROJ parameters
PROJ keyword arguments for parameters
JSON string with PROJ parameters
CRS WKT string
An authority string [i.e. ‘epsg:4326’]
An EPSG integer code [i.e. 4326]
A tuple of (“auth_name”: “auth_code”) [i.e (‘epsg’, ‘4326’)]
An object with a to_wkt method.
A pyproj.crs.CRS class

By default, None

polygonstr

Polygon shapefile, shapely polygon or list of shapely polygons of model active area. Shapely polygons must be in the same CRS as flowlines; shapefiles will be automatically reprojected.

simplify_tolfloat

Simplification tolerance for polygon to speed clipping. See The Shapely User Manual for more details.

loggerLogger instance

Returns:

flcclipped flowlines dataframe

sfrmaker.preprocessing.cull_flowlines(NHDPlus_paths, active_area=None, asum_thresh=None, intermittent_streams_asum_thresh=None, cull_invalid=True, cull_isolated=True, keep_comids=None, outfolder='clipped_flowlines', logger=None)[source]

Cull NHDPlus version2 data to an area defined by an active_area polygon and to flowlines with Arbolate sums greater than specified thresholds. Also remove lines that are isolated from the stream network or are missing attribute information.

Parameters:

NHDPlus_pathslist of strings: Paths to the root folder level NHDPlus Drainage Basins, as they were downloaded from the NHDPlus website (e.g. NHDPlus04, NHDPlus07, etc.).
active_areastr, shapely polygon or tuple, optional: A polygon shapefile, or shapely polygon or bounding box tuple (left, bottom, top, right) in the NAD83 GCS (EPSG:4269). The active area is converted to a bounding box, which is then used to filter the flowlines that are read in. If none, no filtering is performed, and the whole area encompased by the input NHDPlus data will be retained. By default None.
asum_threshnumeric, optional: Minimum arbolate sum value (total length of upstream drainage) to retain. Any flowlines with arbolate sums less than this value will be dropped. By default None.
intermittent_streams_asum_threshnumeric, optional: Minimum arbolate sum value (total length of upstream drainage) to retain for flowlines coded as intermittent (FCODE == 46003). Any intermittent flowlines with arbolate sums less than this value will be dropped. By default None.
cull_invalidbool, optional: Option to cull flowlines that have incomplete attribute information (lacking entries in the PlusFlowVAA, PlusFlow or Elevslope tables), by default False.
cull_isolatedbool, optional: Culling intermittent streams with intermittent_streams_asum_thresh may result in some isolated flowlines that no longer have downstream connections, or such isolated flowlines may be present in the raw NHDPlus data. SFRmaker identifies isolated flowslines by looking at up to 10 downstream connections to lines that are not stream network. Option to drop these lines. By default, False.
keep_comidssequence: List-like of COMIDs to retain, regardless of culling criteria.
outfolderstr, optional: Location for writing output, by default ‘clipped_flowlines’
loggersfrmaker.logger instance, optional: Pass an existing sfrmaker.logger instance to logger the preprocessing operations, by default None

sfrmaker.preprocessing.edit_flowlines(flowlines, config_file, id_column='COMID', toid_column='tocomid', logger=None)[source]

Make edits to the flowlines in flowlines_file, as described in config_file.

Parameters:

flowlinesshapefile or DataFrame: Flowlines to edit. If a shapefile is specified, a backup with “.original” before the extension is made, and the input shapefile is overwritten by the results.
config_fileyaml file: e.g. ‘flowline_edits.yml’
id_columnstr: Column in flowlines with unique identifiers (e.g. COMIDs)
toid_columnstr: Column in flowslines with downstream routing connections (identifiers)
loggersfrmaker logger instance, optional: Pass a logger file instance to continue writing to an open logger file (e.g. after or before other operations)

Returns:

flowlinesDataFrame: Edited flowlines.

sfrmaker.preprocessing.fix_invalid_asums(asums, fl_lengths, graph, graph_r)[source]

Recompute arbolate sum at any places in the network where it decreases going downstream, and then for all lines downstream of those locations. Decreases may be caused by routing connections that weren’t in the original NHDPlus data.

Parameters:

asumsdict: Dictionary of {indentifier (e.g. comid): asum values}, in same units as fl_lengths. Includes all lines in the stream network.
fl_lengthsdict: Dictionary of flowline lengths {indentifier (e.g. comid): length values}, in same units as asums.
graphdict: Dictionary of downstream routing connections {fromcomid: tocomid}
graph_rdict: Dictionary of upstream routing connections {tocomid: {fromcomid1, fromcomid2,…}}

Returns:

new_asumsdict: Dictionary of recomputed arbolate sums {comid: asum value}

Notes

This function is only designed to fix instances of breaks in arbolate sum caused by new routing connections. It can’t fix all instances of invalid asums, because it is not known whether the arbolate sums from upstream tributaries reflect unique upstream drainages (if the tribs are distributaries coming from the same divergence, there asums will reflect the same upstream drainage, and therefore would be duplicative if summed).

sfrmaker.preprocessing.get_flowline_routing(NHDPlus_paths=None, PlusFlow=None, mask=None, mask_crs=None, nhdplus_crs=4269)[source]

Read a collection of NHDPlus version 2 PlusFlow (routing) tables from one or more drainage basins and consolidate into a single pandas DataFrame, returning the FROMCOMID and TOCOMID columns.

Parameters:

NHDPlus_pathssequence

Sequence of paths to the top level folder for each drainage basin. For example:

['NHDPlus/NHDPlusGL/NHDPlus04', 
 'NHDPlus/NHDPlusMS/NHDPlus07']

by default None

PlusFlowstring or sequence

Single path to a PlusFlow table or sequence of PlusFlow table filepaths, by default None

Returns:

flowline_routingDataFrame: [description]

Raises:

ValueError: [description]

sfrmaker.preprocessing.get_netcdf_crs_from_grid_mapping(grid_mapping)[source]

sfrmaker.preprocessing.preprocess_nhdplus(flowlines_file, pfvaa_file, pf_file, elevslope_file, demfile=None, run_zonal_statistics=True, dem_length_units='meters', flowline_elevations_file=None, active_area=None, narwidth_shapefile=None, waterbody_shapefiles=None, buffersize_meters=50, asum_thresh=0.0, known_connections=None, update_up_elevations=None, update_dn_elevations=None, width_from_asum_a_param=0.1193, width_from_asum_b_param=0.5032, minimum_width=1.0, output_length_units='meters', logger=None, outfolder='output/', flowline_crs=None, dest_crs=None)[source]

Preprocess NHDPlus data to a single DataFrame of flowlines that each route to no more than one flowline, with width, elevation and recomputed arbolate sum attributes. A key part of the preprocessing is handling divergences in the stream network, as described in more detail in the Notes section. In picking routing at divergences, values are sampled from the demfile and included in the output DataFrame. Optionally (via the narwidth_shapefile arguement), remote sensing-based width estimates from the NARWidth Database (Allen and Pavelsky, 2015) can be included.

Parameters:

flowlines_filestr

Path to NHDPlus NHDFlowline shapefile. May or maybe not have been preprocessed by cull_flowlines(). The flowlines must be in a valid projected coorinate reference system (CRS; i.e., with units of meters), or a valid projected CRS must be specified with flowline_crs.

pfvaa_filestr

Path to NHDPlus PlusFlowlineVAA database (.dbf file). May or maybe not have been preprocessed by cull_flowlines(). ArbolateSu values within this file are assumed to be in km.

pf_filestr

Path to NHDPlus PlusFlow database (.dbf file). May or maybe not have been preprocessed by cull_flowlines()

elevslope_filestr

Path to NHDPlus elevslope database (.dbf file). May or maybe not have been preprocessed by cull_flowlines()

demfilestr

Path to DEM raster for project area.

dem_length_unitsstr, any length unit; e.g. {‘m’, ‘meters’, ‘ft’, etc.}

Length units of values in demfile. By default, ‘meters’.

active_areastr, optional

Polygon shapefile of active area that preprocessed lines will be clipped to. By default, None, in which case the flowlines won’t be clipped.

narwidth_shapefilestr, optional

Path to shapefile from the NARWidth database (Allen and Pavelsky, 2015).

waterbody_shapefilesstr or list of strings, optional

Path(s) to NHDPlus NHDWaterbody shapefile(s). Only required if a narwidth_shapefile is specified.

buffersize_metersfloat

Buffer distance in meters around flowlines to include when sampling DEM. By default, 50.

asum_threshfloat

Arbolate sum threshold for culling minor distributaries (that are not the main channel) below divergences. In NHDPlus, minor distributaries have the same arbolate sum as the main channel. After selecting the main channel, SFRmaker recomputes arbolate sum values for the minor distributaries, starting with 0 at the divergence. Lines with ending asums less than asum_thresh will then be culled.

known_connectionsdict, optional

Dictionary of specified flowline connections {COMID: tocomid}, which will override the routing selection at distributaries. By default None.

update_up_elevationsdict, optional

Dictionary of specified stage or streambed top elevations {COMID: elevation} at the upstream end of flowlines, for example, based on field measurements of streambed elevation or stage. The distinction between streambed elevation and stage depends on the context of the other elevations. For example, DEM elevations for larger streams typically reflect stage at the time data for the DEM (e.g. lidar) were collected. If all other COMID elevations represent stage, then the elevations supplied to update_up_elevations and update_dn_elevations should be stages. Streambed top is often poorly characterized and difficult to measure in the field. But if the SFR package is simulating stage (specified streambed bottom plus a simulated depth based on flow), appropriate streambed bottom input is needed to produce reasonable stages. The user may start with DEM elevations and then later subtract off simulated stream depths(from the SFR package output) to arrive at simulated stages that are close to those observed in the field.

update_up_elevations and update_dn_elevations input will override any other values (from NHDPlus or the DEM), and will be incorporated into the elevation smoothing. By default None.

update_dn_elevationsdict, optional

Dictionary of specified elevations {COMID: elevation} at the downstream end of flowlines. See the update_up_elevations description for more details.

update_up_elevations and update_dn_elevations input will override any other values (from NHDPlus or the DEM), and will be incorporated into the elevation smoothing. By default None.

width_from_asum_a_paramfloat, optional

\(a\) parameter used for estimating channel width from arbolate sum. Only needed if input flowlines are lacking width information. See width_from_arbolate(). By default, 0.1193.

width_from_asum_b_paramfloat, optional

\(b\) parameter used for estimating channel width from arbolate sum. Only needed if input flowlines are lacking width information. See width_from_arbolate(). By default, 0.5032.

minimum_widthfloat, optional

Minimum reach width to specify (in model units), if computing widths from arbolate sum values. (default = 1)

output_length_unitsstr, any length unit; e.g. {‘m’, ‘meters’, ‘ft’, etc.}

Units for width and elevation attribute values included with the output flowlines. Output arbolate sum values are specified in kilometers.

outfolderstr, optional

Location for writing output, by default ‘clipped_flowlines’

loggersfrmaker.logger instance, optional

Pass an existing sfrmaker.logger instance to logger the preprocessing operations, by default None

flowline_crsobj

Coordinate reference system of the NHDPlus data. Only needed if the data do not have a valid ESRI projection (.prj) file. A Python int, dict, str, or pyproj.crs.CRS instance passed to pyproj.crs.CRS.from_user_input()

Can be any of:

PROJ string
Dictionary of PROJ parameters
PROJ keyword arguments for parameters
JSON string with PROJ parameters
CRS WKT string
An authority string [i.e. ‘epsg:4326’]
An EPSG integer code [i.e. 4326]
A tuple of (“auth_name”: “auth_code”) [i.e (‘epsg’, ‘4326’)]
An object with a to_wkt method.
A pyproj.crs.CRS class

dest_crsobj

Output Coordinate reference system. Same input types as flowline_crs.

Returns:

flowlinesDataFrame

DataFrame with preprocessed flowlines. Width and elevation values are specified in the output_length_units. Output arbolate sum values are specified in kilometers. See NHDPlus documentation for description of fields not included here. Columns:

COMID	int64	NHDPlus Common Identifiers
tocomid	int64	Downstream routing connections (COMIDs)
nhd_asum	float	Arbolate sum from NHDPlus, in km
min	float	minimum elevation sampled within each buffer
mean	float	mean elevation sampled within each buffer
pct10,20,80	float	elevation percentiles sampled within each buffer
Divergence	int	NHDPlus Divergence classification
main_chan	bool	Flag indicating whether SFRmaker identified line as main channel
elevup	float	Smoothed elevation at line start (sampled from DEM)
elevdn	float	Smoothed elevation at line end (sampled from DEM)
elevupsmo	float	Smoothed elevation at line start (sampled from DEM)
elevdnsmo	float	Smoothed elevation at line end (sampled from DEM)
asum_calc	float	Recomputed arbolate sum for line (after removing distributaries)
asum_diff	float	Difference between recomputed and NHDPlus asums
width1asum	float	asum-based estimate for width at line start
width2asum	float	asum-based estimate for width at line end
narwd_n	int	number of values for line sampled from NARWidth
narwd_mean	float	mean elevation sampled from NARWidth
narwd_std	float	standard deviation in values sampled from NARWidth
narwd_min	float	minimum elevation sampled from NARWidth
narwd_max	float	maximum elevation sampled from NARWidth
is_wb	bool	Flag for whether the line coincides with a Waterbody
width1	float	estimated width at line start (from NARWidth or asum)
width2	float	estimated width at line end (from NARWidth or asum)
geometry	obj	Shapely LineString for each line
buffpoly	obj	Shapely Polygon (buffer) around each LineString

Raises:

ValueError: [description]
IOError: [description]

Notes

A key part of the preprocessing is handling divergences in the stream network, where flow is routed to two or more distributaries. Distributaries are common in the Mississippi Alluvial Plain (MAP) region, for example, but in reality, most of these features are either non-existent or only carry flow intermittently during high-water events. While distributaries in NHDPlus are classified as “main” and “minor” paths at divergences, inspection against the satellite imagery and the most recent, lidar-based DEMs for the MAP area suggested that the NHDPlus classifications are often inaccurate.

The following steps are taken to identify the main channel at each divergence:

A 50-meter buffer polygon is drawn around each flowline feature. A flat end-cap is used, so that only areas perpendicular to the flowlines are included in each buffer.
Zonal statistics for the lidar-based DEM values within each buffer polygon are computed using the rasterstats python package. The tenth percentile elevation is selected as a metric for discriminating between the main channel and minor distributaries. Lower elevation percentiles would be more likely to represent areas of overlap between the buffers for the main channel and minor distributaries (resulting in minor distributary values that are similar to the main channel), while higher elevation percentiles might miss the lowest parts of the main channel or even represent parts of the channel banks instead.
At each divergence, the distributary with the lowest tenth percentile elevation is assumed to be the main channel.

In the MAP region, comparison of the sampled DEM values with the NHDPlus elevation attribute data revealed a high bias in many of the attribute values, especially in the vicinity of diversions. This may be a result of the upstream smoothing process described by McKay and others (2012, p 123) when it encounters distributaries of unequal values such as the example shown in Figure 5. To remedy this issue, the 10th percentile values obtained from the buffer zonal statistics were assigned to each flowline, and then smoothed in the downstream direction to ensure that no flowlines had negative (uphill) slopes.

Finally, routing connections to minor distributaries are removed, and arbolate sums recomputed for the entire stream network, with arbolate sums at minor distributaries starting at zero. In this way, the minor distributaries are treated like headwater streams in that they will only receive flow if the water table is greater than their assigned elevation, otherwise they are simulated as dry and are not part of the groundwater model solution. Similar to cull_flowlines(), the first asum_thresh km of minor distributaries are trimmed from the stream network.

If a shapefile is specified for the narwidth_shapefile argument, the sample_narwidth() function is called.

sfrmaker.preprocessing.preprocess_nhdplus_hr_flowlines(nhdplus_path, active_area=None, keep_fcodes=None, drop_ids_upstream=None, drop_ids=None, drop_isolated=False, dest_crs=None, outfile='preprocessed_flowlines.shp')[source]

Preprocess NHDPlus HR flowline and associated attribute data to a shapefile.

Parameters:

nhdplus_pathstr, pathlike or list of pathlikes: Path to NHDPlus HR dataset(s) (FileGDB format)
active_areastr, shapely polygon or tuple, optional: A polygon shapefile, or shapely polygon or bounding box tuple (left, bottom, top, right) in the NAD83 GCS (EPSG:4269). The active area is converted to a bounding box, which is then used to filter the flowlines that are read in. If none, no filtering is performed, and the whole area encompased by the input NHDPlus data will be retained. Must be in the same CRS as nhdplus_path (typically 4269) if CRS information isn’t included in the object (i.e. if it is a tuple or polygon, or shapefile without a .prj file). by default, None
keep_fcodessequence, optional: Option to only retain certain line types (e.g. perennial streams), as indicated by their FCodes. by default None
drop_ids_upstreamsequence, optional: Option to drop specified flowlines and all flowlines upstream of them, by default None
drop_ids: sequence, optional: Option to drop specified flowlines.
drop_isolatedbool: Option to drop flowlines that have outlets lines that are wholly inside the active area. by default, False
dest_crsCRS-like, optional: Reproject preprocessed flowlines to this coorindate reference system (CRS), for example as specified by an EPSG code. See the documentation for pyproj.crs.CRS.from_user_input() for other examples, by default None
outfilestr or pathlike, optional: Output shapefile path, by default ‘preprocessed_flowlines.shp’

sfrmaker.preprocessing.preprocess_nhdplus_hr_waterbodies(nhdplus_path, active_area, drop_waterbodies=None, min_areasqkm=0.0, dest_crs=None, outfile='preprocessed_waterbodies.shp')[source]

_summary_

Parameters:

nhdplus_pathstr, pathlike or list of pathlikes: Path to NHDPlus HR dataset(s) (FileGDB format)
active_areastr, shapely polygon or tuple, optional: A polygon shapefile, or shapely polygon or bounding box tuple (left, bottom, top, right). Must be in the same CRS as nhdplus_path (typically 4269) if CRS information isn’t included in the object (i.e. if it is a tuple or polygon, or shapefile without a .prj file). by default, None
drop_waterbodiessequence, optional: Option to drop specified waterbodies (by NHDPlusID), by default None
min_areasqkmfloat, optional: Exclude waterbodies smaller than this size in the output, by default 0.
dest_crsCRS-like, optional: Reproject preprocessed flowlines to this coorindate reference system (CRS), for example as specified by an EPSG code. See the documentation for pyproj.crs.CRS.from_user_input() for other examples, by default None
outfilestr or pathlike, optional: Output shapefile path, by default ‘preprocessed_waterbodies.shp’

sfrmaker.preprocessing.recompute_asums_for_minor_distribs(minor_distrib_comids, fl_lengths, graph, graph_r)[source]

Reset arbolate sums for minor distributaries and downstream segments in their path, to the next confluence.

Parameters:

minor_distrib_comidssequence: Sequence of line identifiers for minor distributaries (that were part of a divergence in NHDPlus).
fl_lengthsdict: Dictionary of flowline lengths {indentifier (e.g. comid): length values}, in same units as asums.
graphdict: Dictionary of downstream routing connections {fromcomid: tocomid}
graph_rdict: Dictionary of upstream routing connections {tocomid: {fromcomid1, fromcomid2,…}}

Returns:

new_asumsdict: Dictionary of recomputed arbolate sums {comid: asum value}

sfrmaker.preprocessing.sample_NARWidth(flowlines, narwidth_shapefile, waterbody_shapefiles, bbox_filter=None, output_width_units='meters', outpath='shps/')[source]

Sample the North American River Width Database by doing a spatial join (transfer width information from NARWidth shapefile to flowlines shapefile based on proximity).

Parameters:

flowlinesGeoDataFrame: flowlines dataframe from preprocess_nhdplus(). Flowlines must be in a projected Coordinate reference system (CRS), specified by a valid .crs attribute.
narwidth_shapefilestr: Path to shapefile from the NARWidth database (Allen and Pavelsky, 2015).
waterbody_shapefilesstr or list of strings, optional: Path(s) to NHDPlus NHDWaterbody shapefile(s). Only required if a narwidth_shapefile is specified.
filtertuple: Bounds (most likely in lat/lon) for filtering NARWidth lines that are read in (left, bottom, right, top)
output_width_unitsstr, any length unit; e.g. {‘m’, ‘meters’, ‘ft’, etc.}: Units for width and elevation attribute values included with the output flowlines. NARWidth widths are assumed to be in meters.

Returns:

This function operates on the fl DataFrame in place.

Notes

To avoid erroneous overlap between main-stem NARWidth estimates and minor tributaries, flowlines with arbolate sums less than 500 km only receive widths from NARWidth lines that have at least 50% of their length inside of the 1-km buffer. NARWidth values are generally higher than arbolate sum-based estimates, because the NARWidth estimates represent mean flows and include all reaches of the stream, whereas the arbolate sum estimates are based on field measurements taken at narrower than average, well-behaved channel sections near stream gages, under base flow conditions. Therefore, measured channel widths may be biased low compared to actual widths throughout the stream network (Allen and Pavelsky, 2015; Park, 1977).

sfrmaker.preprocessing.swb_runoff_to_csv(swb_runoff_netcdf_output, nhdplus_catchments_file, runoff_output_variable='runoff', swb_rejected_net_inf_output=None, rejected_net_inf_variable='rejected_net_infiltration', catchment_id_col='FEATUREID', limit_runoff_to_area=None, start_datetime=None, end_datetime=None, output_length_units='meters', include_xy_in_output=False, xy_crs=None, outfile='swb_runoff_by_nhdplus_comid.csv')[source]

Convert gridded runoff estimates from the USGS Soil Water Balance Code (SWB) to a CSV format, associating each runoff value with a catchment ID. For the runoff to be successfully mapped to the SFR network, the catchment IDs must correspond to line IDs in the stream network, and each catchment must either be associated with a line, or be referenced in the flowline_routing dataset. See the documentation for sfrmaker.flows.add_to_perioddata() for more details.

Parameters:

swb_runoff_netcdf_outputstr or path-like: NetCDF file with gridded SWB runoff estimates. Runoff values are assumed to be in units of length/time (normalized to area). The CRS for the runoff dataset is read from the ‘proj4_string’ attribute, and is assumed to have length units of meters.
nhdplus_catchments_filestr or path-like: Shapefile of catchments with IDs (catchment_id_col). The catchments are rasterized to the grid in swb_netcdf_output, so that each runoff value can be associated with a catchment ID.
catchment_id_colstr: Field in nhdplus_catchments_file with ID, by default ‘FEATUREID’.
runoff_output_variablestr, optional: Variable in swb_runoff_netcdf_output with runoff data, by default ‘runoff’
swb_rejected_net_inf_outputstr or path-like: NetCDF file with gridded SWB estimates of “rejected net infiltration”. This is infiltration in the SWB simulation in excess of the specified max infiltration rate. In reality this may represent a component of runoff or ‘quick’ overland flow, depending on the timescale of the simulation. Values are assumed to be in units of length/time (normalized to area). The CRS for the runoff dataset is read from the ‘proj4_string’ attribute, and is assumed to have length units of meters.
rejected_net_inf_variablestr, optional: Variable in swb_runoff_netcdf_output with runoff data, by default ‘rejected_net_infiltration’
limit_runoff_to_areastr or path-like: Optionally limit runoff to an area defined by a polygon shapefile. By default None, in which case runoff is aggregated for each NHDPlus Catchment that intersections the grid in swb_runoff_netcdf_output.
start_datetimestr: Include runoff on or after this date in the output. Can be in any string format used for time slicing in pandas or xarray. By default, None (include all times in swb_runoff_netcdf_output).
end_datetimestr: Include runoff on or before this date in the output. Can be in any string format used for time slicing in pandas or xarray. By default, None (include all times in swb_runoff_netcdf_output).
output_length_unitsstr, optional: Input units are read from the ‘units’ attribute of the netcdf_output_variable in swb_netcdf_output; specify output length units so that the output CSV file is in the same length units as the model. No time unit conversion is perfo by default ‘meters’
include_xy_in_outputbool: Option to include approximate x, y coordinates of catchments in the output. The x, y coordinates represent the average x and y values of the SWB grid cells intersected by each catchment.
xy_crs: Output coordinate reference system (CRS) for coordinates in outfile. Only used if include_xy_in_output=True. By default False, in which case the CRS for swb_runoff_netcdf_output is used.
outfilestr, optional: Output CSV file with transient runoff volumes for each catchment intersecting the SWB grid, by default ‘swb_runoff_by_nhdplus_comid.csv’

Returns:

aggregatedDataFrame (also written to outfile)

DataFrame with columns:

time	time associated with each runoff volume
comid	catchment and flowline identifier
x	(optional) approximate x-coordinate of catchment, in xy_crs
y	(optional) approximate y-coordinate of catchment, in xy_crs
runoff_{L}3d	runoff volume, in model length units cubed per day
area_{L}2	total area of SWB cells intersected by catchment, in model length units cubed

Notes

Some notes on SWB output variables, as of 7/5/2023:

'runoff_outside' is the sum of 'runoff' plus 'rejected_net_infiltration'.

When routing is inactive in the SWB simulation, 'runoff_outside' is computed for every SWB grid cell. In this case, it can be supplied here to swb_runoff_netcdf_output (and runoff_output_variable), in lieu of supplying both swb_runoff_netcdf_output and swb_rejected_net_inf_output.

When routing is active, 'runoff_outside' represents only water that cannot be moved downslope; in most cases it should be zero in the model interior. In this case, one would most likely supply both swb_runoff_netcdf_output and swb_rejected_net_inf_output.