Given a set of flowline indexes and numeric or ASCII criteria, returns the closest match. If numeric criteria are used, the minimum difference in the numeric attribute is used for disambiguation. If ASCII criteria are used, the adist function is used to compute a similarity score with the following algorithm: `1 - adist_score / max_string_length`. Comparisons ignore case.
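The string score above can be sketched in a few lines of base R. This is only an illustration of the stated formula, not the package's internal code: `string_score` is a hypothetical helper name, and treating `max_string_length` as the longer of the two compared strings is an assumption.

```r
# Hypothetical helper (not part of nhdplusTools): similarity score built from
# utils::adist(), following `1 - adist_score / max_string_length`.
# Assumption: max_string_length is the longer of the two strings compared.
string_score <- function(a, b) {
  # Generalized Levenshtein distance, ignoring case.
  d <- as.numeric(utils::adist(a, b, ignore.case = TRUE))
  # Note: if both strings are empty this is 0 / 0, which is NaN in R.
  1 - d / max(nchar(a), nchar(b))
}

# Identical names (up to case) score 1.
exact <- string_score("Patapsco", "PATAPSCO")

# A partial match scores somewhere between 0 and 1.
partial <- string_score("Patapsco", "Patapsco River")
```

Because the edit distance can never exceed the length of the longer string, a score built this way stays between 0 and 1, with 1 meaning an exact case-insensitive match.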

disambiguate_flowline_indexes(indexes, flowpath, hydro_location)

Arguments

indexes

data.frame as output from get_flowline_index with more than one hydrologic location per indexed point.

flowpath

data.frame with two columns. The first should join to the COMID field of the indexes and the second should be the numeric or ASCII metric, such as drainage area or GNIS Name. Names of this data.frame are not used.

hydro_location

data.frame with two columns. The first should join to the id field of the indexes and the second should be the numeric or ASCII metric, such as drainage area or GNIS Name. Names of this data.frame are not used.

Value

data.frame of indexes, deduplicated according to the minimum difference between the values in the metric columns. If two or more candidates result in the same "minimum" value, all of them are returned, so duplicates remain for that id.
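The numeric case, including the tie behavior, can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy values, the `da` column name, and the intermediate objects `cand` and `deduped` are all made up for illustration.

```r
# Toy inputs with made-up values, shaped like the three arguments above.
indexes <- data.frame(id = c(1, 1, 2, 2), COMID = c(10, 11, 20, 21))
flowpath <- data.frame(comid = c(10, 11, 20, 21), da = c(5, 24, 8, 6))
hydro_location <- data.frame(id = c(1, 2), da = c(23.6, 7))

# Join both metric columns onto the candidate matches.
cand <- merge(indexes, flowpath, by.x = "COMID", by.y = "comid")
cand <- merge(cand, hydro_location, by = "id", suffixes = c("_fl", "_pt"))

# Absolute difference between the flowline metric and the point metric.
cand$diff <- abs(cand$da_fl - cand$da_pt)

# Keep every row whose difference equals the per-id minimum; ties survive.
keep <- cand$diff == ave(cand$diff, cand$id, FUN = min)
deduped <- cand[keep, c("id", "COMID")]
```

In this toy data, id 1 resolves to a single COMID, while id 2 has two candidates that are equally far from the point's value, so both rows are retained.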

Examples

source(system.file("extdata", "sample_flines.R", package = "nhdplusTools"))

hydro_location <- sf::st_sf(id = c(1, 2, 3),
                            geom = sf::st_sfc(list(sf::st_point(c(-76.86934, 39.49328)),
                                                   sf::st_point(c(-76.91711, 39.40884)),
                                                   sf::st_point(c(-76.88081, 39.36354))),
                                              crs = 4326),
                            totda = c(23.6, 7.3, 427.9),
                            nameid = c("Patapsco", "", "Falls Run River"))

flowpath <- dplyr::select(sample_flines,
                          comid = COMID,
                          totda = TotDASqKM,
                          nameid = GNIS_NAME,
                          REACHCODE,
                          ToMeas,
                          FromMeas)

indexes <- get_flowline_index(flowpath,
                              hydro_location,
                              search_radius = 0.2,
                              max_matches = 10)
#> Warning: search_radius units not set, trying units of points CRS.
#> Warning: converting to LINESTRING, this may be slow, check results

disambiguate_flowline_indexes(indexes,
                              dplyr::select(flowpath, comid, totda),
                              dplyr::select(hydro_location, id, totda))
#> # A tibble: 3 × 5
#>      id    COMID REACHCODE      REACH_meas    offset
#>   <dbl>    <int> <chr>               <dbl>     <dbl>
#> 1     1 11688298 02060003000579        0   0.0000603
#> 2     2 11688808 02060003000519       53.6 0.000564 
#> 3     3 11688950 02060003000254       18.5 0.00113  

result <- disambiguate_flowline_indexes(indexes,
                                        dplyr::select(flowpath, comid, nameid),
                                        dplyr::select(hydro_location, id, nameid))

result[result$id == 1, ]
#> # A tibble: 3 × 5
#>      id    COMID REACHCODE      REACH_meas  offset
#>   <dbl>    <int> <chr>               <dbl>   <dbl>
#> 1     1 11689928 02060003001468          0 0.00203
#> 2     1 11689978 02060003001472        100 0.00203
#> 3     1 11690532 02060003000256          0 0.00451

result[result$id == 2, ]
#> # A tibble: 10 × 5
#>       id    COMID REACHCODE      REACH_meas   offset
#>    <dbl>    <int> <chr>               <dbl>    <dbl>
#>  1     2 11688808 02060003000519       53.6 0.000564
#>  2     2 11690110 02060003001493      100   0.00742 
#>  3     2 11688822 02060003000518       39.5 0.00768 
#>  4     2 11688742 02060003000521        0   0.00855 
#>  5     2 11688778 02060003000520        0   0.00855 
#>  6     2 11690112 02060003001494      100   0.00955 
#>  7     2 11690122 02060003001495      100   0.0103  
#>  8     2 11688868 02060003000517       27.8 0.0169  
#>  9     2 11690124 02060003001496      100   0.0195  
#> 10     2 11690128 02060003001498      100   0.0210  

result[result$id == 3, ]
#> # A tibble: 1 × 5
#>      id    COMID REACHCODE      REACH_meas  offset
#>   <dbl>    <int> <chr>               <dbl>   <dbl>
#> 1     3 11688948 02060003000516          0 0.00321