randomSubset • EGRET

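Returns a random subset of the rows of a data frame in which each value of a specified column appears only once. Where several rows share the same value in that column, one of them is chosen at random and the others are dropped; rows whose value is not repeated are kept.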

Usage

randomSubset(df, colName, seed = NA)

Arguments

df: data frame. Must include a column named by the argument colName.
colName: character. Name of the column in df to check for duplicated values.
seed: integer value. Defaults to NA, which leaves the current random seed unchanged. Supplying a value sets the seed before sampling, so repeated calls with the same seed return the same subset.

Examples

df <- data.frame(Julian = c(1,2,2,3,4,4,4,6),
                 y = 1:8)
df
#>   Julian y
#> 1      1 1
#> 2      2 2
#> 3      2 3
#> 4      3 4
#> 5      4 5
#> 6      4 6
#> 7      4 7
#> 8      6 8
df_random <- randomSubset(df, "Julian")
df_random
#>   Julian y
#> 1      1 1
#> 2      2 2
#> 4      3 4
#> 6      4 6
#> 8      6 8
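
The role of the seed argument can be illustrated with a small base-R sketch. The helper random_subset_sketch below is hypothetical and written only for this illustration; it is not the EGRET source, and the rows it selects for a given seed may differ from those selected by randomSubset. It assumes the behavior described above: one randomly chosen row per repeated value, with the seed set only when one is supplied.

```r
# Hypothetical helper for illustration only; not the EGRET implementation.
random_subset_sketch <- function(df, colName, seed = NA) {
  # Only touch the random number generator when a seed is supplied.
  if (!is.na(seed)) {
    set.seed(seed)
  }
  # Group row indices by the value found in colName.
  groups <- split(seq_len(nrow(df)), df[[colName]])
  # Pick one row index per group. Groups of size one are returned directly,
  # because sample(n, 1) with a single number n would draw from 1:n.
  keep <- vapply(groups, function(idx) {
    if (length(idx) == 1L) idx else sample(idx, size = 1L)
  }, integer(1))
  # Return the chosen rows in their original order.
  df[sort(unname(keep)), , drop = FALSE]
}

df <- data.frame(Julian = c(1, 2, 2, 3, 4, 4, 4, 6), y = 1:8)

a <- random_subset_sketch(df, "Julian", seed = 42)
b <- random_subset_sketch(df, "Julian", seed = 42)

identical(a, b)          # TRUE: the same seed gives the same rows
anyDuplicated(a$Julian)  # 0: each Julian value appears once
```

Without a seed, repeated calls may return different rows for the Julian values 2 and 4, while the rows for 1, 3 and 6 are always the same.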