Returns a random subset of the data in which each value of a specified column appears only once. Where a value is repeated, one of the rows sharing that value is kept at random.

Usage

randomSubset(df, colName, seed = NA)

Arguments

df

data frame. Must include a column named by the argument colName.

colName

character. Name of the column in df to check for duplicate values.

seed

integer value. Defaults to NA, which leaves the current random seed unchanged. Set it to a fixed value to make the output repeatable.

Examples

df <- data.frame(Julian = c(1,2,2,3,4,4,4,6),
                 y = 1:8)
df
#>   Julian y
#> 1      1 1
#> 2      2 2
#> 3      2 3
#> 4      3 4
#> 5      4 5
#> 6      4 6
#> 7      4 7
#> 8      6 8
# No seed is set here, so which of the duplicated rows is kept can differ between runs
df_random <- randomSubset(df, "Julian")
df_random
#>   Julian y
#> 1      1 1
#> 3      2 3
#> 4      3 4
#> 5      4 5
#> 8      6 8
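
To illustrate the behaviour described above, here is a minimal base R sketch of one way such a subset could be drawn. It is not the package's implementation: random_subset_sketch is a hypothetical helper written for this example, and it assumes the column contains no NA values (split() drops them).

```r
# Hypothetical helper, NOT the package's randomSubset():
# keeps one randomly chosen row for each distinct value of `colName`.
random_subset_sketch <- function(df, colName, seed = NA) {
  stopifnot(is.data.frame(df), colName %in% names(df))

  # Only touch the random seed when the caller supplies one
  if (!is.na(seed)) set.seed(seed)

  # Group row indices by the value found in colName
  groups <- split(seq_len(nrow(df)), df[[colName]])

  # Pick one index per group. sample(x, 1) with a single number x would
  # sample from 1:x, so groups of size one are returned unchanged.
  keep <- vapply(groups, function(idx) {
    if (length(idx) == 1L) idx else sample(idx, 1L)
  }, integer(1))

  # Return the chosen rows in their original order, keeping row names
  df[sort(unname(keep)), , drop = FALSE]
}

df <- data.frame(Julian = c(1, 2, 2, 3, 4, 4, 4, 6),
                 y = 1:8)

a <- random_subset_sketch(df, "Julian", seed = 42)
b <- random_subset_sketch(df, "Julian", seed = 42)

identical(a, b)          # TRUE: the same seed selects the same rows
anyDuplicated(a$Julian)  # 0: every Julian value appears once
```

Under these assumptions, rows whose Julian value is unique (here y = 1, 4 and 8) are always kept, and only the choice among repeated values depends on the seed.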