This function provides a mechanism to specify three levels of information in the
supplied data frame 'end_point_info' to be used in subsequent analysis steps.
First, the user specifies the ToxCast assay annotation using the 'groupCol'
argument, which is a column header in 'end_point_info'. Second, the user
specifies the families of assays to exclude. Finally, the user can choose to
remove specific group(s) from the category. The default is to remove
'Background Measurement' and 'Undefined'. These defaults should be
reconsidered based on individual study objectives.
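The three steps described above can be sketched in base R on a toy data frame. This is only an illustration of the filtering logic, not the package's implementation: the helper `sketch_filter`, the toy rows, and the input column names `assay_component_endpoint_name` and `assay_source_name` are assumptions made for this example. The output column names follow the example output shown for this function.

```r
# Toy stand-in for 'end_point_info'; every value here is made up for illustration.
toy_ep <- data.frame(
  assay_component_endpoint_name = c("ACEA_ER_80hr", "BSK_3C_Eselectin",
                                    "APR_HepG2_CellLoss_1hr", "TOX21_Blank"),
  assay_source_name = c("ACEA", "BSK", "APR", "TOX21"),
  intended_target_family = c("Nuclear Receptor", "Cell Adhesion Molecules",
                             "Cell Cycle", "Background Measurement"),
  stringsAsFactors = FALSE
)

# Hypothetical helper that mirrors the three steps; not the package's own code.
sketch_filter <- function(ep,
                          groupCol = "intended_target_family",
                          remove_assays = "BSK",
                          remove_groups = c("Background Measurement", "Undefined")) {
  # Step 1: pick the annotation column that defines the groups.
  groups <- ep[[groupCol]]
  # Step 2: flag rows from excluded assay families.
  bad_assay <- ep$assay_source_name %in% remove_assays
  # Step 3: flag rows whose group is in the removal list.
  bad_group <- groups %in% remove_groups
  keep <- !bad_assay & !bad_group
  data.frame(
    endPoint = ep$assay_component_endpoint_name[keep],
    groupCol = groups[keep],
    assaysFull = ep$assay_source_name[keep],
    stringsAsFactors = FALSE
  )
}

result <- sketch_filter(toy_ep)
print(result)
```

In this toy data the "BSK" row is dropped by the assay filter and the 'Background Measurement' row by the group filter, leaving the ACEA and APR endpoints.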
Arguments
- ep
Data frame containing Endpoint information from ToxCast
- groupCol
Character name of ToxCast annotation column to use as a group category
- remove_assays
Vector of assays to EXCLUDE from the data analysis. By default, the "BSK" (BioSeek) assay is removed.
- remove_groups
Vector of groups within the selected 'groupCol' to remove.
Details
The default category ('groupCol') is 'intended_target_family'. Depending on the study, other categories may be more relevant. The best resource on these groupings is the "ToxCast Assay Annotation Data User Guide", which defines "intended_target_family" as "the target family of the objective target for the assay". That guide describes the other annotation columns in much more detail.
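When deciding whether a different category or a different 'remove_groups' value suits a study, it can help to tabulate the values in a candidate annotation column first. The sketch below uses a made-up toy data frame rather than the real ToxCast annotations, so the group names and counts are illustrative only.

```r
# Toy annotation column; the values are invented for illustration.
toy_ep <- data.frame(
  intended_target_family = c("Nuclear Receptor", "Cell Cycle", "Cell Cycle",
                             "Undefined", NA),
  stringsAsFactors = FALSE
)

# Count endpoints per group, keeping missing annotations visible.
group_counts <- table(toy_ep[["intended_target_family"]], useNA = "ifany")
print(sort(group_counts, decreasing = TRUE))
```

Groups that turn out to be uninformative for the study, or missing annotations, are candidates for 'remove_groups'.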
Examples
end_point_info <- end_point_info
cleaned_ep <- clean_endPoint_info(end_point_info)
filtered_ep <- filter_groups(cleaned_ep)
head(filtered_ep)
#> # A tibble: 6 × 3
#>   endPoint                      groupCol         assaysFull
#>   <chr>                         <chr>            <chr>
#> 1 ACEA_ER_80hr                  Nuclear Receptor ACEA
#> 2 APR_HepG2_CellCycleArrest_1hr Cell Cycle       APR
#> 3 APR_HepG2_CellLoss_1hr        Cell Cycle       APR
#> 4 APR_HepG2_MicrotubuleCSK_1hr  Cell Morphology  APR
#> 5 APR_HepG2_MitoMass_1hr        Cell Morphology  APR
#> 6 APR_HepG2_MitoMembPot_1hr     Cell Morphology  APR