General Group Enrichment Analysis

This function takes a `data.frame` as input and compares the proportion of positive cases (for binary variables) or the mean (for continuous variables) in one subgroup against the remaining samples.

group_enrichment(
  df,
  grp_vars = NULL,
  enrich_vars = NULL,
  cross = TRUE,
  co_method = c("t.test", "wilcox.test")
)

Arguments

df	a `data.frame`.
grp_vars	character vector specifying group variables to split samples into subgroups (at least 2 subgroups, otherwise this variable will be skipped).
enrich_vars	character vector specifying measure variables to be compared. If a variable is not numeric, only binary cases are accepted, in the form of `TRUE/FALSE` or `P/N` (P for positive cases and N for negative cases). Note that `NA` values are treated as negative cases.
cross	logical, default is `TRUE`. If `TRUE`, combine every variable in `grp_vars` with every variable in `enrich_vars`. For example, `c('A', 'B')` and `c('C', 'D')` will construct 4 combinations (i.e. "AC", "AD", "BC" and "BD"). A variable cannot be in both `grp_vars` and `enrich_vars`; such cases are dropped automatically. If `FALSE`, use pairwise combinations; see section "Examples" for use cases.
co_method	test method for continuous variables, one of 't.test' or 'wilcox.test'; default is 't.test'.
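
To make the effect of `cross` concrete, the following base R sketch enumerates the variable pairs that each setting implies. `pair_vars()` is a hypothetical helper written for illustration, not a function of this package, and the equal-length check in the pairwise branch is an assumption about how pairing would reasonably work rather than documented behavior.

```r
# Hypothetical helper (not part of the package): list the
# (grp_var, enrich_var) pairs implied by the `cross` argument.
pair_vars <- function(grp_vars, enrich_vars, cross = TRUE) {
  if (cross) {
    # Every group variable against every enrich variable
    pairs <- expand.grid(
      grp_var = grp_vars,
      enrich_var = enrich_vars,
      stringsAsFactors = FALSE
    )
    # A variable cannot be compared with itself, so drop such pairs
    pairs <- pairs[pairs$grp_var != pairs$enrich_var, , drop = FALSE]
  } else {
    # Pair by position: first with first, second with second, ...
    # (assumes both vectors have the same length)
    stopifnot(length(grp_vars) == length(enrich_vars))
    pairs <- data.frame(
      grp_var = grp_vars,
      enrich_var = enrich_vars,
      stringsAsFactors = FALSE
    )
  }
  rownames(pairs) <- NULL
  pairs
}

crossed <- pair_vars(c("A", "B"), c("C", "D"), cross = TRUE)
paired <- pair_vars(c("A", "B"), c("C", "D"), cross = FALSE)
print(crossed) # four pairs: A and B each combined with C and D
print(paired)  # two pairs: A with C, B with D
```

The row order produced by `expand.grid()` may differ from the order in which the function itself reports combinations.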

Value

A `data.table` with the following columns:

grp_var: group variable name.
enrich_var: enrich variable (variable to be compared) name.
grp1: the first group name, i.e. one of the subgroups of the variable named in grp_var.
grp2: the remaining samples, marked as 'Rest'.
grp1_size: sample size for grp1.
grp1_pos_measure: for binary variable, it stores the proportion of positive cases in grp1; for continuous variable, it stores mean value.
grp2_size: sample size for grp2.
grp2_pos_measure: same as grp1_pos_measure but for grp2.
measure_observed: for binary variable, it stores odds ratio; for continuous variable, it stores scaled mean ratio.
measure_tested: only for binary variable, it stores estimated odds ratio and its 95% CI from fisher.test().
p_value: for binary variable, it stores the p value from fisher.test(); for continuous variable, it stores the p value from wilcox.test() or t.test().
type: one of "binary" and "continuous".
method: one of "fisher.test", "wilcox.test" and "t.test".
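
The columns above can be related to plain base R calls with a one-versus-rest comparison. This is only a sketch under stated assumptions: the toy vectors `grp`, `status` and `score` are made up for illustration, the package's internal code is not shown here and may differ, and the "scaled mean ratio" is deliberately not reproduced because its definition is not given in this page.

```r
set.seed(1)
# Toy data; all names are illustrative, not taken from the package
grp <- rep(c("A", "B", "C"), times = c(20, 15, 10))
status <- sample(c("P", "N", NA), length(grp), replace = TRUE)
score <- rnorm(length(grp))

in_grp1 <- grp == "A"                       # grp1 = "A", grp2 = "Rest"
positive <- !is.na(status) & status == "P"  # NA is counted as negative

# Binary variable: proportion of positive cases in each group
grp1_size <- sum(in_grp1)
grp2_size <- sum(!in_grp1)
grp1_prop <- mean(positive[in_grp1])
grp2_prop <- mean(positive[!in_grp1])

# 2 x 2 table (group by positive/negative) for Fisher's exact test
tab <- table(
  group = factor(in_grp1, levels = c(TRUE, FALSE), labels = c("A", "Rest")),
  positive = factor(positive, levels = c(TRUE, FALSE))
)
ft <- fisher.test(tab)
ft$estimate  # estimated odds ratio
ft$conf.int  # its 95% confidence interval
ft$p.value

# Continuous variable: group means and a two-sample test
grp1_mean <- mean(score[in_grp1])
grp2_mean <- mean(score[!in_grp1])
tt <- t.test(score[in_grp1], score[!in_grp1])
wt <- wilcox.test(score[in_grp1], score[!in_grp1])
c(t_test = tt$p.value, wilcox = wt$p.value)
```

Repeating this for every subgroup of a group variable yields one row per subgroup, which matches the shape of the table described above.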

Examples

set.seed(1234)
df <- dplyr::tibble(
  g1 = factor(abs(round(rnorm(99, 0, 1)))),
  g2 = rep(LETTERS[1:4], c(50, 40, 8, 1)),
  e1 = sample(c("P", "N"), 99, replace = TRUE),
  e2 = rnorm(99)
)

# str() prints its output directly, so wrapping it in print() only adds a stray NULL
str(df)
head(df)

# Compare g1:e1, g1:e2, g2:e1 and g2:e2
x1 <- group_enrichment(df, grp_vars = c("g1", "g2"), enrich_vars = c("e1", "e2"))
x1

# Only compare g1:e1, g2:e2
x2 <- group_enrichment(df,
  grp_vars = c("g1", "g2"),
  enrich_vars = c("e1", "e2"),
  co_method = "wilcox.test",
  cross = FALSE
)
x2

# Visualization
p1 <- show_group_enrichment(x1, fill_by_p_value = TRUE)
p1
p2 <- show_group_enrichment(x1, fill_by_p_value = FALSE)
p2
p3 <- show_group_enrichment(x1, return_list = TRUE)
p3
