This function keeps track of how many observations you exclude by using specific inclusion and exclusion criteria. It assumes that your criteria are logical filter statements, i.e. statements that you would pass to dplyr::filter() or to {data.table}.

Usage

exclusion_table(
  data = NULL,
  inclusion_criteria = NULL,
  exclusion_criteria = NULL,
  labels_inclusion = inclusion_criteria,
  labels_exclusion = exclusion_criteria,
  obj = NULL,
  keep_data = TRUE,
  id = NULL
)

Arguments

data

A dataframe on which the exclusions are to be performed.

inclusion_criteria

A character vector of logical expressions that are used for inclusions. All individuals who meet these criteria will be included. Specifically, observations for which the logical expression is FALSE will be excluded. Please keep in mind how your expression will handle NA values.

exclusion_criteria

A character vector of logical expressions that are used for exclusions. All observations that meet these criteria will be excluded. Specifically, observations for which the logical expression is TRUE will be excluded. Please keep in mind how your expression will handle NA values.

labels_inclusion

An optional character vector of labels that are used to label the steps of inclusions. The default labels are the logical expressions passed to inclusion_criteria.

labels_exclusion

An optional character vector of labels that are used to label the steps of exclusions. The default labels are the logical expressions passed to exclusion_criteria.

obj

A named list of objects that will be passed to the filtering call. The list can be accessed using obj$<name of object> in the filtering call.

keep_data

A logical value indicating whether the new dataset without the excluded observations should be returned. The default is TRUE.

id

Optional name of a unique ID variable in the dataset.
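Because both criteria arguments warn about NA values, it can help to check how a criterion string evaluates on missing data before passing it on. The following is a minimal base R sketch on a hypothetical toy data frame (the columns id and age are made up for illustration). It only shows how R evaluates the expressions; how exclusion_table() itself treats rows where a criterion evaluates to NA is not stated on this page, so spelling out the NA decision in the criterion is the safer option.

```r
# Toy data with one missing value (hypothetical, for illustration only)
df <- data.frame(id = 1:4, age = c(25, NA, 40, 17))

# A plain comparison propagates NA instead of returning TRUE or FALSE
raw <- eval(parse(text = "age >= 18"), envir = df)
print(raw)

# Make the NA decision explicit inside the criterion string itself
na_fails  <- eval(parse(text = "!is.na(age) & age >= 18"), envir = df)  # NA -> FALSE
na_passes <- eval(parse(text = "is.na(age) | age >= 18"), envir = df)   # NA -> TRUE
print(na_fails)
print(na_passes)
```

Either of the explicit forms can be used as an entry in inclusion_criteria or exclusion_criteria, depending on whether observations with a missing value should be kept or dropped.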

Value

exclusion_table returns an exl_tbl object, which is a list of data frames including the following information:

table_in

a data.frame including the number of observations excluded for each inclusion criterion listed in inclusion_criteria.

table_ex

a data.frame including the number of observations excluded for each exclusion criterion listed in exclusion_criteria.

dataset

a data.frame of the supplied dataset after applying all inclusion and exclusion criteria.

If id is supplied, an additional column is added to table_in and table_ex including a list of the ids that have been excluded from the dataset in each step.
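The step-by-step bookkeeping behind table_ex can be sketched in base R. This is not the package's implementation: count_exclusions() is a hypothetical helper written only for illustration, and treating NA results as "not excluded" is an assumption of this sketch. It applies each exclusion criterion in turn to whatever rows remain and records the counts before and after.

```r
# Hypothetical helper, NOT the package's own code: a sketch of sequential exclusion counting
count_exclusions <- function(data, exclusion_criteria, labels = exclusion_criteria) {
  steps <- vector("list", length(exclusion_criteria))
  current <- data
  for (i in seq_along(exclusion_criteria)) {
    # Evaluate the criterion string with the remaining data's columns in scope
    hit <- eval(parse(text = exclusion_criteria[i]), envir = current)
    # Assumption of this sketch: rows where the criterion is NA are kept
    hit <- !is.na(hit) & hit
    n_prior <- nrow(current)
    current <- current[!hit, , drop = FALSE]
    steps[[i]] <- data.frame(
      exclusion  = labels[i],
      n_prior    = n_prior,
      n_post     = nrow(current),
      n_excluded = n_prior - nrow(current)
    )
  }
  # Final row summarising all steps together
  total <- data.frame(
    exclusion  = "TOTAL",
    n_prior    = nrow(data),
    n_post     = nrow(current),
    n_excluded = nrow(data) - nrow(current)
  )
  list(table_ex = rbind(do.call(rbind, steps), total), dataset = current)
}

res <- count_exclusions(
  mtcars,
  c("disp <= 70 | disp >= 300", "as.character(gear) == '4'"),
  labels = c("First exclusion", "Second exclusion")
)
print(res$table_ex)
```

Each step is counted against the rows left over from the previous step, so the order of the criteria changes the per-step counts but not the total.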

Examples

#Example without using the obj argument
exclusion_table(
   data = mtcars,
   exclusion_criteria = c("disp <= 70 | disp >= 300",
                          "as.character(gear) == '4'"),
   labels_exclusion   = c("First exclusion",
                          "Second exclusion")
)
#> 
#> =============================================
#> Excluded the following observations:
#> =============================================
#> Exclusions based on EXCLUSION criteria
#> 
#>          exclusion n_prior n_post n_excluded
#> 1  First exclusion      32     21         11
#> 2 Second exclusion      21      9         12
#> 3            TOTAL      32      9         23
#> 
#> =============================================
#> 

#Example using the obj argument
my_selection <- c(8, 6)

exclusion_table(
  data = mtcars,
  exclusion_criteria = c("cyl %in% obj$my_selection"),
  labels_exclusion   = c("First exclusion"),
  obj = list(my_selection = my_selection)
)