Prepare data for UpSet plots

upset_data(
  data,
  intersect,
  min_size = 0,
  max_size = Inf,
  min_degree = 0,
  max_degree = Inf,
  n_intersections = NULL,
  keep_empty_groups = FALSE,
  warn_when_dropping_groups = FALSE,
  warn_when_converting = "auto",
  sort_sets = "descending",
  sort_intersections = "descending",
  sort_intersections_by = "cardinality",
  sort_ratio_numerator = "exclusive_intersection",
  sort_ratio_denominator = "inclusive_union",
  group_by = "degree",
  mode = "exclusive_intersection",
  size_columns_suffix = "_size",
  encode_sets = FALSE,
  max_combinations_datapoints_n = 10^10,
  intersections = "observed"
)

Arguments

data

a data frame that includes binary columns representing membership in classes (sets)

intersect

which columns of data should be used to compose the intersections

min_size

minimum number of observations in an intersection for it to be included

max_size

maximum number of observations in an intersection for it to be included

min_degree

minimum degree of an intersection for it to be included

max_degree

maximum degree of an intersection for it to be included
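To make the size and degree filters concrete, here is a minimal base R sketch. It is not the package's implementation: the toy `membership` table, the `-` separator, and the helper column names are hypothetical, and it assumes that "degree" means the number of sets taking part in an intersection and that size is counted per exact combination of sets.

```r
# Toy membership table: one row per observation, one logical column per set.
# (Hypothetical data, for illustration only.)
membership <- data.frame(
  A = c(TRUE,  TRUE,  TRUE,  FALSE, FALSE, TRUE),
  B = c(TRUE,  TRUE,  FALSE, TRUE,  FALSE, TRUE),
  C = c(FALSE, FALSE, FALSE, TRUE,  FALSE, TRUE)
)
sets <- c("A", "B", "C")

# Label each observation with the exact combination of sets it belongs to;
# observations that belong to no set get the empty label "".
combo <- apply(membership[, sets], 1, function(row) {
  paste(sets[row], collapse = "-")
})

# Size = number of observations per combination;
# degree = number of sets in the combination.
summary_table <- as.data.frame(table(combo), stringsAsFactors = FALSE)
names(summary_table) <- c("intersection", "size")
summary_table$degree <- lengths(
  strsplit(summary_table$intersection, "-", fixed = TRUE)
)

# Keep only intersections within the requested size and degree bounds.
min_size <- 1
max_size <- Inf
min_degree <- 2
max_degree <- 2
kept <- subset(
  summary_table,
  size >= min_size & size <= max_size &
    degree >= min_degree & degree <= max_degree
)
print(kept)
```

With these bounds only the two-set combinations `A-B` and `B-C` remain; the single-set, three-set, and empty combinations are filtered out.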

n_intersections

the exact number of intersections to be displayed; the n largest intersections that meet the size and degree criteria will be shown

keep_empty_groups

whether empty sets should be kept (including sets which are only empty after filtering by size)

warn_when_dropping_groups

whether a warning should be issued when empty sets are being removed

warn_when_converting

whether a warning should be issued when input is not boolean

sort_sets

whether to sort the rows in the intersection matrix (descending sort by default); one of: 'ascending', 'descending', FALSE

sort_intersections

whether to sort the columns in the intersection matrix (descending sort by default); one of: 'ascending', 'descending', FALSE

sort_intersections_by

the criterion to sort the intersections by; defaults to the size of the intersection (cardinality); one of: 'cardinality', 'degree', 'ratio', or any combination of these (e.g. c('degree', 'cardinality'))
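The effect of combining sort criteria can be sketched in base R. This is an illustration only: the `intersections` table is hypothetical, and it assumes that a combination such as c('degree', 'cardinality') is applied lexicographically (first key first, later keys breaking ties), which is not stated above.

```r
# Hypothetical summary of intersections: name, degree and size (cardinality).
intersections <- data.frame(
  name   = c("A", "B", "A-B", "A-C", "A-B-C"),
  degree = c(1, 1, 2, 2, 3),
  size   = c(40, 55, 12, 30, 7)
)

# Descending sort by cardinality only.
by_size <- intersections[order(-intersections$size), ]

# Descending sort by degree first, with ties broken by cardinality
# (assumed analogue of sort_intersections_by = c('degree', 'cardinality')).
by_degree_then_size <- intersections[
  order(-intersections$degree, -intersections$size),
]

print(by_size$name)
print(by_degree_then_size$name)
```

Sorting by size alone puts `B` first, while sorting by degree and then size puts `A-B-C` first and orders the two-set intersections by their sizes.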

sort_ratio_numerator

the mode used for the numerator when sorting by ratio (i.e. when sort_intersections_by includes 'ratio')

sort_ratio_denominator

the mode used for the denominator when sorting by ratio (i.e. when sort_intersections_by includes 'ratio')
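As a rough illustration of the default ratio, the base R sketch below computes one value by hand for the pair of sets A and B. The data are hypothetical, and the sketch assumes a reading of the mode names that is not spelled out here: "exclusive intersection" as observations that are in all of the listed sets and in no other set, and "inclusive union" as observations that are in at least one of the listed sets regardless of other memberships. See get_size_mode() for the authoritative definitions.

```r
# Hypothetical membership table with three sets.
membership <- data.frame(
  A = c(TRUE,  TRUE,  TRUE,  FALSE, TRUE),
  B = c(TRUE,  TRUE,  FALSE, TRUE,  TRUE),
  C = c(FALSE, FALSE, FALSE, FALSE, TRUE)
)

# Assumed exclusive intersection of A and B: in A and B, and not in C.
exclusive_intersection <- sum(membership$A & membership$B & !membership$C)

# Assumed inclusive union of A and B: in A or B, whatever else they are in.
inclusive_union <- sum(membership$A | membership$B)

# The sort key for this intersection when sorting by ratio with the defaults.
ratio <- exclusive_intersection / inclusive_union
print(ratio)
```

Under these assumptions, two of the five observations in A or B belong to exactly A and B, so the ratio is 0.4.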

group_by

the mode of grouping intersections; one of: 'degree', 'sets'

mode

region selection mode for sorting and trimming by size. See get_size_mode() for accepted values.

size_columns_suffix

suffix for the columns to store the sizes (adjust if conflicts with your data)

encode_sets

whether set names (columns in the input data) should be encoded as numbers; set to TRUE to work around the R limitation of max 10 kB for variable names in datasets with huge numbers of sets. Defaults to TRUE for upset() and FALSE for upset_data()

max_combinations_datapoints_n

a fail-safe limit preventing accidental use of intersections='all' with a high number of sets and observations

intersections

whether only the intersections present in the data ('observed', the default) or all intersections ('all') should be computed. Computing all intersections for a high number of sets is not computationally feasible; use min_degree and max_degree to narrow down the selection. Requesting all intersections is only useful for modes other than the default exclusive intersection. You can also provide a list with a custom selection of intersections (the order is respected when you set sort_intersections=FALSE)
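The feasibility concern can be checked with simple counting. The base R sketch below is only arithmetic on the number of possible set combinations and does not reproduce how the package applies max_combinations_datapoints_n; the figure of 20 sets and the degree bounds are arbitrary examples.

```r
# Number of sets passed in `intersect` (arbitrary example value).
n_sets <- 20

# Every non-empty combination of sets is a candidate intersection:
# sum over degrees k = 1..n of choose(n, k), which equals 2^n - 1.
all_intersections <- sum(choose(n_sets, 1:n_sets))

# Restricting the degree, e.g. min_degree = 2 and max_degree = 3,
# leaves only choose(n, 2) + choose(n, 3) candidates.
bounded_intersections <- sum(choose(n_sets, 2:3))

print(all_intersections)
print(bounded_intersections)
```

For 20 sets this is 1048575 candidate intersections without bounds versus 1330 with degrees limited to 2 and 3, which is why narrowing the degree range matters when all intersections are requested.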