Draws a split (left/right) violin for up to two groups at a single x-position, overlays per-group boxplots aligned with each half, and optionally annotates the plot with a p-value significance label (e.g., "*", "**", "***", "ns"). The y-axis title is taken from the variable's label attribute when available (via labelled or Hmisc); otherwise the column name is used.

Usage

PlotSplitViolin(
  data,
  Var,
  Group,
  covars = c("Age"),
  nonparametric = FALSE,
  annotation_text = NULL,
  show_ns = FALSE,
  left_group = NULL,
  color_palette = NULL,
  box_offset = 0.11,
  box_width = 0.15,
  star_from = c("quantile", "data_max", "whisker"),
  star_quantile = 0.995,
  star_pad = 0.03,
  headroom = 0.08,
  star_size = 6
)

Arguments

data

A data frame containing Var, Group, and any covariates.

Var

Numeric outcome column to plot. Accepts either an unquoted column name or a string (tidy evaluation).

Group

Grouping column. Must have 1 or 2 unique values. The function does not relabel or reorder factor levels; set the levels beforehand if needed.

covars

Character vector of covariate column names (default "Age"). Use character(0) for no covariates.

nonparametric

Logical. If FALSE (default), the p-value comes from a linear model with an emmeans contrast on Group. If TRUE, a Wilcoxon rank-sum test is used: with no covariates, it is applied to Var ~ Group; with covariates, Var is first residualized on the covariates (Group is left out of that model), and the test is applied to the residuals by Group. With only one group, no test is performed.

annotation_text

Optional label to draw (e.g., "*", "**", "***", "ns", "★"). If NULL, the label is derived from the computed p-value.

show_ns

Logical; if TRUE, show "ns" for non-significant results when annotation_text is NULL.

left_group

Optional. The Group value to place on the left half. If omitted, the first factor level (or first observed value) is used.

color_palette

Optional color specification. Supply a named vector keyed by Group values; if unnamed, values are interpreted as (left, right).

box_offset

Horizontal offset of the left/right boxplots from the center (the violin is drawn at a single position, x = 1). Small values (e.g., 0.10-0.14) place the boxes inside each half.

box_width

Boxplot width (default 0.15). For clean separation, keep box_width <= 2 * box_offset.

star_from

Where to anchor the vertical position of the significance label: one of "quantile" (default), "data_max", or "whisker".

star_quantile

Quantile used when star_from = "quantile" (default 0.995).

star_pad

Fraction of the y-range added above the anchor for the label (default 0.03).

headroom

Fraction of the y-range added to the top limit to avoid clipping (default 0.08).

star_size

Text size for the significance label (default 6).
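
The nonparametric path with covariates described under nonparametric can be sketched in base R. This is only an illustration of the residualize-then-test idea, not the package's implementation: residualized_wilcox is a hypothetical helper, and the data frame is simulated with column names borrowed from the examples.

```r
# Simulated data (assumption: column names mirror the examples on this page).
set.seed(1)
n <- 60
df <- data.frame(
  Age    = runif(n, 20, 70),
  STATUS = rep(c("Seronegative", "Seropositive"), each = n / 2)
)
df$Tryptophan <- 50 - 0.1 * df$Age + 3 * (df$STATUS == "Seropositive") + rnorm(n)

# Hypothetical helper: Wilcoxon rank-sum p-value, optionally after
# residualizing the outcome on covariates (Group is NOT in that model).
residualized_wilcox <- function(data, var, group, covars = character(0)) {
  if (length(covars) == 0) {
    # No covariates: test the raw outcome by group.
    return(wilcox.test(reformulate(group, response = var), data = data)$p.value)
  }
  # na.exclude keeps residuals aligned with the rows of `data`.
  fit <- lm(reformulate(covars, response = var), data = data,
            na.action = na.exclude)
  tmp <- data.frame(res = resid(fit), grp = data[[group]])
  wilcox.test(res ~ grp, data = tmp)$p.value
}

p_adj <- residualized_wilcox(df, "Tryptophan", "STATUS", covars = "Age")
p_raw <- residualized_wilcox(df, "Tryptophan", "STATUS")
```

Leaving Group out of the residualizing model means the group difference stays in the residuals, which is what the rank-sum test then compares.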

Value

A ggplot2 object.

Details

Geometry and alignment. Each half of the violin is drawn with gghalves::geom_half_violin(), then two explicit boxplot layers (left and right) are drawn at fixed x-positions (1 - box_offset and 1 + box_offset), so the boxes always align with their corresponding half and color (no dodging flips).
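
The fixed positions can be checked with a little arithmetic. The sketch below is not taken from the package; box_extents is a hypothetical helper that computes the x-range each box occupies and whether the two boxes stay apart, which is where the box_width <= 2 * box_offset guideline comes from.

```r
# Hypothetical helper (not part of the package): x-extent of each boxplot
# when the boxes are centred at 1 - box_offset and 1 + box_offset.
box_extents <- function(box_offset = 0.11, box_width = 0.15) {
  half  <- box_width / 2
  left  <- c(xmin = 1 - box_offset - half, xmax = 1 - box_offset + half)
  right <- c(xmin = 1 + box_offset - half, xmax = 1 + box_offset + half)
  list(
    left      = left,
    right     = right,
    # The boxes do not overlap when each half-width fits inside the offset.
    separated = box_width <= 2 * box_offset
  )
}

ext <- box_extents()            # documented defaults
ext$left
ext$right
ext$separated

box_extents(box_offset = 0.05)$separated  # boxes too wide for this offset
```

With the documented defaults the left box spans roughly 0.815 to 0.965 and the right box roughly 1.035 to 1.185, so a gap remains around the center line at x = 1.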

Significance label. When annotation_text is NULL, the p-value is computed as described under nonparametric and converted to "*", "**", "***", or (if show_ns = TRUE) "ns". The label is drawn at x = 1 via ggpubr::stat_pvalue_manual(remove.bracket = TRUE). Its vertical position is anchor + star_pad * range_y, where range_y is the y-range and the anchor is chosen by star_from. The top axis limit is expanded by headroom so that the label is not clipped.
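
As a rough illustration of the two steps above, the following base-R sketch maps a p-value to a label and computes the label height. Both helpers are hypothetical. The cut-offs (0.05, 0.01, 0.001) are the conventional ones and are an assumption, since this page does not state them; taking the "whisker" anchor as the highest upper whisker across groups is likewise an assumption.

```r
# Hypothetical helper; thresholds are ASSUMED conventional cut-offs.
p_to_label <- function(p, show_ns = FALSE) {
  if (is.na(p)) return("")
  if (p < 0.001) "***"
  else if (p < 0.01) "**"
  else if (p < 0.05) "*"
  else if (show_ns) "ns"
  else ""
}

# Hypothetical helper: anchor + star_pad * range_y.
label_y <- function(y, group,
                    star_from = c("quantile", "data_max", "whisker"),
                    star_quantile = 0.995, star_pad = 0.03) {
  star_from <- match.arg(star_from)
  keep  <- !is.na(y)
  y     <- y[keep]
  group <- group[keep]
  anchor <- switch(star_from,
    quantile = unname(quantile(y, star_quantile)),
    data_max = max(y),
    # Assumption: use the highest upper whisker over the groups.
    whisker  = max(tapply(y, group, function(v) boxplot.stats(v)$stats[5]))
  )
  anchor + star_pad * diff(range(y))
}

y <- c(1, 2, 3, 4, 20)
g <- c("a", "a", "b", "b", "b")
p_to_label(0.004)
label_y(y, g, star_from = "data_max")
label_y(y, g, star_from = "whisker")
```

Anchoring on a high quantile or on the whiskers rather than the data maximum keeps a single extreme outlier from pushing the label far above the violins.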

Axis label. If Var has a label attribute (from labelled or Hmisc), that text is used for the y-axis title; otherwise the column name is used.
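
Both labelled and Hmisc store a variable label in the column's "label" attribute, so the fallback can be sketched in base R. The helper name axis_title is hypothetical and the function's internal lookup may differ.

```r
# Hypothetical helper: use the "label" attribute if present, else the name.
axis_title <- function(data, var_name) {
  lab <- attr(data[[var_name]], "label", exact = TRUE)
  if (is.character(lab) && length(lab) == 1 && nzchar(lab)) lab else var_name
}

df <- data.frame(Tryptophan = c(41.2, 38.7, 45.1))
axis_title(df, "Tryptophan")   # no label set: falls back to the column name

attr(df$Tryptophan, "label") <- "Plasma tryptophan"
axis_title(df, "Tryptophan")   # label attribute is used
```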

See also

geom_half_violin, stat_pvalue_manual, emmeans

Examples

if (FALSE) { # \dontrun{
PlotSplitViolin(df_analysis, Var = Tryptophan, Group = STATUS)
PlotSplitViolin(
  df_analysis, Var = Tryptophan, Group = STATUS,
  left_group = "Seropositive",
  color_palette = c(Seropositive = "#E8A007", Seronegative = "#8C1A45")
)
PlotSplitViolin(df_analysis, Var = Tryptophan, Group = STATUS,
                covars = "Age", nonparametric = FALSE)
PlotSplitViolin(df_analysis, Var = Tryptophan, Group = STATUS,
                covars = "Age", nonparametric = TRUE, show_ns = TRUE)
} # }