CombineCodebooks
compares two codebook data frames (an old version and a new version), identifies added or removed variables and columns, detects cell-by-cell differences, and produces a combined codebook. For records without any differences, it returns a single "Combined" row; for records with differences, it returns both the "Old" and "New" rows, flagged as conflicts.
Arguments
- OldCodebook
A data.frame or tibble representing the old codebook. Each row must correspond to a single variable entry.
- NewCodebook
A data.frame or tibble representing the new codebook. Must have the same structure (column names) as
OldCodebook
, though extra or missing columns will be handled.- key
A string giving the name of the key column to identify variables (e.g. "Variable"). Defaults to "Variable".
Value
A list with elements:
added_variables
: character vector of keys present only inNewCodebook
.removed_variables
: character vector of keys present only inOldCodebook
.columns_added
: character vector of column names present only inNewCodebook
.columns_removed
: character vector of column names present only inOldCodebook
.value_differences
: tibble of cell-level differences (RowID
,Field
,OldValue
,NewValue
).combined_df
: tibble containing the merged codebook with versions and conflict flag.
Details
This function:
Builds a
combined_df
that includes all columns from both inputs:For non-conflicting records, a single row with
Version = "Combined"
.For conflicting records, two rows: one with
Version = "Old"
and one withVersion = "New"
.
Flags each row with
CHECKFORCONFLICTS
(0 for combined, 1 for old/new conflict rows).