Audit a merged dataset against the two source datasets used to create it.
This function is designed to catch common merge problems before analysis,
including incompatible key types, duplicate key combinations, unexpected
overlapping variables, unresolved .x / .y variables, and value conflicts
between duplicated variables.
Value
A list with merge validation results, including:
- SummaryText
A plain-text summary of the merge audit.
- ReadyForAnalysis
Logical value indicating whether major merge-integrity issues were absent.
- Summary
One-row tibble with core merge metrics.
- Fingerprint
Tibble comparing rows, columns, and unique key combinations across datasets.
- KeyTypes
Tibble showing key variable classes before coercion.
- Checks
Tibble summarizing validation checks and pass/warning/fail status.
- SuggestedActions
Tibble with suggested review steps.
- Relationship
Tibble describing key uniqueness in LeftData, RightData, and MergedData.
- IDCoverage
List containing Matching, LeftOnly, and RightOnly key combinations.
- DuplicateKeys
List containing duplicated key rows from Left, Right, and Merged datasets.
- OverlappingVariables
Tibble of variables present in both source datasets but not listed as keys.
- PotentialMergeRisk
Tibble of overlap variables that could have affected an unspecified dplyr join.
- JoinAudit
Tibble showing variables present in both source datasets and whether they were specified as merge keys.
- OverlapAudit
Tibble auditing all source variables by source presence and key status.
- UnresolvedDuplicateVariables
Tibble of
.x/.yduplicate variable pairs still present in MergedData.- DuplicateVariables
Tibble with agreement, conflict counts, and classes for duplicated variables.
- SuspiciousConflicts
Subset of duplicated variables with low agreement or mismatched classes.
- VariableConflicts
Long-format tibble of record-level value conflicts.
