A sample dataset adapted from the AppliedPredictiveModeling R package, with simulated variables, noise, and missingness added. The original dataset (Craig-Schapiro et al., 2011) comes from a clinical study of 333 patients in which laboratory measurements are used to predict which subjects are most likely to develop cognitive impairment, such as Alzheimer's disease.