Winsorizes a numeric vector: values below a lower threshold or above an upper threshold are replaced by that threshold. The thresholds are computed from the data, based on either the standard deviation (appropriate when the data are approximately normal) or the interquartile range (more robust to skewed distributions).

Usage

windsorize(Data, sdlim = 2.5, iqrlim = 1.5, method = "sd")

Arguments

Data

A numeric vector to be winsorized.

sdlim

Numeric. Number of standard deviations used as the limit when method = "sd" (default 2.5).

iqrlim

Numeric. Multiplier for the IQR when method = "iqr" (default 1.5).

method

Character string specifying the method: "sd" (default) or "iqr".

Value

A numeric vector of the same length as Data, with values beyond the thresholds replaced by the nearest threshold.
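
The thresholding described above can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes the "sd" limits are the mean plus or minus sdlim standard deviations, and that the "iqr" limits are Tukey-style fences (first quartile minus iqrlim times the IQR, third quartile plus iqrlim times the IQR). The name winsorize_sketch and the na.rm handling are hypothetical.

```r
# Hypothetical re-implementation for illustration only; not the package source.
winsorize_sketch <- function(Data, sdlim = 2.5, iqrlim = 1.5,
                             method = c("sd", "iqr")) {
  stopifnot(is.numeric(Data))
  method <- match.arg(method)

  if (method == "sd") {
    # Assumed rule: mean +/- sdlim standard deviations
    centre <- mean(Data, na.rm = TRUE)
    spread <- stats::sd(Data, na.rm = TRUE)
    lower <- centre - sdlim * spread
    upper <- centre + sdlim * spread
  } else {
    # Assumed rule: Tukey-style fences around the quartiles
    q <- stats::quantile(Data, probs = c(0.25, 0.75),
                         na.rm = TRUE, names = FALSE)
    iqr <- q[2] - q[1]
    lower <- q[1] - iqrlim * iqr
    upper <- q[2] + iqrlim * iqr
  }

  # Cap values outside [lower, upper]; values inside are left unchanged
  pmin(pmax(Data, lower), upper)
}

y <- c(1, 2, 3, 4, 5, 100)
winsorize_sketch(y, method = "iqr")
winsorize_sketch(y, method = "sd")
```

Under these assumptions the "iqr" call caps 100 at 8.5, while the "sd" call returns y unchanged: the single outlier inflates the standard deviation enough that the upper limit lies above 100. This is one reason the IQR method may be preferable for small or heavily contaminated samples.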

Examples

x <- c(rnorm(100), 10, 15, -12)

# SD-based winsorization
windsorize(x, method = "sd", sdlim = 2.5)
#>   [1] -1.400043517  0.255317055 -2.437263611 -0.005571287  0.621552721
#>   [6]  1.148411606 -1.821817661 -0.247325302 -0.244199607 -0.282705449
#>  [11] -0.553699384  0.628982042  2.065024895 -1.630989402  0.512426950
#>  [16] -1.863011492 -0.522012515 -0.052601910  0.542996343 -0.914074827
#>  [21]  0.468154420  0.362951256 -1.304543545  0.737776321  1.888504929
#>  [26] -0.097445104 -0.935847354 -0.015950311 -0.826788954 -1.512399651
#>  [31]  0.935363190  0.176488611  0.243685465  1.623548883  0.112038083
#>  [36] -0.133997013 -1.910087468 -0.279237242 -0.313445978  1.067307879
#>  [41]  0.070034850 -0.639123324 -0.049964899 -0.251483443  0.444797116
#>  [46]  2.755417575  0.046531380  0.577709069  0.118194874 -1.911720491
#>  [51]  0.862086482 -0.243236740 -0.206087195  0.019177592  0.029560754
#>  [56]  0.549827542 -2.274114857  2.682557184 -0.361221255  0.213355750
#>  [61]  1.074345882 -0.665088249  1.113952419 -0.245896412 -1.177563309
#>  [66] -0.975850616  1.065057320  0.131670635  0.488628809 -1.699450568
#>  [71] -1.470736306  0.284150344  1.337320413  0.236696283  1.318293384
#>  [76]  0.523909788  0.606748047 -0.109935672  0.172181715 -0.090327287
#>  [81]  1.924343341  1.298392759  0.748791268  0.556224329 -0.548257264
#>  [86]  1.110534893 -2.612334333 -0.155693776  0.433889790 -0.381951112
#>  [91]  0.424187575  1.063101996  1.048712620 -0.038102895  0.486148920
#>  [96]  1.672882611 -0.354361164  0.946347886  1.316826356 -0.296640025
#> [101]  6.130267814  6.130267814 -5.740385864

# IQR-based winsorization
windsorize(x, method = "iqr", iqrlim = 1.5)
#>   [1] -1.400043517  0.255317055 -1.919546796 -0.005571287  0.621552721
#>   [6]  1.148411606 -1.821817661 -0.247325302 -0.244199607 -0.282705449
#>  [11] -0.553699384  0.628982042  2.065024895 -1.630989402  0.512426950
#>  [16] -1.863011492 -0.522012515 -0.052601910  0.542996343 -0.914074827
#>  [21]  0.468154420  0.362951256 -1.304543545  0.737776321  1.888504929
#>  [26] -0.097445104 -0.935847354 -0.015950311 -0.826788954 -1.512399651
#>  [31]  0.935363190  0.176488611  0.243685465  1.623548883  0.112038083
#>  [36] -0.133997013 -1.910087468 -0.279237242 -0.313445978  1.067307879
#>  [41]  0.070034850 -0.639123324 -0.049964899 -0.251483443  0.444797116
#>  [46]  2.245134768  0.046531380  0.577709069  0.118194874 -1.911720491
#>  [51]  0.862086482 -0.243236740 -0.206087195  0.019177592  0.029560754
#>  [56]  0.549827542 -1.919546796  2.245134768 -0.361221255  0.213355750
#>  [61]  1.074345882 -0.665088249  1.113952419 -0.245896412 -1.177563309
#>  [66] -0.975850616  1.065057320  0.131670635  0.488628809 -1.699450568
#>  [71] -1.470736306  0.284150344  1.337320413  0.236696283  1.318293384
#>  [76]  0.523909788  0.606748047 -0.109935672  0.172181715 -0.090327287
#>  [81]  1.924343341  1.298392759  0.748791268  0.556224329 -0.548257264
#>  [86]  1.110534893 -1.919546796 -0.155693776  0.433889790 -0.381951112
#>  [91]  0.424187575  1.063101996  1.048712620 -0.038102895  0.486148920
#>  [96]  1.672882611 -0.354361164  0.946347886  1.316826356 -0.296640025
#> [101]  2.245134768  2.245134768 -1.919546796