Converts ordered variables to numeric and adds deterministic uniform noise. See Details.

equi_jitter(x)

Arguments

x

observations; the function does nothing if x is already numeric.

Details

Jittering makes discrete variables continuous by adding noise. This simple trick makes it possible to consistently estimate densities with tools designed for the continuous case (see Nagler, 2018a, 2018b). The drawback is that the estimates are random, and the noise may, by chance, make an estimate worse.
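To make the random variant concrete, here is a minimal sketch. It assumes the noise is uniform on \([-0.5, 0.5]\), which is only one possible choice, and `random_jitter_sketch` is a hypothetical helper written for illustration, not a function of the package.

```r
# Hypothetical helper for illustration; not part of the package.
random_jitter_sketch <- function(x) {
  x_num <- as.numeric(x)  # factor levels become the integer codes 1, 2, ...
  # Assumption: independent uniform noise on [-0.5, 0.5] for every observation.
  x_num + runif(length(x_num), min = -0.5, max = 0.5)
}

x <- factor(c(0, 1, 1, 0, 1))
random_jitter_sketch(x)  # a different result on every call
random_jitter_sketch(x)
```

Two calls on the same data give different values, which is the randomness in the estimates mentioned above.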

Here, we add a form of deterministic noise that makes estimators well behaved. Tied occurrences of a factor level are spread out uniformly (i.e., equidistantly) on the interval \([-0.5, 0.5]\). This is similar to adding random noise that is uniformly distributed, conditional on the observed outcome. Integrating over the outcome, one can check that the unconditional noise distribution is also uniform on \([-0.5, 0.5]\).
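The equidistant spreading can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package's source. It assumes that a group of \(n\) tied values receives the offsets \(-0.5 + i/(n + 1)\), \(i = 1, \dots, n\), in order of appearance; this rule is inferred from the output shown in the Examples section, and missing values are not handled.

```r
# Hypothetical re-implementation for illustration; not the package's source.
equi_jitter_sketch <- function(x) {
  if (is.numeric(x)) return(x)  # numeric input is returned unchanged
  x_num <- as.numeric(x)        # factor levels become the integer codes 1, 2, ...
  # Within each group of n tied values, use the n interior points of an
  # equidistant grid on [-0.5, 0.5]: -0.5 + i / (n + 1) for i = 1, ..., n.
  noise <- ave(
    x_num, x_num,
    FUN = function(v) -0.5 + seq_along(v) / (length(v) + 1)
  )
  x_num + noise
}

x <- factor(c(1, 1, 0, 0, 0, 0, 1, 0, 1, 0))
equi_jitter_sketch(x)
```

Unlike random jittering, repeated calls on the same data return identical values. For this input, the four occurrences of level "1" are mapped to 1.7, 1.9, 2.1, 2.3, and the six occurrences of level "0" are spread between roughly 0.643 and 1.357.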

Asymptotically, the deterministic jittering variant is equivalent to the random one.

References

Nagler, T. (2018a). A generic approach to nonparametric function estimation with mixed data. Statistics & Probability Letters, 137:326–330, arXiv:1704.07457.

Nagler, T. (2018b). Asymptotic analysis of the jittering kernel density estimator. Mathematical Methods of Statistics, in press, arXiv:1705.05431.

Examples

x <- as.factor(rbinom(10, 1, 0.5))
equi_jitter(x)
#>  [1] 1.7000000 1.9000000 0.6428571 0.7857143 0.9285714 1.0714286 2.1000000
#>  [8] 1.2142857 2.3000000 1.3571429