Applies the continuous convolution trick, i.e., adds continuous noise to all discrete variables. To have a variable treated as discrete, declare it as ordered() (the input is passed to expand_as_numeric()).

cont_conv(x, theta = 0, nu = 5, quasi = TRUE)

Arguments

x

data; numeric matrix or data frame.

theta

scale parameter of the USB distribution (see dusb()).

nu

smoothness parameter of the USB distribution (see dusb()). The estimator uses the Epanechnikov kernel for smoothing and the USB distribution for continuous convolution (the default parameters correspond to the \(U[-0.5, 0.5]\) distribution).

quasi

logical indicating whether quasi-random numbers should be used (qrng::ghalton()); only works for theta = 0.

Value

A data frame with noise added to each discrete variable (ordered columns).

Details

The USB distribution (dusb()) is used as the noise distribution. Discrete variables are assumed to be integer-valued.
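As a rough illustration of the idea, the following base R sketch adds \(U[-0.5, 0.5]\) noise to an integer-valued sample, which is the noise distribution the default parameters are said to correspond to. The helper add_uniform_noise() is hypothetical and is not the implementation of cont_conv(); it ignores theta, nu, and quasi entirely.

```r
# Hypothetical helper (not part of the package): jitter integer-valued data
# with independent U[-0.5, 0.5] noise, mimicking the default noise
# distribution described above.
add_uniform_noise <- function(z) {
    # the sketch assumes integer-valued input, as stated in Details
    stopifnot(is.numeric(z), all(z == round(z)))
    z + runif(length(z), min = -0.5, max = 0.5)
}

set.seed(1)
z <- rpois(10, 1)                # integer-valued discrete sample
z_noisy <- add_uniform_noise(z)  # continuous version of the same sample

# Check that rounding the noisy values gives back the original integers.
all(round(z_noisy) == z)
```

Because the noise stays strictly inside an interval of width one around each integer, the original values can be recovered by rounding, so the jittered data carry the same information as the discrete sample.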

References

Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457

Examples

# dummy data with discrete variables
dat <- data.frame(
    F1 = factor(rbinom(10, 4, 0.1), 0:4),
    Z1 = ordered(rbinom(10, 5, 0.5), 0:5),
    Z2 = ordered(rpois(10, 1), 0:10),
    X1 = rnorm(10),
    X2 = rexp(10)
)
pairs(dat)
pairs(expand_as_numeric(dat)) # expanded variables without noise
pairs(cont_conv(dat)) # continuously convoluted data