Performs k-fold cross validation for a jdify object.

cv_jdify(formula, data, jd_method = "cctools", folds = 10, cores = 1, ...)

Arguments

formula

an object of class "formula", specified as in stats::lm().

data

matrix, data frame, list or environment (or object coercible by base::as.data.frame()) containing the variables in the model.

jd_method

an object of class "jd_method" defining the method for joint density estimation, see jd_method().

folds

number of folds (the k in k-fold cross validation).

cores

number of cores for parallelized cross validation (based on foreach::foreach()).

...

further arguments passed to fit_fun().

Value

A list with elements

  • folds1, ..., foldsk: for each fold, the fitted model ($fit), estimated conditional probabilities ($probs), and indexes for training and test data ($train_index, $test_index).

  • cv_probs: aggregated out-of-sample probabilities, in the same order as the original data.
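
The shape of this return value can be illustrated without jdify itself. The following is a minimal base R sketch, not the package's implementation: it assumes a random fold assignment and uses a placeholder "model" (the class frequency in the training data) in place of a fitted jdify object. The names n, k, y, fold_id and res are illustrative only, and the splitting scheme actually used by cv_jdify() may differ.

```r
set.seed(1)
n <- 100
k <- 10
y <- rbinom(n, 1, 0.5)

# Assign each observation to one of k folds at random (assumed scheme).
fold_id <- sample(rep(seq_len(k), length.out = n))

res <- lapply(seq_len(k), function(i) {
  test_index  <- which(fold_id == i)
  train_index <- which(fold_id != i)
  # Placeholder "fit": proportion of class 1 in the training data.
  fit <- mean(y[train_index])
  # Out-of-sample probabilities of class 0 and class 1 for the test rows.
  probs <- cbind("0" = rep(1 - fit, length(test_index)),
                 "1" = rep(fit, length(test_index)))
  list(fit = fit, probs = probs,
       train_index = train_index, test_index = test_index)
})
names(res) <- paste0("folds", seq_len(k))

# Aggregate the per-fold probabilities back into the original row order.
cv_probs <- matrix(NA_real_, n, 2, dimnames = list(NULL, c("0", "1")))
for (f in res) cv_probs[f$test_index, ] <- f$probs
res$cv_probs <- cv_probs
```

Because every row belongs to exactly one test fold, each row of cv_probs is filled once, by a model that never saw that row during fitting.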

References

Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457

Examples

# simulate training and test data
dat <- data.frame(
    cl = as.factor(rbinom(100, 1, 0.5)),
    x1 = rnorm(100),
    x2 = ordered(rbinom(10, 1, 0.3), 0:1)
)

cv <- cv_jdify(cl ~ x1 + x2, dat)
probs <- cv$cv_probs
assess_clsfyr(probs[, 1], dat[, 1] == 0, measure = c("ACC", "F1"))
#>    threshold measure      value
#> 1        0.0     ACC 0.60000000
#> 2        0.1     ACC 0.59000000
#> 3        0.2     ACC 0.59000000
#> 4        0.3     ACC 0.56000000
#> 5        0.4     ACC 0.55000000
#> 6        0.5     ACC 0.54000000
#> 7        0.6     ACC 0.51000000
#> 8        0.7     ACC 0.42000000
#> 9        0.8     ACC 0.41000000
#> 10       0.9     ACC 0.40000000
#> 11       1.0     ACC 0.38000000
#> 12       0.0      F1 0.75000000
#> 13       0.1      F1 0.74213836
#> 14       0.2      F1 0.74213836
#> 15       0.3      F1 0.71794872
#> 16       0.4      F1 0.70967742
#> 17       0.5      F1 0.69333333
#> 18       0.6      F1 0.62595420
#> 19       0.7      F1 0.35555556
#> 20       0.8      F1 0.09230769
#> 21       0.9      F1 0.06250000
#> 22       1.0      F1 0.00000000