Calculates performance measures for a classifier. Assumes there are two
classes, and that the first of levels(true_cls) is the one to be predicted
(the "positive").
assess_clsfyr(score, true_cls, measure = "ACC", threshold = seq(0, 1, by = 0.1))
| Argument | Description |
|---|---|
| score | probabilities or scores for the target class ("positive"); scores are assumed to be in \([0, 1]\), and high scores correspond to a high probability of the target class. |
| true_cls | vector of indicators for the target class: |
| measure | a character vector of performance measures to be calculated, see Details. |
| threshold | threshold for prediction, see |
A data.frame where each column corresponds to one value of
threshold. The corresponding values can be found with attr(result, "threshold").
Valid options for measure are
"TP": number of true positives,
"FP": number of false positives,
"TN": number of true negatives,
"FN": number of false negatives,
"TPR", "sensitivity", "recall": true positive rate (\(TP / P\)),
"FPR", "fall-out": false positive rate (\(FP / N\)),
"TNR", "specificity": true negative rate (\(TN / N\)),
"FNR": false negative rate (\(FN / P\)),
"PRC", "PPV": precision/positive predictive value (\(TP / (TP + FP)\)),
"NPV": negative predictive value (\(TN / (TN + FN)\)),
"FDR": false discovery rate (\(FP / (TP + FP)\)),
"ACC", "accuracy": accuracy (\((TP + TN) / (P + N)\)),
"F1": F1 score (\(2 * TP / (2 * TP + FP + FN)\)),
"MCC": Matthews correlation coefficient
$$\frac{TP * TN - FP * FN}{[(TP + FP) * (TP + FN) * (TN + FP) *
(TN + FN)]^{1/2}},$$
"informedness": informedness (\(TP / P + TN / N - 1\)),
"markedness": markedness (\(TP / (TP + FP) + TN / (TN + FN) - 1\)),
"AUC": area under the curve (must be in first position)
where P and N are the number of positives and negatives, respectively.
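The formulas above can be checked by hand on a small example. The following is a minimal sketch in base R, not the package's implementation: the helpers `confusion_counts()` and `measures_from_counts()` are hypothetical names, the toy data are made up, and the rule "predict positive when score > threshold" is an assumption about how the threshold is applied.

```r
# Hypothetical helper, not part of the package: confusion counts at one threshold.
# Assumption: an observation is predicted "positive" when score > threshold.
confusion_counts <- function(score, true_cls, threshold) {
  pred <- score > threshold
  truth <- as.logical(true_cls)
  c(
    TP = sum(pred & truth),
    FP = sum(pred & !truth),
    TN = sum(!pred & !truth),
    FN = sum(!pred & truth)
  )
}

# Hypothetical helper: a few of the measures listed above, from the counts.
measures_from_counts <- function(cnt) {
  # convert to double to avoid integer overflow in the MCC denominator
  TP <- as.numeric(cnt[["TP"]])
  FP <- as.numeric(cnt[["FP"]])
  TN <- as.numeric(cnt[["TN"]])
  FN <- as.numeric(cnt[["FN"]])
  P <- TP + FN  # number of positives
  N <- TN + FP  # number of negatives
  c(
    TPR = TP / P,
    FPR = FP / N,
    ACC = (TP + TN) / (P + N),
    F1  = 2 * TP / (2 * TP + FP + FN),
    MCC = (TP * TN - FP * FN) /
      sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)),
    informedness = TP / P + TN / N - 1
  )
}

# Made-up toy data for illustration only.
score    <- c(0.9, 0.8, 0.7, 0.4, 0.3, 0.2)
true_cls <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE)

cnt <- confusion_counts(score, true_cls, threshold = 0.5)
res <- measures_from_counts(cnt)
print(cnt)  # TP = 2, FP = 1, TN = 2, FN = 1
print(res)
```

With these toy values, P = N = 3, so the accuracy is 4/6 and the MCC is 3 / sqrt(81) = 1/3. If a ratio has a zero denominator (for example, no predicted positives), this sketch returns NaN; how the package handles that case is not stated here.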
# simulate training and test data
dat <- data.frame(
  cl = as.factor(rbinom(10, 1, 0.5)),
  x1 = rnorm(10),
  x2 = rbinom(10, 1, 0.3)
)
model <- jdify(cl ~ x1 + x2, data = dat)      # joint density fit
probs <- predict(model, dat, what = "probs")  # conditional probabilities

# calculate performance measures
assess_clsfyr(probs[, 1], dat[, 1] == 0, measure = c("ACC", "F1"))
#>    threshold measure     value
#> 1        0.0     ACC 0.6000000
#> 2        0.1     ACC 0.6000000
#> 3        0.2     ACC 0.6000000
#> 4        0.3     ACC 0.6000000
#> 5        0.4     ACC 0.6000000
#> 6        0.5     ACC 0.8000000
#> 7        0.6     ACC 0.6000000
#> 8        0.7     ACC 0.5000000
#> 9        0.8     ACC 0.4000000
#> 10       0.9     ACC 0.4000000
#> 11       1.0     ACC 0.4000000
#> 12       0.0      F1 0.7500000
#> 13       0.1      F1 0.7500000
#> 14       0.2      F1 0.7500000
#> 15       0.3      F1 0.7500000
#> 16       0.4      F1 0.7500000
#> 17       0.5      F1 0.8571429
#> 18       0.6      F1 0.6000000
#> 19       0.7      F1 0.2857143
#> 20       0.8      F1 0.0000000
#> 21       0.9      F1 0.0000000
#> 22       1.0      F1 0.0000000

# calculate area under the curve
FPR <- assess_clsfyr(probs[, 1], dat[, 1] == 0, measure = c("FPR"))$value
TPR <- assess_clsfyr(probs[, 1], dat[, 1] == 0, measure = c("TPR"))$value
get_auc(data.frame(FPR = FPR, TPR = TPR))
#>      [,1]
#> [1,] 0.75
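For intuition about the last step, the area under a ROC curve given as paired FPR/TPR values can be approximated with the trapezoidal rule. The sketch below is an illustration in base R only: `trapezoid_auc()` is a hypothetical helper, the ROC points are made up, and nothing here is claimed to match how `get_auc()` computes its result.

```r
# Hypothetical helper, not the package's get_auc(): trapezoidal area under
# a ROC curve given as paired FPR/TPR values.
trapezoid_auc <- function(fpr, tpr) {
  # add the end points (0, 0) and (1, 1), then sort by FPR (ties by TPR)
  fpr <- c(0, fpr, 1)
  tpr <- c(0, tpr, 1)
  ord <- order(fpr, tpr)
  fpr <- fpr[ord]
  tpr <- tpr[ord]
  n <- length(fpr)
  # sum of trapezoid areas between consecutive points
  sum(diff(fpr) * (tpr[-n] + tpr[-1]) / 2)
}

# Made-up ROC points for illustration only.
fpr <- c(1, 0.5, 0)
tpr <- c(1, 1, 0.5)
auc <- trapezoid_auc(fpr, tpr)
print(auc)  # 0.875
```

Adding the end points and sorting makes the result independent of the order in which thresholds were evaluated; duplicated points contribute zero width and do not change the area.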