The .632 estimator of the log loss error rate is calculated for a given classifier. The .632+ estimator is an extension that reduces the overfitting bias of the .632 estimator and is computed by default.
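As a sketch of the underlying formulas (following Efron and Tibshirani, 1997; notation here is illustrative, not taken from the package source), the .632 estimator combines the resubstitution (training) error with the leave-one-out bootstrap error:

```latex
\widehat{\mathrm{Err}}^{(.632)}
  = 0.368\,\overline{\mathrm{err}} + 0.632\,\widehat{\mathrm{Err}}^{(1)}
```

where \(\overline{\mathrm{err}}\) is the training error and \(\widehat{\mathrm{Err}}^{(1)}\) is the leave-one-out bootstrap (out-of-bag) error. The .632+ estimator replaces the fixed weights with a data-driven weight based on the relative overfitting rate:

```latex
\widehat{\mathrm{Err}}^{(.632+)}
  = (1-\hat{w})\,\overline{\mathrm{err}} + \hat{w}\,\widehat{\mathrm{Err}}^{(1)},
\qquad
\hat{w} = \frac{0.632}{1 - 0.368\,\hat{R}},
\qquad
\hat{R} = \frac{\widehat{\mathrm{Err}}^{(1)} - \overline{\mathrm{err}}}
               {\hat{\gamma} - \overline{\mathrm{err}}}
```

with \(\hat{\gamma}\) the no-information error rate obtained by scoring all class/prediction pairings. When there is no overfitting (\(\hat{R}=0\)), the .632+ estimate reduces to the .632 estimate.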
error_632(data, class, algorithm, pred, test.id, train.id, plus = TRUE)
data: data frame with rows as samples, columns as features
class: true/reference class vector used for supervised learning
algorithm: character string for classifier. See splendid for possible options.
pred: vector of OOB predictions using the same classifier as algorithm.
test.id: vector of test set indices for each bootstrap replicate
train.id: vector of training set indices for each bootstrap replicate
plus: logical; if TRUE (default), the .632+ estimator is calculated. Otherwise, the .632 estimator is calculated.
Returns the .632(+) log loss error rate.
This function is intended to be used internally by splendid_model.
Friedman, Jerome, Trevor Hastie, and Robert Tibshirani (2001), The Elements of Statistical Learning. Vol. 1. New York: Springer Series in Statistics.
Efron, Bradley and Tibshirani, Robert (1997), "Improvements on Cross-Validation: The .632+ Bootstrap Method," Journal of the American Statistical Association, 92(438), 548-560.
if (FALSE) {
data(hgsc)
class <- as.factor(attr(hgsc, "class.true"))
set.seed(1)
# Generate 5 bootstrap training sets and their out-of-bag test sets
train.id <- boot_train(data = hgsc, class = class, n = 5)
test.id <- boot_test(train.id = train.id)
# Fit a classifier on each training set, then predict on the OOB samples
mod <- purrr::map(train.id, ~ classification(hgsc[., ], class[.], "xgboost"))
pred <- purrr::pmap(list(mod = mod, test.id = test.id, train.id = train.id),
                    prediction, data = hgsc, class = class)
# Compare the .632 and .632+ error estimates
error_632(hgsc, class, "xgboost", pred, test.id, train.id, plus = FALSE)
error_632(hgsc, class, "xgboost", pred, test.id, train.id, plus = TRUE)
}