Split data into training and test sets, optionally standardizing by training set centers and standard deviations
split_data(data, test.id = NULL, train.id = NULL, standardize = FALSE)
data: data frame with rows as samples, columns as features
test.id: integer vector of indices for test set. If NULL (default), all samples are used.
train.id: integer vector of indices for training set. If NULL (default), all samples are used.
standardize: logical; if TRUE, the training sets are standardized feature-wise to have mean zero and unit variance. The test sets are standardized using the centers and standard deviations of the corresponding training sets, not their own.
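
The behavior described above can be sketched roughly as follows. This is an illustrative re-implementation, not the package source: the name split_data_sketch, the list(train, test) return shape, and the assumption that every column is numeric with non-zero variance are all hypothetical.

```r
# Hypothetical sketch of the documented behavior; not the actual implementation.
# Assumes all columns of `data` are numeric with non-zero variance.
split_data_sketch <- function(data, test.id = NULL, train.id = NULL,
                              standardize = FALSE) {
  # NULL indices mean "use every sample", as the argument descriptions state
  if (is.null(train.id)) train.id <- seq_len(nrow(data))
  if (is.null(test.id)) test.id <- seq_len(nrow(data))

  train <- data[train.id, , drop = FALSE]
  test <- data[test.id, , drop = FALSE]

  if (standardize) {
    # Centers and standard deviations are computed from the training set only
    centers <- vapply(train, mean, numeric(1))
    sds <- vapply(train, sd, numeric(1))
    # The same training-set statistics are applied to both sets
    train <- as.data.frame(scale(train, center = centers, scale = sds))
    test <- as.data.frame(scale(test, center = centers, scale = sds))
  }

  # Return shape is an assumption made for this sketch
  list(train = train, test = test)
}

set.seed(1)
dat <- data.frame(x = rnorm(10, mean = 5, sd = 2), y = rnorm(10, mean = -3))
out <- split_data_sketch(dat, test.id = 8:10, train.id = 1:7,
                         standardize = TRUE)
colMeans(out$train)      # approximately 0 for each feature
apply(out$train, 2, sd)  # approximately 1 for each feature
```

Reusing the training-set centers and standard deviations on the test set keeps information from the test samples out of the preprocessing step, so the test set will generally not have exactly zero mean or unit variance.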