This function estimates hyperparameters for xgboost using Bayesian optimization.
xgb_opt(train_data, train_label, test_data, test_label, objectfun, evalmetric,
  eta_range = c(0.1, 1L), max_depth_range = c(4L, 6L),
  nrounds_range = c(70, 160L), subsample_range = c(0.1, 1L),
  bytree_range = c(0.4, 1L), init_points = 4, n_iter = 10, acq = "ei",
  kappa = 2.576, eps = 0,
  optkernel = list(type = "exponential", power = 2), classes = NULL)
| Argument | Description |
|---|---|
| train_data | A data frame for training of xgboost |
| train_label | The column in the training data that holds the class to predict |
| test_data | A data frame for testing of xgboost |
| test_label | The column in the test data that holds the class to predict |
| objectfun | Specify the learning task and the corresponding learning objective |
| evalmetric | Evaluation metric for the validation data. Users can pass a self-defined function to it. Default: the metric is assigned according to the objective (rmse for regression, error for classification, and mean average precision for ranking). |
| eta_range | The range of eta |
| max_depth_range | The range of max_depth |
| nrounds_range | The range of nrounds |
| subsample_range | The range of subsample rate |
| bytree_range | The range of colsample_bytree rate |
| init_points | Number of randomly chosen points at which to sample the target function before Bayesian optimization fits the Gaussian Process. |
| n_iter | Total number of times the Bayesian optimization is to be repeated. |
| acq | Acquisition function type to be used. Can be "ucb", "ei" or "poi". |
| kappa | Tunable parameter kappa of the GP Upper Confidence Bound, which balances exploitation against exploration. Increasing kappa makes the search favor exploration. |
| eps | Tunable parameter epsilon of Expected Improvement and Probability of Improvement, which balances exploitation against exploration. Increasing epsilon spreads the sampled hyperparameters more widely across the whole range. |
| optkernel | Kernel (also known as the correlation function) for the underlying Gaussian Process. This parameter should be a list that specifies the type of correlation function along with the smoothness parameter. Popular choices are squared exponential (default) or Matérn 5/2. |
| classes | The number of classes. Use only with multiclass objectives. |
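The `acq`, `kappa` and `eps` arguments are easier to compare side by side. The following is a minimal base-R sketch of the textbook forms of the three acquisition functions for a maximization problem. It is not taken from the package source: the helper names (`acq_ucb`, `acq_ei`, `acq_poi`) and the toy posterior values are assumptions made for illustration.

```r
# Textbook acquisition functions for maximization, given a Gaussian Process
# posterior mean `mu` and standard deviation `sigma` at candidate points.
# These helper names are illustrative; they are not MlBayesOpt functions.

# Upper Confidence Bound: a larger kappa rewards uncertain points more.
acq_ucb <- function(mu, sigma, kappa = 2.576) {
  mu + kappa * sigma
}

# Expected Improvement over the best value seen so far (`y_best`).
acq_ei <- function(mu, sigma, y_best, eps = 0) {
  d <- mu - y_best - eps
  z <- d / sigma
  d * pnorm(z) + sigma * dnorm(z)
}

# Probability of Improvement over `y_best` by at least `eps`.
acq_poi <- function(mu, sigma, y_best, eps = 0) {
  pnorm((mu - y_best - eps) / sigma)
}

# Toy posterior at two candidates: A is a safe bet, B is uncertain.
mu     <- c(A = 0.80, B = 0.70)
sigma  <- c(A = 0.01, B = 0.10)
y_best <- 0.78

ucb_low  <- acq_ucb(mu, sigma, kappa = 0.5)    # A = 0.805,   B = 0.75
ucb_high <- acq_ucb(mu, sigma, kappa = 2.576)  # A = 0.82576, B = 0.9576
names(which.max(ucb_low))   # "A": small kappa exploits the known good point
names(which.max(ucb_high))  # "B": large kappa explores the uncertain point

poi_eps0  <- acq_poi(mu, sigma, y_best, eps = 0)
poi_eps01 <- acq_poi(mu, sigma, y_best, eps = 0.1)
names(which.max(poi_eps0))   # "A"
names(which.max(poi_eps01))  # "B": a larger eps also pushes toward exploration

ei <- acq_ei(mu, sigma, y_best)
```

In this toy setting, raising `kappa` or `eps` moves the preferred candidate from the low-variance point to the high-variance one, which is the exploration effect the argument descriptions refer to.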
The test accuracy and a list of Bayesian optimization results are returned:
| Element | Description |
|---|---|
| Best_Par | A named vector of the best hyperparameter set found |
| Best_Value | The value of the metric achieved by the best hyperparameter set |
| History | A data.table of the Bayesian optimization history |
| Pred | A data.table with validation/cross-validation predictions for each round of the Bayesian optimization history |
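The four elements can be read off the returned list by name. The sketch below uses a hand-built stand-in list with the same element names so that it runs without fitting a model. Every value in it is a placeholder, and the names inside `Best_Par` and the columns of `History` are assumptions for illustration, not real output of `xgb_opt()`. Plain data frames stand in for the data.tables to avoid an extra dependency.

```r
# Stand-in for the list returned by xgb_opt(). All values are placeholders,
# and the inner names (eta, Round, Value, ...) are assumed for illustration.
res <- list(
  Best_Par   = c(eta = 0.3, max_depth = 5, nrounds = 100),
  Best_Value = 0.85,
  History    = data.frame(Round = 1:2, Value = c(0.80, 0.85)),
  Pred       = data.frame(V1 = c(0, 1), V2 = c(1, 1))
)

# Pull out the best hyperparameters and the score they achieved.
best_par   <- res$Best_Par
best_value <- res$Best_Value

# Locate the round that produced the best value,
# assuming a larger metric value is better.
best_round <- res$History$Round[which.max(res$History$Value)]

cat("Best value:", best_value, "found in round", best_round, "\n")
print(best_par)
```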
# NOT RUN {
library(MlBayesOpt)

set.seed(71)
res0 <- xgb_opt(train_data = fashion_train,
                train_label = y,
                test_data = fashion_test,
                test_label = y,
                objectfun = "multi:softmax",
                evalmetric = "merror",
                classes = 10,
                init_points = 3,
                n_iter = 1)
# }
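The example keeps the run short with `init_points = 3` and `n_iter = 1`. As a rough illustration of what the initial phase does, the base-R sketch below draws three random points uniformly from ranges based on the defaults in the usage signature. The uniform sampling scheme and the `draw_init()` helper are assumptions made for illustration, and the package may sample differently. `max_depth` and `nrounds` are written as integer ranges here so that they are drawn as whole numbers.

```r
# Search ranges based on the defaults in the xgb_opt() usage signature.
ranges <- list(
  eta       = c(0.1, 1),
  max_depth = c(4L, 6L),
  nrounds   = c(70L, 160L),
  subsample = c(0.1, 1),
  bytree    = c(0.4, 1)
)

# Hypothetical helper: draw `n` uniform random points inside the ranges.
# Integer ranges are sampled as whole numbers, others as real numbers.
draw_init <- function(ranges, n) {
  cols <- lapply(ranges, function(r) {
    if (is.integer(r)) {
      sample(seq(r[1], r[2]), n, replace = TRUE)
    } else {
      runif(n, min = r[1], max = r[2])
    }
  })
  as.data.frame(cols)
}

set.seed(71)
init <- draw_init(ranges, n = 3)
print(init)

# Total objective evaluations, assuming each optimization step
# evaluates the objective once.
init_points <- 3
n_iter      <- 1
total_evals <- init_points + n_iter  # 4
```

Each evaluation trains an xgboost model, so the sum of `init_points` and `n_iter` is a reasonable first estimate of the run's cost.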