class Booster(params=None, train_set=None, model_file=None, model_str=None, silent=False)

Bases: object

Initialize the Booster.

Methods:

__init__ ([params, train_set, model_file, …]) | Initialize the Booster. |
add_valid (data, name) | Add validation data. |
attr (key) | Get attribute string from the Booster. |
current_iteration () | Get the index of the current iteration. |
dump_model ([num_iteration, start_iteration, …]) | Dump Booster to JSON format. |
eval (data, name[, feval]) | Evaluate for data. |
eval_train ([feval]) | Evaluate for training data. |
eval_valid ([feval]) | Evaluate for validation data. |
feature_importance ([importance_type, iteration]) | Get feature importances. |
feature_name () | Get names of features. |
free_dataset () | Free Booster’s Datasets. |
free_network () | Free Booster’s network. |
get_leaf_output (tree_id, leaf_id) | Get the output of a leaf. |
get_split_value_histogram (feature[, bins, …]) | Get split value histogram for the specified feature. |
lower_bound () | Get lower bound value of a model. |
model_from_string (model_str[, verbose]) | Load Booster from a string. |
model_to_string ([num_iteration, …]) | Save Booster to string. |
num_feature () | Get number of features. |
num_model_per_iteration () | Get number of models per iteration. |
num_trees () | Get number of weak sub-models. |
predict (data[, start_iteration, …]) | Make a prediction. |
refit (data, label[, decay_rate]) | Refit the existing Booster by new data. |
reset_parameter (params) | Reset parameters of Booster. |
rollback_one_iter () | Rollback one iteration. |
save_model (filename[, num_iteration, …]) | Save Booster to file. |
set_attr (**kwargs) | Set attributes to the Booster. |
set_network (machines[, local_listen_port, …]) | Set the network configuration. |
set_train_data_name (name) | Set the name to the training Dataset. |
shuffle_models ([start_iteration, end_iteration]) | Shuffle models. |
trees_to_dataframe () | Parse the fitted model and return in an easy-to-read pandas DataFrame. |
update ([train_set, fobj]) | Update Booster for one iteration. |
upper_bound () | Get upper bound value of a model. |
add_valid(data, name)

Add validation data.

attr(key)

Get attribute string from the Booster.

current_iteration()

Get the index of the current iteration.

dump_model(num_iteration=None, start_iteration=0, importance_type='split')

Dump Booster to JSON format.

eval(data, name, feval=None)

Evaluate for data.
feval (callable or None, optional (default=None)): customized evaluation function. It should accept two parameters, preds and eval_data, and return (eval_name, eval_result, is_higher_better):

- preds (list or numpy 1-D array): the predicted values. If a customized objective function fobj is used, these are the raw scores.
- eval_data (Dataset): the evaluation dataset.
- eval_name (string): the name of the evaluation function (without whitespaces).
- eval_result (float): the eval result.
- is_higher_better (bool): whether a higher eval result is better, e.g. AUC is is_higher_better.

For multi-class task, preds are grouped by class_id first, then by row_id. To get the prediction for the i-th row in the j-th class, use preds[j * num_data + i].

eval_train(feval=None)

Evaluate for training data. feval follows the same convention as in eval.

eval_valid(feval=None)

Evaluate for validation data. feval follows the same convention as in eval.

feature_importance
(importance_type='split', iteration=None)

Get feature importances.

feature_name()

Get names of features.

free_dataset()

Free Booster’s Datasets.

free_network()

Free Booster’s network.

get_leaf_output(tree_id, leaf_id)

Get the output of a leaf.

get_split_value_histogram(feature, bins=None, xgboost_style=False)

Get split value histogram for the specified feature.

- bins (int, string or None, optional (default=None)): the maximum number of bins. If None, the number of bins equals the number of unique split values (likewise with xgboost_style=True when more bins are requested than unique split values exist). If string, it should be one from the list of the supported values by the numpy.histogram() function.
- xgboost_style (bool, optional (default=False)): whether the returned result should be in the same form as it is in XGBoost. If True, the returned value is a matrix, in which the first column is the right edges of non-empty bins and the second one is the histogram values.

Returns: if xgboost_style=False, the values of the histogram of used splitting values for the specified feature and the bin edges; if xgboost_style=True, the histogram of used splitting values for the specified feature.

lower_bound
()

Get lower bound value of a model.

model_from_string(model_str, verbose=True)

Load Booster from a string.

model_to_string(num_iteration=None, start_iteration=0, importance_type='split')

Save Booster to string.

num_feature()

Get number of features.

num_model_per_iteration()

Get number of models per iteration.

num_trees()

Get number of weak sub-models.

predict(data, start_iteration=0, num_iteration=None, raw_score=False, pred_leaf=False, pred_contrib=False, data_has_header=False, is_reshape=True, **kwargs)

Make a prediction.

- num_iteration (int or None, optional (default=None)): total number of iterations used in the prediction. If None and the best iteration exists with start_iteration <= 0, the best iteration is used; otherwise, all iterations from start_iteration are used (no limits). If <= 0, all iterations from start_iteration are used (no limits).
- pred_contrib (bool, optional (default=False)): whether to predict feature contributions. Note that with pred_contrib we return a matrix with an extra column, where the last column is the expected value.

Returns: the prediction result; it can be sparse or a list of sparse objects (each element representing predictions for one class) for feature contributions (when pred_contrib=True).

refit
(data, label, decay_rate=0.9, **kwargs)

Refit the existing Booster by new data.

- decay_rate (float, optional (default=0.9)): decay rate of refit; trees are refit using leaf_output = decay_rate * old_leaf_output + (1.0 - decay_rate) * new_leaf_output.
- **kwargs: other parameters for refit; these are passed to the predict method.

reset_parameter(params)

Reset parameters of Booster.

rollback_one_iter()

Rollback one iteration.

save_model(filename, num_iteration=None, start_iteration=0, importance_type='split')

Save Booster to file.

set_attr(**kwargs)

Set attributes to the Booster.

set_network(machines, local_listen_port=12400, listen_time_out=120, num_machines=1)

Set the network configuration.

set_train_data_name(name)

Set the name to the training Dataset.

shuffle_models(start_iteration=0, end_iteration=-1)

Shuffle models.

trees_to_dataframe()

Parse the fitted model and return in an easy-to-read pandas DataFrame.

update(train_set=None, fobj=None)

Update Booster for one iteration.
fobj (callable or None, optional (default=None)): customized objective function. It should accept two parameters, preds and train_data, and return (grad, hess):

- preds (list or numpy 1-D array): the predicted values.
- train_data (Dataset): the training dataset.
- grad (list or numpy 1-D array): the value of the first order derivative (gradient) for each sample point.
- hess (list or numpy 1-D array): the value of the second order derivative (Hessian) for each sample point.

For multi-class task, preds are grouped by class_id first, then by row_id. To get the prediction for the i-th row in the j-th class, use score[j * num_data + i], and you should group grad and hess in the same way.

upper_bound()

Get upper bound value of a model.