Error in best score and best iter, wrong value returned #2295
Comments
ping @Laurae2
Using trace(lgb.train, edit = TRUE), I was able to successfully modify part of the lgb.train code so that best_iter and best_score are computed from the first element listed in valids, which seemed to be the intent here. I modified lines 272 to 279 of lgb.train with the following:
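(The code block from this comment did not survive extraction. A minimal sketch of the kind of change described, selecting the dataset by the first name in `valids` instead of by position, might look like the following; `first_valid` and `metric_record` are hypothetical variable names and this is not the commenter's exact patch.)

```r
# Sketch only: compute best_iter / best_score from the first dataset listed in
# `valids`, looked up by name rather than by position in record_evals.
if (record && length(valids) > 0L && is.na(env$best_score)) {
  first_valid <- names(valids)[1L]
  metric_record <- booster$record_evals[[first_valid]][[1L]]  # first metric for that dataset
  metric_values <- unlist(metric_record[["eval"]])
  if (env$eval_list[[1L]]$higher_better[1L] == TRUE) {
    booster$best_iter <- unname(which.max(metric_values))
  } else {
    booster$best_iter <- unname(which.min(metric_values))
  }
  booster$best_score <- metric_record[["eval"]][[booster$best_iter]]
}
```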
gently ping @Laurae2
@jameslamb Based on the changes involved in the PR, I would be inclined to think that it would solve the issue I experienced. Unfortunately, I can't really test the changes in the environment where I experienced the issue; it is no longer available to me. I could test it on my personal computer if needed, but I assume you already did that, so I don't think there would be much of a point. I'm really sorry that I could not be of more help, and thank you for your help in solving this issue.
@marQca no problem at all! Thanks very much for taking the time to report it. You've helped us make
Environment info
Ubuntu 18.04.2 LTS (GNU/Linux 4.18.0-1024-azure x86_64)
R 3.6.0
LightGBM version or commit hash: 2.2.4
Error message
There is no error message, but the values returned by best_iter and best_score are wrong when the training dataset is included in valids.
This appears to be related to the May 27 2019 commit that fixed best_iter and best_score, and to these new lines:
```r
if (record && is.na(env$best_score)) {
  if (env$eval_list[[1]]$higher_better[1] == TRUE) {
    cv_booster$best_iter <- unname(which.max(unlist(cv_booster$record_evals[[2]][[1]][[1]])))
    cv_booster$best_score <- cv_booster$record_evals[[2]][[1]][[1]][[cv_booster$best_iter]]
  } else {
    cv_booster$best_iter <- unname(which.min(unlist(cv_booster$record_evals[[2]][[1]][[1]])))
    cv_booster$best_score <- cv_booster$record_evals[[2]][[1]][[1]][[cv_booster$best_iter]]
  }
}
```

```r
# When early stopping is not activated, we compute the best iteration / score
# ourselves by selecting the first metric and the first dataset
if (record && length(valids) > 0 && is.na(env$best_score)) {
  if (env$eval_list[[1]]$higher_better[1] == TRUE) {
    booster$best_iter <- unname(which.max(unlist(booster$record_evals[[2]][[1]][[1]])))
    booster$best_score <- booster$record_evals[[2]][[1]][[1]][[booster$best_iter]]
  } else {
    booster$best_iter <- unname(which.min(unlist(booster$record_evals[[2]][[1]][[1]])))
    booster$best_score <- booster$record_evals[[2]][[1]][[1]][[booster$best_iter]]
  }
}
```
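The indexing above is positional: `record_evals[[1]]` appears to be `start_iter`, so `record_evals[[2]]` is whatever dataset was listed first in `valids`. As an illustration (not from the original report, and assuming the `model2` object built in the reproducible example further down), the positional index and the named `train` entry can be compared directly:

```r
# record_evals[[2]][[1]][[1]] is the per-iteration metric of the first dataset
# recorded after start_iter. With valids2 <- list(train = dtrain, test = dtest)
# that dataset is the training data, so best_iter / best_score are computed on train.
identical(
  unlist(model2$record_evals[[2]][[1]][[1]]),
  unlist(model2$record_evals$train$l2$eval)
)  # expected TRUE if this layout assumption holds
```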
LightGBM itself behaves correctly: the early stopping parameter is applied as described in the LightGBM documentation:
"Activates early stopping. The model will train until the validation score stops improving. Validation score needs to improve at least every early_stopping_rounds round(s) to continue training. Requires at least one validation data and one metric. If there’s more than one, will check all of them. But the training data is ignored anyway. To check only the first metric set first_metric_only to True."
The training data is ignored for early_stopping, so best_score and best_iter should not apply to the train data.
For comparison, the xgboost documentation says:
"early_stopping_rounds (int) – Activates early stopping. Validation metric needs to improve at least once in every early_stopping_rounds round(s) to continue training. Requires at least one item in evals. The method returns the model from the last iteration (not the best one). If there’s more than one item in evals, the last entry will be used for early stopping."
While the early stopping implementation in LightGBM uses all valids except the training data (unlike xgboost, which uses only the last entry), modifying the code to use the last valids dataset instead of the first one for best_iter and best_score would likely prevent them from being associated with the training data.
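A minimal sketch of what that change could look like in the `lgb.train` fix quoted above, assuming `record_evals` keeps the `valids` order after the leading `start_iter` element (`last_valid_idx` and `metric_values` are hypothetical variable names, not code from the repository):

```r
# Sketch only: use the *last* recorded validation set for best_iter / best_score,
# so that a training set listed first in `valids` is never picked.
# (It would still pick train if train happened to be listed last.)
if (record && length(valids) > 0 && is.na(env$best_score)) {
  last_valid_idx <- length(booster$record_evals)              # last dataset recorded
  metric_values <- booster$record_evals[[last_valid_idx]][[1]][[1]]
  if (env$eval_list[[1]]$higher_better[1] == TRUE) {
    booster$best_iter <- unname(which.max(unlist(metric_values)))
  } else {
    booster$best_iter <- unname(which.min(unlist(metric_values)))
  }
  booster$best_score <- metric_values[[booster$best_iter]]
}
```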
Reproducible examples
```r
# Starting from the lgb.train example
library(lightgbm)
data(agaricus.train, package = "lightgbm")
train <- agaricus.train
dtrain <- lgb.Dataset(train$data, label = train$label)
data(agaricus.test, package = "lightgbm")
test <- agaricus.test
dtest <- lgb.Dataset.create.valid(dtrain, test$data, label = test$label)
params <- list(objective = "regression", metric = "l2")

valids <- list(test = dtest)
model <- lgb.train(params,
                   dtrain,
                   100,
                   valids,
                   min_data = 1,
                   learning_rate = 1,
                   early_stopping_rounds = 10)
# Early stopping works correctly and the values returned by best_score and best_iter are correct.
model$best_iter
model$best_score

# However, if we add the train data to valids to see how it improves
# relative to the test data (e.g. to check whether the model is overfitting),
# best_score and best_iter are based on the train data, which makes no sense.
valids2 <- list(train = dtrain, test = dtest)
model2 <- lgb.train(params,
                    dtrain,
                    100,
                    valids2,
                    min_data = 1,
                    learning_rate = 1,
                    early_stopping_rounds = 10)
model2$best_iter
which.min(unlist(model2$record_evals$test$l2$eval))
which.min(unlist(model2$record_evals$train$l2$eval))

# This might be expected, since the fix of May 27 2019 only used the first valid
# available when env$best_score and env$best_iter were NA.
# However, even if we list the test data first, the train data always ends up being
# the first entry, so best_iter and best_score are always wrong.
valids3 <- list(test = dtest, train = dtrain)
model3 <- lgb.train(params,
                    dtrain,
                    100,
                    valids3,
                    min_data = 1,
                    learning_rate = 1,
                    early_stopping_rounds = 10)
model3$best_iter
which.min(unlist(model3$record_evals$test$l2$eval))
which.min(unlist(model3$record_evals$train$l2$eval))
```
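Until this is fixed, a workaround is to recompute the best iteration and score directly from the validation set of interest; a small sketch using the `model2` object from the example above (l2 is a lower-is-better metric, hence `which.min`):

```r
# Workaround sketch: derive best_iter / best_score from the "test" valid,
# ignoring the values lgb.train stored from the first entry of valids2.
test_l2 <- unlist(model2$record_evals$test$l2$eval)
best_iter_test <- which.min(test_l2)
best_score_test <- test_l2[[best_iter_test]]
best_iter_test
best_score_test
```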