[python] [R-package] Use the same address when updated label/weight/query #2662
better fix (todo): use metadata.label/weight in objective and metric.
… during training.
Added a few comments for the R package.
@@ -207,6 +208,12 @@ Booster <- R6::R6Class(
    # Perform boosting update iteration
    update = function(train_set = NULL, fobj = NULL) {

      if (is.null(train_set)) {
        if (private$train_set$.__enclos_env__$private$version != private$train_set_version) {
Given that these are two integers, do we still need to use identical()?
I wasn't sure if they were guaranteed to always be non-NULL. You can use != if they will always be integers.
Yeah, if there is a NULL, it means there is a bug in the code.
@@ -55,6 +55,7 @@ Booster <- R6::R6Class(

      # Create private booster information
      private$train_set <- train_set
      private$train_set_version <- train_set$.__enclos_env__$private$version
Maybe a naive question, but what does it mean to have a new "version" of the training set? Is it mainly updating weights as part of boosting? I think I am missing some context, so I don't understand what problem this patch solves.
Currently, a Dataset is allowed to update its label/weight/... (field data) during training, but the Booster cannot detect these changes. Ideally, we should also update the related classes, such as the cached label/weight/... in Objective/Metric. So I added a version number to the Dataset to indicate changes to these fields.
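To make the idea concrete, here is a minimal sketch of the version-counter pattern in plain Python. It is not LightGBM's actual code: `ToyDataset`, `ToyBooster`, `refresh_count` and `_sync` are hypothetical names used only for illustration. The assumption is that every field update bumps an integer version, and that the booster compares it against the version it last saw before each update.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset whose field data can change."""

    def __init__(self, label, weight=None):
        self.label = list(label)
        self.weight = None if weight is None else list(weight)
        self.version = 0  # bumped on every field update

    def set_label(self, label):
        self.label = list(label)
        self.version += 1

    def set_weight(self, weight):
        self.weight = list(weight)
        self.version += 1


class ToyBooster:
    """Hypothetical booster that caches field data plus a version stamp."""

    def __init__(self, train_set):
        self.train_set = train_set
        self.refresh_count = 0  # illustration only: counts cache refreshes
        self._sync()

    def _sync(self):
        # Re-read the cached fields and remember which version they came from.
        self.cached_label = list(self.train_set.label)
        self.cached_weight = (
            None if self.train_set.weight is None else list(self.train_set.weight)
        )
        self.train_set_version = self.train_set.version

    def update(self):
        # Both versions are plain integers, so != is enough to detect a change.
        if self.train_set.version != self.train_set_version:
            self._sync()
            self.refresh_count += 1
        # A real boosting iteration would run here, using the cached fields.
        return self.cached_label


ds = ToyDataset([0, 1, 0])
bst = ToyBooster(ds)
bst.update()             # versions match, nothing is refreshed
ds.set_label([1, 1, 0])  # bumps ds.version
bst.update()             # mismatch detected, cached label is refreshed
```

The comparison is cheap, so it can run on every update call, and the booster only re-reads field data when something has actually changed.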
Ah I see ok, makes sense. Thanks!
LGTM for Python code.
R side looks good to me