Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create buffer for gradients and hessians with goss and customized objective (fixes #3243) #3263

Merged
merged 3 commits into from
Aug 5, 2020

Conversation

shiyu1994
Copy link
Collaborator

This is to fix Issue #3243.
The gradients_ and hessians_ of GBDT object is of size 0 if customized objective function is used.

if (objective_function_ != nullptr) {
size_t total_size = static_cast<size_t>(num_data_) * num_tree_per_iteration_;
gradients_.resize(total_size);
hessians_.resize(total_size);
}

However, they are used by the bagging function of GOSS, for example

for (int cur_tree_id = 0; cur_tree_id < num_tree_per_iteration_; ++cur_tree_id) {
size_t idx = static_cast<size_t>(cur_tree_id) * num_data_ + cur_idx;
grad += std::fabs(gradients_[idx] * hessians_[idx]);
}

which causes the segment fault.

@guolinke
Copy link
Collaborator

Hi @shiyu1994 , can we only fix this in goss.hpp. It seems the normal GBDT is not affected.

@shiyu1994
Copy link
Collaborator Author

The code of normal GBDT is only modified to include two more parameters gradients and hessians for Bagging method and this won't affect its execution. These two parameters are only effective when GOSS is used. With GOSS we are supporting a bagging according to gradients and hessians. So I think including them in the interface of Bagging method should be meaningful, which indicates that our bagging process can use gradient information.

The original Bagging and BaggingHelper do not provide parameters for gradients and hessians. With customized objective, these two arrays will be passed through the arguments of TrainOneIter from the outside of GBDT object. In that case, I think expanding the parameters to include 'gradients' and 'hessians' should be necessary. Otherwise GOSS code will have no access to the arrays passed from outside the GBDT object to TrainOneIter (inherited by GOSS without modification) which further calls the Bagging method.

Is there a better way to do this or any other suggestion?

@guolinke
Copy link
Collaborator

I was thinking to override the TrainOneIter in GOSS.
in the GOSS:TrainOneIter, we can firstly copy the external grad/hess to grad_/hess_, then call GBDT::TrainOneIter(grad_, hess_).

@shiyu1994
Copy link
Collaborator Author

Ok. That seems more elegant. I'll modify this.

src/boosting/goss.hpp Outdated Show resolved Hide resolved
@guolinke guolinke merged commit 1dbe5e9 into microsoft:master Aug 5, 2020
@StrikerRUS
Copy link
Collaborator

Compilation of current master fails for Windows with the following error:

d:\a\1\s\src\boosting\goss.hpp(62): error C3016: 'i': index variable in OpenMP 'for' statement must have signed integral type (compiling source file D:\a\1\s\src\boosting\boosting.cpp) [D:\a\1\s\build\lightgbm.vcxproj]

I believe it was introduced here.

BTW, it is quite strange that CI allowed to merge this PR.

@guolinke
Copy link
Collaborator

guolinke commented Aug 6, 2020

@StrikerRUS sorry, it is my bad!
I fix this directly in 9b26373

@StrikerRUS StrikerRUS mentioned this pull request Aug 6, 2020
@StrikerRUS
Copy link
Collaborator

Ah, OK! Thanks a lot for the prompt fix!

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants