[R-package] sorting of observation weights and initial scores is broken in lgb.cv() #2572

Closed
jameslamb opened this issue Nov 16, 2019 · 3 comments

@jameslamb
Collaborator

In #2503, it was reported that using certain sampling strategies could cause incorrect results if the indices for subsetting weren't sorted. This was "fixed" in the R package by #2524, but that PR introduced a bug noted in #2524 (comment).

Essentially, slice.lgb.Dataset() takes in a vector of indices (row numbers) and returns a new Dataset containing that subset. Along the way, it sorts those indices. If the data came with observation weights or initial scores, those don't get reordered the same way, so they end up attached to the wrong observations.

To fix this, we need to ensure that each tuple of index, weight, and initial score is kept together when the indices are re-sorted.
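To make the mismatch concrete, here is a minimal sketch of the idea in plain Python. It is not the R package's code and not the actual fix: the function names `slice_rows_buggy` and `slice_rows_fixed` are hypothetical, the data is made up for illustration, and the indices are 0-based (R's are 1-based).

```python
def slice_rows_buggy(data, weights, init_scores, idx):
    """Rows follow the sorted indices, but weights and scores do not."""
    rows = [data[i] for i in sorted(idx)]
    # Bug: the per-row metadata is taken in the caller's original order.
    w = [weights[i] for i in idx]
    s = [init_scores[i] for i in idx]
    return rows, w, s


def slice_rows_fixed(data, weights, init_scores, idx):
    """Sort the indices once and use that same order for everything."""
    order = sorted(idx)
    rows = [data[i] for i in order]
    w = [weights[i] for i in order]
    s = [init_scores[i] for i in order]
    return rows, w, s


# Illustrative data: row "a" has weight 0.1 and initial score 10, and so on.
data = ["a", "b", "c", "d"]
weights = [0.1, 0.2, 0.3, 0.4]
init_scores = [10, 20, 30, 40]
idx = [3, 0, 2]  # unsorted, as a sampling strategy might produce

buggy = slice_rows_buggy(data, weights, init_scores, idx)
fixed = slice_rows_fixed(data, weights, init_scores, idx)
```

In the buggy version, row "a" ends up paired with the weight that belongs to row "d". In the fixed version every row keeps its own weight and initial score, because a single sorted index vector drives all three subsets.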

@rgranvil
Contributor

@jameslamb Is this still being worked on? Anything I can do to help? I'm anxious to get this fixed :)

@jameslamb
Collaborator Author

@rgranvil yes, it's still being worked on, sorry for the delay! I hope to have this fixed within the next day.

jameslamb added a commit to jameslamb/LightGBM that referenced this issue Dec 30, 2019
@jameslamb
Collaborator Author

> @jameslamb Is this still being worked on? Anything I can do to help? I'm anxious to get this fixed :)

Hey @rgranvil, I think I have this working! It will be a while before reviewers can give feedback on my latest revisions and merge them, but if you're OK with living on an unmerged version of the R package, you can try the fix I just pushed to #2573.

jameslamb added a commit to jameslamb/LightGBM that referenced this issue Jan 12, 2020
jameslamb added a commit to jameslamb/LightGBM that referenced this issue Jan 12, 2020
jameslamb added a commit to jameslamb/LightGBM that referenced this issue Jan 13, 2020
lock bot locked as resolved and limited conversation to collaborators Mar 17, 2020